google / ripunzip

Other
161 stars 17 forks source link

positioned_io draft to prevent expensive Arc<Mutex<File>> #34

Open muja opened 1 year ago

muja commented 1 year ago

Hi, I've noticed you're using Arc<Mutex>, which seems like a bottleneck for heavy parallellization. However a preliminary bench didn't yield as much improvement as I'd hoped, however could be due to the fact that those aren't utilizing threads as aggressively.

Anyway here is the code. Note that this is completely unrefined, I was using Box::leak to get it to compile. I don't have a real use case for this other than I was playing around a bit, so if you have no interest in this, that's fine for me.

For a quick summary, you can read the positioned_io description. Basically it allows parallel reads on a &File (note the missing mut) without changing the file pos. It's also used by rc-zip, which allows extracting in parallel. however it is mainly unmaintained and the underlying library seems to have UB, but investigating into this direction might be worth it.

google-cla[bot] commented 1 year ago

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

adetaylor commented 1 year ago

Thanks! I wasn't aware of positioned_io and I'm all in favor of reusing existing crates, so perhaps I'll have a crack at this sometime.