davidcaron / pye57

Read and write e57 point clouds from Python
MIT License

Add allowParallel option #63

Open nh2 opened 5 months ago

nh2 commented 5 months ago

Draft PR for an in-review libE57Format feature that allows parallel read access to the same E57 handle, e.g. with Python's ThreadPoolExecutor: https://github.com/asmaloney/libE57Format/pull/292

TODO

dancergraham commented 5 months ago

Looks interesting! It would be cool to have a unit test for this and, as noted, the submodule change needs to be removed before merging. What are the performance implications?

dancergraham commented 5 months ago

There may also be some work required to bring pye57 up to date with other changes in libE57Format, particularly if this lands as part of a major revision.

nh2 commented 5 months ago

> What are the performance implications?

The only one we know of is that you can now use multiple threads to read data out of the same E57.

This can produce big speedups when a single reading thread is IO-bound (e.g. on a striped RAID or a network file system, reading with N threads can be up to N times faster).

(Of course one could also open the same E57 multiple times to get parallel read access, but that is less convenient: there are more open file handles, and the code has to be structured in a coarser way instead of using parallelism locally in a function that already has an opened E57.)
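
For illustration, here is a minimal sketch of the usage pattern described above: a function that receives a reader bound to one already-opened handle and fans the reads out over a `ThreadPoolExecutor`. The helper `read_scans_parallel` and the stand-in `fake_read_scan` are hypothetical names, not part of pye57. With this PR, the callable would presumably be a bound method of an E57 object opened with the new option, but that is an assumption, since the Python-side spelling of the option is not shown in this thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def read_scans_parallel(
    read_scan: Callable[[int], T],
    indices: Iterable[int],
    max_workers: int = 4,
) -> List[T]:
    """Call read_scan(i) for every index on a thread pool.

    read_scan is assumed to be safe to call concurrently on one shared
    handle; that is exactly what the parallel-read option is meant to allow.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, even if reads finish out of order.
        return list(pool.map(read_scan, indices))


def fake_read_scan(index: int) -> dict:
    """Hypothetical stand-in for a blocking read on a shared handle."""
    time.sleep(0.01)  # simulate waiting on disk or network IO
    return {"scan": index}


if __name__ == "__main__":
    scans = read_scans_parallel(fake_read_scan, range(8), max_workers=4)
    print(len(scans), "scans read")
```

Because the helper only needs a callable, it can be used locally inside any function that is handed an open file, without reopening the file per thread.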

dancergraham commented 1 month ago

Free threading is available as a build flag in Python 3.13. I wonder how that would interact with this feature?
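
As a starting point for experimenting with that question, the sketch below reports whether the running interpreter is a free-threaded build and whether the GIL is currently active. The function name `gil_status` is made up for this example. It relies on `sysconfig.get_config_var("Py_GIL_DISABLED")` and on `sys._is_gil_enabled()`, which was added in Python 3.13, and falls back to "GIL enabled" on older versions. Whether pye57's extension module supports running without the GIL is not stated in this thread.

```python
import sys
import sysconfig


def gil_status() -> dict:
    """Report whether this is a free-threaded build and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists from Python 3.13 on; earlier versions
    # always run with the GIL, so default to True there.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True

    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(gil_status())
```

Calling this after importing an extension module is a quick way to see whether the import left the GIL enabled on a free-threaded build.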