DeepLearnPhysics / larcv3

Third version of larcv. This is a complete replacement for larcv2.
MIT License

Mpi io #29

Closed coreyjadams closed 5 years ago

coreyjadams commented 5 years ago

Initial pull request to validate QueueProcessor. QueueProcessor is an alternative to ThreadProcessor that is entirely deterministic: users can control which events are read, and when, and can determine when to promote data from next to current. Its performance is on par with ThreadProcessor running with 1 thread and 1 batch storage. ThreadProcessor outperforms it when the number of batch storages is greater than 1, running about 10% faster on a single node.
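
To illustrate the next/current idea, here is a minimal sketch in plain Python. The class `ToyQueueLoader`, its method names, and the `read_event` callback are hypothetical stand-ins chosen for this example; they are not the larcv3 QueueProcessor API. The sketch only assumes what is described above: the caller picks the event indices, the read happens when the caller asks for it, and nothing becomes "current" until the caller promotes it.

```python
from typing import Callable, List, Optional, Sequence


class ToyQueueLoader:
    """Hypothetical double-buffered loader; NOT the larcv3 QueueProcessor API."""

    def __init__(self, read_event: Callable[[int], object]) -> None:
        self._read_event = read_event  # user-supplied function: index -> event
        self._next: Optional[List[object]] = None
        self._current: Optional[List[object]] = None

    def prepare_next(self, indices: Sequence[int]) -> None:
        """Read exactly the requested events, in the given order, into 'next'."""
        self._next = [self._read_event(i) for i in indices]

    def promote(self) -> None:
        """Move 'next' to 'current'; the caller decides when this happens."""
        if self._next is None:
            raise RuntimeError("no batch prepared; call prepare_next first")
        self._current, self._next = self._next, None

    def current(self) -> List[object]:
        """Return the batch that was most recently promoted."""
        if self._current is None:
            raise RuntimeError("no current batch; call promote first")
        return self._current


# Stand-in for file IO: the "event" at index i is just i squared.
loader = ToyQueueLoader(read_event=lambda i: i * i)

loader.prepare_next([3, 1, 2])
loader.promote()
first_batch = loader.current()

loader.prepare_next([0, 4])      # staged in 'next', not visible yet
still_first = loader.current()   # 'current' is unchanged until promote()
loader.promote()
second_batch = loader.current()
```

Because every read and every promotion is an explicit call, running the same sequence of calls always yields the same batches in the same order, which is the property a background-thread loader does not guarantee.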

However, QueueProcessor is designed to be much more compatible with scaling out to MPI read-only IO, allowing fine-grained control over synchronizing IO calls.
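
One way deterministic index control can help with multi-rank reads is sketched below. It is an assumption for illustration only: the helper `indices_for_rank` is hypothetical, and the strided split is just one possible scheme, not necessarily how larcv3 distributes events. The point is that if every rank knows the global list of event indices for a batch, each rank can work out its own disjoint share without communicating.

```python
from typing import List


def indices_for_rank(batch: List[int], rank: int, size: int) -> List[int]:
    """Return the strided share of `batch` owned by `rank` out of `size` ranks.

    Hypothetical helper: in a real MPI program, `rank` and `size` would come
    from the communicator; here they are plain integers so the sketch runs
    without an MPI installation.
    """
    if size <= 0 or not 0 <= rank < size:
        raise ValueError("need size > 0 and 0 <= rank < size")
    return batch[rank::size]


global_batch = list(range(10))  # event indices agreed on by all ranks
shares = [indices_for_rank(global_batch, r, 4) for r in range(4)]
# Every index is read by exactly one rank.
covered = sorted(i for share in shares for i in share)
```

Since each rank reads only its own indices and the shares are disjoint, the ranks only need to synchronize at the moment they all promote their batch.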

Some MPI work is already done, but more is coming. Opening this pull request now allows tracking the test suite, though tests for QueueProcessor don't exist yet. They'll come before merging.

There are some other optimizations here in batch_pydata and larcv3::BatchData. I moved the conversion from std::vector to numpy array so that it is callable from BatchData, which greatly reduces Python overhead (giving a 5x speedup on Mac!).

coreyjadams commented 5 years ago

There are never any reviewers available... Sigh. Anyway, as of this comment I have set the defaults to mpi OFF, openmp OFF. The speedup gained by allowing batch_data to convert directly to numpy is big enough to merge this branch into develop at an intermediate state, so I'll merge.