openforcefield / openff-benchmark

Comparison benchmarks between public force fields and Open Force Field Initiative force fields
MIT License

`openff-benchmark optimize export` is very slow; should be parallelized #79

Open dotsdl opened 3 years ago

dotsdl commented 3 years ago

For a dataset with many thousands of optimizations run through a QCFractal server, export currently runs serially, and much of the time is spent serializing large JSON structures and writing them to the filesystem. This is very slow, and the work should be entirely parallelizable.

To accomplish this, the innermost loop of the `openff.benchmark.geometry_optimization.compute.OptimizationExecutor.export_molecule_data` method should be broken out into a standalone `staticmethod`, which is then submitted to a process pool (e.g. `concurrent.futures.ProcessPoolExecutor` or `multiprocessing.Pool`) together with each optimization object and its id. The size of the pool should be configurable through a parameter and a command-line flag.
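
A rough sketch of the idea, not the actual `export_molecule_data` code: the names `export_one`, `export_all`, and `nproc` are hypothetical, and plain dicts stand in for the real optimization records. It uses a module-level function rather than a `staticmethod` only to keep the example short; either can be pickled and sent to worker processes.

```python
import json
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def export_one(opt_id, opt_record, output_dir):
    """Serialize one optimization record to <output_dir>/<opt_id>.json.

    Hypothetical stand-in for the body of the innermost export loop.
    Defined at module level so it can be pickled for worker processes.
    """
    path = Path(output_dir) / f"{opt_id}.json"
    with open(path, "w") as f:
        json.dump(opt_record, f)
    return str(path)


def export_all(records, output_dir, nproc=1):
    """Export every (id, record) pair using `nproc` worker processes.

    `records` is assumed to be a dict mapping id -> JSON-serializable dict.
    With nproc=1 the export runs in-process, which avoids pool overhead
    and is easier to debug.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    items = sorted(records.items())  # deterministic order of results

    if nproc == 1:
        return [export_one(i, r, output_dir) for i, r in items]

    with ProcessPoolExecutor(max_workers=nproc) as pool:
        futures = [pool.submit(export_one, i, r, output_dir) for i, r in items]
        # Collect in submission order; re-raises any worker exception here.
        return [f.result() for f in futures]
```

A caller would invoke something like `export_all(records, "export", nproc=4)`, ideally from under an `if __name__ == "__main__":` guard so that it also works on platforms where worker processes are spawned rather than forked. The command-line flag would simply pass its value through as `nproc`.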