mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org

Improving performance in the gsd writer #1133

Closed chemicalfiend closed 1 year ago

chemicalfiend commented 1 year ago

mbuild's gsd writer is currently very slow for large systems. Systems with tens of thousands of atoms take several hours to write to a gsd file. For complex structures such as the ones I mentioned in a previous issue (#1114), which seem to be reasonably common use cases, the gsd writer setup is slow on CPUs.

What are the possible speedups?

daico007 commented 1 year ago

I think I may have a clue about what's happening. Can you try out PR #1134 and see if your systems get written out faster?

chemicalfiend commented 1 year ago

Doesn't seem to have much of an effect. I ran the same code before and after pulling the changes from the PR. The bottom cell where I timed the gsd write should indicate how long it took.
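
For anyone repeating this kind of before/after comparison, here is a minimal timing sketch. It uses only the standard library; `write_stub` is a hypothetical stand-in for the real write call and `timed` is an illustrative helper, neither is part of mbuild.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def write_stub(n):
    # Hypothetical placeholder for the call that writes the file.
    return sum(i * i for i in range(n))


timings = {}
with timed("write", timings):
    write_stub(100_000)

print(f"write took {timings['write']:.4f} s")
```

In a real session the stub would be replaced by the actual write, for example a call along the lines of `compound.save("system.gsd")`, run once on each branch with the same input system.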

daico007 commented 1 year ago

Got it. I did some profiling locally, and it seems the slowdown actually has to do with the Compound.particles() iteration. I will try to change the logic there and speed things up.

(This is a box of 1000 ethane, so 8000 particles and 7000 bonds)

[Screenshot 2023-07-05 at 11 43 42]

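
As a rough illustration of how this kind of hotspot shows up in a profile, the sketch below runs `cProfile` over a toy hierarchy. Everything here is an assumption made for the example: `Node`, `build_box`, `index_slow`, and `index_fast` are hypothetical and are not mbuild code, and the thread does not say what was actually changed in the PRs. The point is only that re-walking a tree through a `particles()`-style generator for every lookup dominates the cumulative time, while walking it once and caching the indices does not.

```python
import cProfile
import io
import pstats


class Node:
    """Toy stand-in for a hierarchical compound (not mbuild's Compound)."""

    def __init__(self, children=None):
        self.children = children or []

    def particles(self):
        # A node without children is a particle; otherwise recurse.
        if not self.children:
            yield self
        for child in self.children:
            yield from child.particles()


def build_box(n_molecules, atoms_per_molecule):
    # Three levels: box -> molecules -> atoms.
    molecules = [
        Node([Node() for _ in range(atoms_per_molecule)])
        for _ in range(n_molecules)
    ]
    return Node(molecules)


def index_slow(root, targets):
    # Re-walks the whole tree for every single lookup.
    return [list(root.particles()).index(t) for t in targets]


def index_fast(root, targets):
    # Walks the tree once, then uses constant-time dictionary lookups.
    lookup = {id(p): i for i, p in enumerate(root.particles())}
    return [lookup[id(t)] for t in targets]


box = build_box(100, 8)                 # 800 toy particles
targets = list(box.particles())[::8]    # first atom of each molecule

profiler = cProfile.Profile()
profiler.enable()
slow = index_slow(box, targets)
profiler.disable()

fast = index_fast(box, targets)

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Both functions return the same indices; the profile report lists `particles` among the entries with the largest cumulative time for the slow variant.
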
daico007 commented 1 year ago

I think I have this figured out in #1135. @chemicalfiend, please see the screenshot below and the PR itself for more details. I will make a release with this change soon, so it will be out on conda-forge (either tomorrow or early next week).

[Screenshot 2023-07-08 at 00 37 59]

chemicalfiend commented 1 year ago

Just repeated the before and after comparison. The difference is unbelievable! Thank you very much. This is going to speed up my workflows a lot.

Before PR#1135 Sat Jul  8 01:53:22 PM IST 2023

After PR#1135 Sat Jul  8 01:53:29 PM IST 2023

:tada: closing the issue. Thanks again.