recursionpharma / gflownet

GFlowNet library specialized for graph & molecular data
MIT License

Shared pinned buffers #120

Open bengioe opened 9 months ago

bengioe commented 9 months ago

This PR implements a better way of sharing torch tensors between processes: (large enough) shared tensors are created once and then reused as a transfer mechanism. On the fragment environment (seh_frag.py) I'm getting a 30% wall-time improvement for simple settings with batch size 64 (I'm sure we could have fun maxing that out and seeing how far we can take GPU utilization).
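To illustrate the "allocate once, reuse as a transfer channel" idea, here is a minimal sketch. It uses the standard library's `multiprocessing.shared_memory` as a stand-in for the shared torch tensors this PR uses, so it is not the PR's code. `TransferBuffer` and its methods are hypothetical names, and both ends live in one process here only to keep the example short.

```python
import struct
from multiprocessing import shared_memory

# 8-byte little-endian length prefix stored at the start of the buffer.
HEADER = struct.Struct("<Q")


class TransferBuffer:
    """Hypothetical fixed-size shared block, allocated once and reused per message."""

    def __init__(self, size=None, name=None):
        # The creating side passes `size`; the attaching side passes `name`.
        self.owner = name is None
        if self.owner:
            self.shm = shared_memory.SharedMemory(create=True, size=size)
        else:
            self.shm = shared_memory.SharedMemory(name=name)

    def write(self, payload: bytes) -> None:
        n = len(payload)
        if HEADER.size + n > self.shm.size:
            raise ValueError("payload does not fit in the shared buffer")
        HEADER.pack_into(self.shm.buf, 0, n)
        self.shm.buf[HEADER.size:HEADER.size + n] = payload

    def read(self) -> bytes:
        (n,) = HEADER.unpack_from(self.shm.buf, 0)
        # Copy out of shared memory so the buffer can be reused afterwards.
        return bytes(self.shm.buf[HEADER.size:HEADER.size + n])

    def close(self) -> None:
        self.shm.close()
        if self.owner:
            self.shm.unlink()


# The same block carries two successive messages without any reallocation.
producer = TransferBuffer(size=1024)
consumer = TransferBuffer(name=producer.shm.name)
producer.write(b"batch-0")
first = consumer.read()
producer.write(b"batch-1")
second = consumer.read()
consumer.close()
producer.close()
```

The sketch has no synchronization: across real processes, the writer must not reuse the buffer until the reader has finished copying the previous message out.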

Some notes:

Other changes:

Note: EnvelopeQL is still in a broken state; I will fix it in #127.

bengioe commented 9 months ago

I'm of a mind to merge this actually. It's not the cleanest implementation possible but there are significant gains here (as mentioned, a 30% speedup with the default settings on seh_frag.py). Will test across tasks and report back.

bengioe commented 8 months ago

Made significant simplifications to the method by subclassing Pickler/Unpickler, and found some very tricky bugs (I was misusing pinned CUDA buffers and ended up with rare race conditions). The speedups remain (it might even be a bit faster).
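For readers unfamiliar with the Pickler/Unpickler approach, here is a rough sketch of the general technique, not the code in this PR. A `pickle.Pickler` subclass uses the standard `persistent_id` hook to divert large payloads into a preallocated buffer and pickles only their offsets; the matching `persistent_load` hook rebuilds them on the other side. `Blob`, `BufferPickler`, `BufferUnpickler`, `dumps` and `loads` are hypothetical names, and a plain `bytearray` stands in for a shared pinned tensor.

```python
import io
import pickle


class Blob:
    """Hypothetical stand-in for a tensor: a thin wrapper around raw bytes."""

    def __init__(self, data: bytes):
        self.data = data


class BufferPickler(pickle.Pickler):
    """Writes Blob payloads into `buffer` and pickles only their location."""

    def __init__(self, file, buffer):
        super().__init__(file)
        self.buffer = buffer
        self.offset = 0

    def persistent_id(self, obj):
        if not isinstance(obj, Blob):
            return None  # everything else is pickled the normal way
        start, end = self.offset, self.offset + len(obj.data)
        if end > len(self.buffer):
            raise BufferError("shared buffer is too small for this message")
        self.buffer[start:end] = obj.data
        self.offset = end
        return ("blob", start, end)


class BufferUnpickler(pickle.Unpickler):
    """Rebuilds Blob objects from the locations recorded by BufferPickler."""

    def __init__(self, file, buffer):
        super().__init__(file)
        self.buffer = buffer

    def persistent_load(self, pid):
        tag, start, end = pid
        if tag != "blob":
            raise pickle.UnpicklingError(f"unknown persistent id: {tag!r}")
        # Copy out so the buffer can be reused for the next message.
        return Blob(bytes(self.buffer[start:end]))


def dumps(obj, buffer) -> bytes:
    stream = io.BytesIO()
    BufferPickler(stream, buffer).dump(obj)
    return stream.getvalue()


def loads(data: bytes, buffer):
    return BufferUnpickler(io.BytesIO(data), buffer).load()


# Round trip: the pickle stream carries structure, the buffer carries payloads.
buffer = bytearray(64)
message = {"step": 3, "x": Blob(b"abcd"), "y": Blob(b"efgh")}
wire = dumps(message, buffer)
restored = loads(wire, buffer)
```

The pickle stream stays small because it holds only structure and offsets, while the bulk data travels through the reusable buffer. The copy in `persistent_load` matters: handing out views into a buffer that is about to be overwritten is one way to get the kind of rare race described above.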

bengioe commented 6 months ago

Merged with trunk + made a few fixes. Pretty happy with this now!