molmod / psiflow

scalable molecular simulation
https://molmod.github.io/psiflow/
MIT License

Torch multiprocessing interference #2

Closed by svandenhaute 1 year ago

svandenhaute commented 1 year ago

The current NequIP interface does not play well with Parsl and its use of Python's and Torch's multiprocessing, and it ends up deadlocking in many cases, even when NEQUIP_NUM_TASKS is set to 1. I've tried a variety of ad hoc fixes, but none of them works consistently. The weird part is that the deadlock never occurs in the testing environment, i.e. when called via pytest.

The issue can most likely be circumvented by reimplementing the interface as a bash_app, and that's what I'll do sometime in the next two days. For the moment, use MACE, as its interface does not seem to suffer from the same problem.
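
For illustration, here is a minimal sketch of the idea behind the bash_app route: a Parsl bash_app function returns a shell command string, so the work runs in a fresh interpreter instead of inside a worker that already carries multiprocessing state. The sketch uses only the standard library and assumes a POSIX shell. The helpers `build_command` and `run_isolated` are hypothetical names, and the inline script is a placeholder for the actual model evaluation; none of this is psiflow's or NequIP's real code.

```python
import os
import shlex
import subprocess
import sys


def build_command(script: str) -> str:
    """Return the shell command a bash_app-style function would hand back.

    Hypothetical helper: `script` is a stand-in for the real evaluation code.
    """
    # Quote both the interpreter path and the script so the shell sees
    # exactly two arguments after "-c".
    return "{} -c {}".format(shlex.quote(sys.executable), shlex.quote(script))


def run_isolated(script: str) -> str:
    """Run `script` in a new interpreter process and return its stdout."""
    result = subprocess.run(
        build_command(script),
        shell=True,           # executed by the shell, as a bash_app command would be
        capture_output=True,
        text=True,
        check=True,           # raise if the child exits with a non-zero status
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The child reports its own PID, which differs from the parent's.
    child_pid = run_isolated("import os; print(os.getpid())")
    print("parent:", os.getpid(), "child:", child_pid)
```

Because the child process starts from scratch, it inherits no locks or thread pools from the parent, which is the property the bash_app reimplementation would rely on. Whether this actually avoids the deadlock described above remains to be confirmed.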