pydata / numexpr

Fast numerical array expression evaluator for Python, NumPy, Pandas, PyTables and more
https://numexpr.readthedocs.io/en/latest/user_guide.html
MIT License

Calling `platform.machine()` causing MPI problems #399

Closed dwpaley closed 2 years ago

dwpaley commented 2 years ago

Hi,

On large MPI jobs at NERSC we have found that anything spawning a subprocess can cause unpredictable hangs/deadlocks in global communication steps. We identified this line as a problem, since `platform.machine()` internally ends up using `subprocess`:

https://github.com/pydata/numexpr/blob/1c6a024947c3aa1bf926ecb9828036b306d7c6d7/numexpr/utils.py#L126

That code is executed on importing pandas if numexpr is also present. There's some discussion here: https://github.com/dials/dials/issues/1998. The CPython module `uuid` is also an offender, but it is easier to work around in our use case.
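For anyone wanting to check whether a given call or import spawns a process, a minimal sketch follows. It assumes the spawn goes through `subprocess.Popen` (which covers `subprocess.run`, `subprocess.check_output` and `os.popen`, but not `os.system` or a bare `os.fork`). The helper name `count_subprocess_spawns` is hypothetical and is not part of numexpr or CPython.

```python
import subprocess
import sys
from unittest import mock


def count_subprocess_spawns(func):
    """Run func() and return how many times subprocess.Popen was called.

    Hypothetical diagnostic helper: the real Popen is still invoked
    (via wraps=...), so func() behaves normally while calls are counted.
    """
    with mock.patch.object(subprocess, "Popen", wraps=subprocess.Popen) as spy:
        func()
    return spy.call_count


if __name__ == "__main__":
    # A call that is known to spawn exactly one child process.
    spawning = lambda: subprocess.run([sys.executable, "-c", "pass"], check=True)
    print("spawning call:", count_subprocess_spawns(spawning))
    # A call that does nothing should not spawn anything.
    print("no-op call:", count_subprocess_spawns(lambda: None))
```

Since `platform.uname()` caches its result, wrapping `platform.machine` in this helper may only report a spawn the first time it is called in a given process.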

Could the architecture check be done at install time? I'll open a PR.

Thanks, Dan
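To illustrate what an install-time architecture check could look like, here is a rough sketch, not the change that was actually made in numexpr. It assumes a build step is allowed to write a small generated module next to the package sources; the function name `write_arch_module` and the file name `_build_arch.py` are hypothetical.

```python
import platform
from pathlib import Path


def write_arch_module(target_dir):
    """Record the build machine's architecture in a generated module.

    Hypothetical build-time helper: platform.machine() runs once here,
    during the build, so importing the package later needs no such call.
    """
    path = Path(target_dir) / "_build_arch.py"
    path.write_text(f"MACHINE = {platform.machine()!r}\n")
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        generated = write_arch_module(tmp)
        print(generated.read_text())
```

At import time the package would then read the recorded `MACHINE` constant instead of querying the platform. One caveat of this approach is that a prebuilt wheel records the architecture of the build machine, which is only correct when the wheel is installed on a matching architecture.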

robbmcleod commented 2 years ago

I think that's reasonable, but I don't otherwise have a pressing need to make a release. Has this been reported to the Python core people?

ndevenish commented 2 years ago

For reference, since I ended up following this down a rabbit hole, it looks like this behaviour was fixed in Python 3.9: https://github.com/python/cpython/commit/518835f3354d6672e61c9f52348c1e4a2533ea00

dwpaley commented 2 years ago

Thanks, @robbmcleod. As @ndevenish says, `platform.machine()` and `platform.system()` no longer spawn subprocesses as of Python 3.9. We've also solved our immediate problem via cctbx/cctbx_project#731. I would say this doesn't urgently need to get into a release.
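For code that still has to run on interpreters older than 3.9, one possible runtime workaround is sketched below. It is an assumption-laden illustration rather than the numexpr or cctbx fix: it uses `os.uname()` on POSIX systems and the `PROCESSOR_ARCHITECTURE` environment variable elsewhere, and the function name `machine_no_subprocess` is hypothetical.

```python
import os
import platform
import sys


def machine_no_subprocess():
    """Return the machine type without spawning a subprocess.

    Hypothetical helper. On Python 3.9+ platform.machine() is used
    directly; on older versions the value is read from os.uname()
    or, where that is unavailable, from the environment.
    """
    if sys.version_info >= (3, 9):
        return platform.machine()
    if hasattr(os, "uname"):
        # POSIX: a direct uname() system call, no child process involved.
        return os.uname().machine
    # Fallback for platforms without os.uname (assumed to be Windows).
    return os.environ.get("PROCESSOR_ARCHITECTURE", "")


if __name__ == "__main__":
    print(machine_no_subprocess())
```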

robbmcleod commented 2 years ago

Resolved as best it can be in release 2.8.3.