uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Using the process pool leads to error "ImportError: sys.meta_path is None, Python is likely shutting down" at exit #244

Open bluenote10 opened 2 years ago

bluenote10 commented 2 years ago

The following example works, but it leads to an ugly error message when the main process exits:

import pathos

with pathos.pools.ProcessPool(4) as pool:
  ...

Output

Exception ignored in: <function Pool.__del__ at 0x7f8c7e23d160>
Traceback (most recent call last):
  File "/home/me/some_venv/lib/python3.8/site-packages/multiprocess/pool.py", line 268, in __del__
  File "/home/me/some_venv/lib/python3.8/site-packages/multiprocess/queues.py", line 365, in put
  File "/home/me/some_venv/lib/python3.8/site-packages/multiprocess/reduction.py", line 54, in dumps
  File "/home/me/some_venv/lib/python3.8/site-packages/multiprocess/reduction.py", line 42, in __init__
  File "/home/me/some_venv/lib/python3.8/site-packages/dill/_dill.py", line 573, in __init__
ImportError: sys.meta_path is None, Python is likely shutting down

The process return code is 0 though, i.e., it isn't a "real" error/crash.
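
For context, the "Exception ignored in: ..." banner is how CPython reports an exception raised inside a `__del__` method: the exception cannot propagate, so the exit status is unaffected. The sketch below reproduces that mechanism with the standard library only. The `Noisy` class and the child script are illustrative and are not taken from pathos or multiprocess.

```python
import subprocess
import sys
import textwrap

# Child script: an object whose finalizer raises. The object is still alive
# when the script ends, so __del__ runs during interpreter shutdown.
CHILD = textwrap.dedent('''
    class Noisy:
        def __del__(self):
            raise RuntimeError("raised inside __del__")

    obj = Noisy()
''')

# Run the child in a fresh interpreter and capture what it writes to stderr.
result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
)

print("return code:", result.returncode)
print(result.stderr)
```

The child is expected to exit with status 0 while still writing an "Exception ignored ..." report to stderr, which matches the symptom described here.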

The involved libraries should be relatively up-to-date.

pathos==0.2.9
dill==0.3.5.1

havardox commented 2 years ago

Happens to me too. This simple program gives me the exact same error:

import pathos

with pathos.pools.ProcessPool(4) as pool:
    print("hello world")

Output

hello world
Exception ignored in: <function Pool.__del__ at 0x0000018B6EFE9AB0>
Traceback (most recent call last):
  File "C:\Users\harald\AppData\Local\pypoetry\Cache\virtualenvs\pcpartfinder-sLasAYUh-py3.10\lib\site-packages\multiprocess\pool.py", line 268, in __del__
  File "C:\Users\harald\AppData\Local\pypoetry\Cache\virtualenvs\pcpartfinder-sLasAYUh-py3.10\lib\site-packages\multiprocess\queues.py", line 375, in put
  File "C:\Users\harald\AppData\Local\pypoetry\Cache\virtualenvs\pcpartfinder-sLasAYUh-py3.10\lib\site-packages\multiprocess\reduction.py", line 54, in dumps
  File "C:\Users\harald\AppData\Local\pypoetry\Cache\virtualenvs\pcpartfinder-sLasAYUh-py3.10\lib\site-packages\multiprocess\reduction.py", line 42, in __init__
  File "C:\Users\harald\AppData\Local\pypoetry\Cache\virtualenvs\pcpartfinder-sLasAYUh-py3.10\lib\site-packages\dill\_dill.py", line 573, in __init__
ImportError: sys.meta_path is None, Python is likely shutting down

Running Windows 10, pathos 0.2.9, Python 3.10.6

xiay-lcw commented 2 years ago

Same here for

Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux

pathos 0.2.9

Ubuntu 20.04.5 LTS

xiay-lcw commented 2 years ago

Adding a manual pool.close() call before the process exits seems to solve the issue. It looks like a Python resource cleanup ordering problem.

mmckerns commented 2 years ago

@xiay-lcw: that makes sense. I'll have to check whether the with context is missing a close or something like that.

jaanckae commented 1 year ago

This issue still persists. Are there any updates planned?

Exception ignored in: <function Pool.__del__ at 0x7f5b0a0a7820>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/multiprocess/pool.py", line 268, in __del__
  File "/usr/local/lib/python3.9/dist-packages/multiprocess/queues.py", line 374, in put
  File "/usr/local/lib/python3.9/dist-packages/multiprocess/reduction.py", line 54, in dumps
  File "/usr/local/lib/python3.9/dist-packages/multiprocess/reduction.py", line 42, in __init__
  File "/usr/local/lib/python3.9/dist-packages/dill/_dill.py", line 573, in __init__
ImportError: sys.meta_path is None, Python is likely shutting down

Manually adding a pool.close() is not possible in my use case.
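
Where the pool cannot be closed at the call site, one possible alternative is to register the cleanup with `atexit`, so that it runs before the interpreter starts tearing down modules. The following is a minimal sketch, assuming only that the pool object exposes `close()` and `join()`, as pathos pools do. The helper name `close_at_exit` is hypothetical, and this has not been verified against the apptainer setup described in this thread.

```python
import atexit


def close_at_exit(pool):
    """Register a best-effort cleanup for `pool` to run at interpreter exit.

    `pool` is any object with close() and join() methods.
    Returns the registered callback so it can be unregistered later.
    """
    def _cleanup():
        pool.close()  # stop accepting new work
        pool.join()   # wait for the workers to finish

    atexit.register(_cleanup)
    return _cleanup
```

With pathos, the call would look like `close_at_exit(pathos.pools.ProcessPool(4))`, made once right after the pool is created.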

Thanks in advance,
Jasper

mmckerns commented 1 year ago

I'm unable to reproduce this error.

I've checked the most recent versions of python 3.7 - 3.11 with the most recent versions of dill, multiprocess, and pathos, and I don't see the error on Ubuntu 16.04 LTS, MacOS 10.14.6, or Windows 8.

Python 3.11.0rc2 (main, Sep 15 2022, 12:55:34) [Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathos
>>> 
>>> with pathos.pools.ProcessPool(4) as pool:
...     print("hello world")
... 
hello world

How is the code being run? In the interpreter? In a file? In a jupyter notebook?

mmckerns commented 1 year ago

may be related to #208...

jaanckae commented 1 year ago

Running the code (pathos 0.2.9) in an apptainer instance with Python 3.9 on Debian

mmckerns commented 1 year ago

@jaanckae (and anyone else): what do you expect to happen to the pool within a with context? Do you expect it to close upon exiting the context? Currently, it does nothing. If I were to add a close(), then the pool would close... however, any new pool instance that is created with the same configuration would also be closed until restart() is called. Another option is to clear() the pool, which closes the pool then deletes the singleton... which would enable the creation of a new open pool -- however closing it could adversely affect any of the other handles to the pool singleton. Given the above, I eventually decided that the current behavior (not closing the pool) is the best on exit. What do you think?

Also, given some googling, I'm thinking that the current issue stems from the apptainer (and other containers people are using). Generally the error is seen when working with selenium and similar.

jaanckae commented 1 year ago

I'm just looking for a way to continue my workflow. If I can safely proceed with ignoring this bug, then ok. But I'm not entirely sure this will not affect the rest of my code.

What would be your suggestion in order to use it in combination with apptainer?

mmckerns commented 1 year ago

@jaanckae: apparently it's an innocuous error, so you can likely ignore it. However, I believe you can use one of the two choices I mentioned above. Add a close or clear to the pool within the context. Try it and let me know if that removes the behavior you are seeing.
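
The two choices can also be packaged in a small wrapper so that the cleanup always runs when the block exits. This is a sketch rather than part of pathos: the name `closing_pool` is hypothetical, and it assumes only the `close()`, `join()`, and `clear()` methods mentioned in this thread.

```python
from contextlib import contextmanager


@contextmanager
def closing_pool(pool, clear=True):
    """Yield `pool`, then close it (and optionally clear it) on exit."""
    try:
        yield pool
    finally:
        pool.close()  # stop accepting new work
        pool.join()   # wait for the workers to finish
        if clear:
            # For a pathos pool, this drops the cached singleton so that
            # the next pool created with the same configuration is fresh.
            pool.clear()
```

Usage would be `with closing_pool(pathos.pools.ProcessPool(4)) as pool: ...`. Passing `clear=False` leaves the singleton in place, in which case `restart()` is needed before the pool can be used again.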

bluenote10 commented 1 year ago

> I'm unable to reproduce this error. [...] How is the code being run? In the interpreter? In a file? In a jupyter notebook?

This happens at the end of the process, so it is best to place the lines of my example in a file.

> Do you expect it to close upon exiting the context?

My natural expectation was that it is calling close on the pool context, because implicitly calling "close" is like the number one use case / responsibility of context managers.

In fact, I was very surprised to hear that calling close manually makes a difference, because I thought that was the purpose of using a context manager in the first place. It would feel a bit unidiomatic if the pattern of context manager plus manual close were required as a workaround.

> Currently, it does nothing.

Maybe I'm missing something, but what's the motivation of using a context manager then?

If I understand it correctly, the implications of calling close would only affect advanced use cases involving multiple pools. Personally I haven't encountered any such use case, so I'd be biased towards making the standard use case of a single pool behave more intuitively, i.e., close it.

mmckerns commented 1 year ago

@bluenote10: thanks for your input. Each of the pathos.pools are singletons, so if a ProcessPool is closed, then any "new" ProcessPool is also closed until either clear or restart is used. I say "new" in quotes because a new instance is actually just the previously created instance. I don't like a context manager calling close... because, any new pool instance will already be closed. To me, that is not intuitive at all... which is why I didn't close on __exit__ in the first place. How would you feel about a clear instead, which would then give a behavior that more mirrors a context manager for multiprocessing? Essentially, the pool is closed on __exit__ and cleared from memory, so creating a "new" pool would then actually be a new instance. In that case, you couldn't use context managers if you wanted to use multiple pools efficiently (and you'd probably not want to use an asynchronous workflow for multiple pools). I agree that currently the behavior of a context manager is a bit weird/unexpected, but I mostly wanted to give people the option of using that syntax. I'm on the fence about where the "expected" behavior break should be...
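
The difference between close, restart, and clear on a cached pool can be illustrated with a toy model. The `ToyPool` class below is not pathos code. It is a hypothetical stand-in that caches shared state per configuration and only mimics the behavior described in this comment.

```python
class ToyPool:
    """Toy stand-in for a pool cached per configuration (not pathos code)."""

    _cache = {}  # maps configuration (node count) -> shared state

    def __init__(self, nodes=4):
        self.nodes = nodes
        # Reuse the cached state for this configuration if one exists.
        self._state = ToyPool._cache.setdefault(nodes, {"open": True})

    @property
    def is_open(self):
        return self._state["open"]

    def close(self):
        self._state["open"] = False

    def restart(self):
        self._state["open"] = True

    def clear(self):
        # Close the shared state, then forget it so the next pool is fresh.
        self.close()
        ToyPool._cache.pop(self.nodes, None)


a = ToyPool(4)
a.close()
b = ToyPool(4)                   # a "new" pool with the same configuration
new_pool_is_closed = not b.is_open

b.restart()                      # reopens the shared state for every handle
reopened_for_a = a.is_open

a.clear()                        # close and drop the cached state
c = ToyPool(4)                   # now genuinely fresh
fresh_pool_is_open = c.is_open
old_handle_is_closed = not a.is_open
```

In this model all four flags end up `True`: a pool created after close is already closed, restart reopens it for every handle, and clear yields a fresh pool while leaving older handles closed.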

bluenote10 commented 1 year ago

> How would you feel about a clear instead, which would then give a behavior that more mirrors a context manager for multiprocessing?

Yes, that sounds like the most sensible solution. If I understand it correctly, it would nicely handle the basic case of a single pool, and it avoids the two rather unexpected behaviours:

In that sense ProcessPool.clear has more "close-like semantics" than close itself ;)

mmckerns commented 1 year ago

In testing, adding a clear on __exit__ yields fairly sensible behavior... (or at least, it mimics multiprocessing in many but not all cases). However, in certain cases the pool hangs with a context. So, I need to figure out what's causing it to hang when clear is used in __exit__.

Just FYI: as pathos.pools are singletons, a with context can also be entered on an existing pool, like this, and the behavior should be no different than with a "new" pool:

Python 3.7.15 (default, Oct 12 2022, 04:11:53) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathos.pools as pp
>>> import math
>>> p = pp.ProcessPool()
>>> p.map(math.sin, [1,2])
[0.8414709848078965, 0.9092974268256817]
>>> with p:
...   p.map(math.sin, [1,2])
... 
[0.8414709848078965, 0.9092974268256817]