pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Make multiprocessing import local to support pyodide #7519

Open twiecki opened 1 month ago

twiecki commented 1 month ago

Description

We almost have PyMC working natively under Pyodide. On import, however, we get an error from importing multiprocessing, which isn't available in Pyodide:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[5], line 1
----> 1 import pymc as pm
      2 pm.__version__

File /lib/python3.12/site-packages/pymc/__init__.py:72
     70 from pymc.pytensorf import *
     71 from pymc.sampling import *
---> 72 from pymc.smc import *
     73 from pymc.stats import *
     74 from pymc.step_methods import *

File /lib/python3.12/site-packages/pymc/smc/__init__.py:16
      1 #   Copyright 2024 The PyMC Developers
      2 #
      3 #   Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     12 #   See the License for the specific language governing permissions and
     13 #   limitations under the License.
     15 from pymc.smc.kernels import IMH, MH
---> 16 from pymc.smc.sampling import sample_smc
     18 __all__ = ("sample_smc",)

File /lib/python3.12/site-packages/pymc/smc/sampling.py:21
     18 import warnings
     20 from collections import defaultdict
---> 21 from concurrent.futures import ProcessPoolExecutor, wait
     22 from typing import Any
     24 import cloudpickle

File /lib/python312.zip/concurrent/futures/__init__.py:44, in __getattr__(name)
     41 global ProcessPoolExecutor, ThreadPoolExecutor
     43 if name == 'ProcessPoolExecutor':
---> 44     from .process import ProcessPoolExecutor as pe
     45     ProcessPoolExecutor = pe
     46     return pe

File /lib/python312.zip/concurrent/futures/process.py:55
     52 # This import is required to load the multiprocessing.connection submodule
     53 # so that it can be accessed later as `mp.connection`
     54 import multiprocessing.connection
---> 55 from multiprocessing.queues import Queue
     56 import threading
     57 import weakref

File /lib/python312.zip/multiprocessing/queues.py:23
     19 import errno
     21 from queue import Empty, Full
---> 23 import _multiprocessing
     25 from . import connection
     26 from . import context

ModuleNotFoundError: No module named '_multiprocessing'

If we made that import optional, PyMC would work out of the box.
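
A minimal sketch of what a local import could look like, assuming the process pool is only needed when more than one core is requested. `run_tasks` and its parameters are hypothetical names for illustration, not PyMC's actual API:

```python
from typing import Any, Callable, Iterable


def run_tasks(func: Callable[[Any], Any], items: Iterable[Any], cores: int = 1) -> list:
    """Apply ``func`` to each item, using worker processes only when cores > 1."""
    items = list(items)
    if cores <= 1:
        # Serial path: never touches multiprocessing, so it also works where
        # the _multiprocessing extension module is unavailable.
        return [func(item) for item in items]

    # Local import: a ModuleNotFoundError can only surface here, at call time,
    # instead of when the enclosing package is imported.
    from concurrent.futures import ProcessPoolExecutor

    with ProcessPoolExecutor(max_workers=cores) as pool:
        return list(pool.map(func, items))
```

With this shape, importing the module succeeds everywhere, and only a call with `cores > 1` would fail on a platform without process support.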

ricardoV94 commented 1 month ago

I don't think it makes sense to go out of our way to make a Python standard-library import optional. It's an anti-pattern to import things inside functions.

twiecki commented 1 month ago

I agree it's an anti-pattern, but I think it certainly makes sense here because with a pretty simple change we enable PyMC in the browser. So I think the added tech debt is worth it. We can also make it optional if we're running on pyodide.

ricardoV94 commented 1 month ago

> We can also make it optional if we're running on pyodide.

What does that look like?

twiecki commented 1 month ago

if 'pyodide' in sys.modules:...
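
Spelled out a little more, a platform guard along these lines might look as follows. This is only a sketch: `running_under_pyodide` is a hypothetical helper, and setting `ProcessPoolExecutor` to `None` is an assumed convention that callers would have to check before using it.

```python
import sys


def running_under_pyodide() -> bool:
    """Best-effort check for a Pyodide/Emscripten runtime."""
    return sys.platform == "emscripten" or "pyodide" in sys.modules


if running_under_pyodide():
    # No process-based parallelism here; callers must fall back to serial code.
    ProcessPoolExecutor = None
else:
    from concurrent.futures import ProcessPoolExecutor
```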

ricardoV94 commented 1 month ago

That's even uglier

twiecki commented 1 month ago

So what do we do?

ricardoV94 commented 1 month ago

Local import or nothing. Should an issue be opened with Pyodide as well?

twiecki commented 1 month ago

OK, I don't mind which way. What issue would that be?

ricardoV94 commented 1 month ago

> OK, I don't mind which way. What issue would that be?

Shouldn't Pyodide support the Python standard library?

twiecki commented 1 month ago

> Shouldn't pyodide support the python standard library?

I don't think it's an oversight, they just haven't figured out multiprocessing yet.

adithyalaks commented 2 weeks ago

Looking at this from the PyData NYC sprint. I don't really see how we could unblock this, given that multiprocessing is used in various other modules in PyMC. Would we make the import local in all of the modules that use it?

twiecki commented 2 weeks ago

Where else is it being used? I guess that'd be the only way then; it depends on how many modules use it.
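
One way to answer that question is to scan the source for module-level imports. The sketch below uses the standard `ast` module; `top_level_imports` and `BLOCKED` are hypothetical names, and the scan is deliberately simple: it only inspects statements directly at module level, so imports nested in `try`/`if` blocks, or written as `from concurrent import futures`, are not reported.

```python
import ast

# Modules assumed to be unavailable (or partly unavailable) under Pyodide.
BLOCKED = ("multiprocessing", "concurrent.futures")


def top_level_imports(source: str, targets: tuple = BLOCKED) -> list:
    """Return (line number, module name) for module-level imports of ``targets``."""

    def matches(name: str) -> bool:
        return any(name == t or name.startswith(t + ".") for t in targets)

    hits = []
    for node in ast.parse(source).body:  # direct module-level statements only
        if isinstance(node, ast.Import):
            hits += [(node.lineno, alias.name) for alias in node.names if matches(alias.name)]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if matches(node.module):
                hits.append((node.lineno, node.module))
    return hits
```

Running this over each file found by `pathlib.Path("pymc").rglob("*.py")` (the path is assumed to be a local checkout) would give a rough count of the modules that would need a local import.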