PyLops / pylops

PyLops – A Linear-Operator Library for Python
https://pylops.readthedocs.io
GNU Lesser General Public License v3.0
431 stars 102 forks source link

ImportError on default Colab session - CPU Runtime #548

Closed alex-rakowski closed 11 months ago

alex-rakowski commented 1 year ago

Importing pylops in the default session of Colab with a CPU runtime leads to an ImportError. Colab now has cupy-cuda11x==11.0.0 installed in the default environment. Consequently your test fails to pick up that trying to import cupy will lead to an ImportError

cupy_enabled = (
    util.find_spec("cupy") is not None and int(os.getenv("CUPY_PYLOPS", 1)) == 1
)

This can be solved in colab by adding:

import os
os.environ["CUPY_PYLOPS"] = "0"
import pylops

This isn't the most robust and wouldn't help any packages using pylops as a dependency.

I was going to PR a change along the lines of, but it would be great to get some feedback first

try:
  import cupy
  cupy_enabled = (
  util.find_spec("cupy") is not None and int(os.getenv("CUPY_PYLOPS", 1)) == 1 # not sure this would be required anymore
  )
except (ModuleNotFound, ImportError):
  cupy_enabled = False

Steps to reproduce:

In a new notebook on google colab, with a new CPU runtime:

! pip install pylops
import pylops
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/cupy/__init__.py](https://5pxqthg4jr-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab_20231120-060114_RC00_583990929#) in <module>
     17 try:
---> 18     from cupy import _core  # NOQA
     19 except ImportError as exc:

10 frames
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

ImportError                               Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/cupy/__init__.py](https://5pxqthg4jr-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab_20231120-060114_RC00_583990929#) in <module>
     18     from cupy import _core  # NOQA
     19 except ImportError as exc:
---> 20     raise ImportError(f'''
     21 ================================================================
     22 {_environment._diagnose_import_error()}

ImportError: 
================================================================
Failed to import CuPy.

If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.

On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.

Check the Installation Guide for details:
  https://docs.cupy.dev/en/latest/install.html

Original error:
  ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
================================================================
mrava87 commented 1 year ago

Hello, thanks for reporting this, I just noticed it.

I think this is stupid of Colab as what is the point to install Cupy on a machine without a GPU (it is like putting F1 tires on a small car ;) ). Luckily we have a simple work around:

Do:

! pip install pylops
import pylops

and

import os
os.environ['CUPY_PYLOPS'] = '0'

and then everything will work file (see here for more details https://pylops.readthedocs.io/en/latest/gpu.html)

alex-rakowski commented 1 year ago

I agree it seems like a weird environment choice, but I have collaborators who have an HPC environment similar to this, so I think it goes beyond just Colab. I'm not sure what the advantages are, but presumably, there must be some.

I worked out the fix of using environment variables, and it makes sense when using pylops directly, but when importing a package with pylops as a dependency, knowing pylops was the cause of the error was non-obvious.

The proposed fix I suggested would safeguard against this a little bit more than find_spec does. This would also help with environments with cupy installed but with incorrect drivers/cuda etc.

mrava87 commented 1 year ago

Sorry, my phone showed me just part of your message so I didn't realize you had proposed a solution. For sure, you have identified a problem that we never thought about before, and we should make our code able to run as dependency of other projects in environments like Colab :)

Alright, a couple of questions:

alex-rakowski commented 1 year ago

Alright, a couple of questions:

The error message raised is not very helpful which is one of the reasons it took so long to patch, I assumed it was cupy related and didnt think to check pylops. I can follow up on the error message raised on HPC, but we found a workaround for their problem for them so not sure if they'll get back to us with the error messages.

I found the error in colab with the following:

import traceback

try:
  import pylops
except Exception:
    print(traceback.format_exc())

Which outputs:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/cupy/__init__.py", line 18, in <module>
    from cupy import _core  # NOQA
  File "/usr/local/lib/python3.10/dist-packages/cupy/_core/__init__.py", line 3, in <module>
    from cupy._core import core  # NOQA
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<ipython-input-3-94acb765c275>", line 4, in <cell line: 3>
    import pylops
  File "/usr/local/lib/python3.10/dist-packages/pylops/__init__.py", line 49, in <module>
    from .linearoperator import *
  File "/usr/local/lib/python3.10/dist-packages/pylops/linearoperator.py", line 34, in <module>
    from pylops.optimization.basic import cgls
  File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/basic.py", line 9, in <module>
    from pylops.optimization.cls_basic import CG, CGLS, LSQR
  File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/cls_basic.py", line 12, in <module>
    from pylops.optimization.basesolver import Solver
  File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/basesolver.py", line 8, in <module>
    from pylops.optimization.callback import Callbacks
  File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/callback.py", line 8, in <module>
    from pylops.utils.metrics import mae, mse, psnr, snr
  File "/usr/local/lib/python3.10/dist-packages/pylops/utils/__init__.py", line 2, in <module>
    from .backend import *
  File "/usr/local/lib/python3.10/dist-packages/pylops/utils/backend.py", line 35, in <module>
    import cupy as cp
  File "/usr/local/lib/python3.10/dist-packages/cupy/__init__.py", line 20, in <module>
    raise ImportError(f'''
ImportError: 
================================================================
Failed to import CuPy.

If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.

On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.

Check the Installation Guide for details:
  https://docs.cupy.dev/en/latest/install.html

Original error:
  ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
================================================================
  • I guess your suggestion would go here (https://github.com/PyLops/pylops/blob/cc52ae8dacbb7310f8c54f8fb8666f9e689f66c9/pylops/utils/deps.py#L18 ) and probably we should do the same for cusignal? even better we can follow the same pattern as all the other optional dependencies. Have the cupy_enabled unchanged, and add a function called cupy_import where the try/except catch is placed, then in the backend.py file do what is done in for example kirchhoff.py here jit_message = deps.numba_import("the kirchhoff module"). What do you think?

This is where I was thinking of making the edit, as it looks like this is where it propagates the cupy state to the rest of the package. I would think this would work for cusignal and other optional dependencies. It could be done with importlib.load_module so it could be done programmatically rather than manually writing try/except block for each optional dependency, and importlib can scrape optional dependencies from package metadata which might help to avoid maintaining requirements/optional dependencies in multiple locations.

mrava87 commented 1 year ago

Thanks for the tip about traceback, I wasn't familiar with it. And indeed, the error happens where I expected it to happen, but it is strange the standard error stops at cupy and doesn't propagate all the way to where the error originated.

I suggest you try to make a PR with your suggested solution (using importlib.load_module) and we can take it from there. I would likely ask another main dev for his opinion and once we all agree propagate whatever solution we choose to all other optional dependencies so we don't start creating multiple strategies ;)

mrava87 commented 1 year ago

One additional point: reading about importlib.load_module I see the following comment in https://docs.python.org/3/library/importlib.html

Deprecated since version 3.4: The recommended API for loading a module is [exec_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) (and [create_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.create_module)). Loaders should implement it instead of [load_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module). The import machinery takes care of all the other responsibilities of [load_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module) when [exec_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) is implemented.

I don't think we should use deprecated methods as sooner than later those will be removed in new versions of import.. maybe exec_module() could do.

alex-rakowski commented 1 year ago

One additional point: reading about importlib.load_module I see the following comment in https://docs.python.org/3/library/importlib.html

Deprecated since version 3.4: The recommended API for loading a module is [exec_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) (and [create_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.create_module)). Loaders should implement it instead of [load_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module). The import machinery takes care of all the other responsibilities of [load_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module) when [exec_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) is implemented.

I don't think we should use deprecated methods as sooner than later those will be removed in new versions of import.. maybe exec_module() could do.

good catch, I should have typed importlib.import_module which replicates import more closely and isn't deprecated

alex-rakowski commented 1 year ago

One additional point: reading about importlib.load_module I see the following comment in https://docs.python.org/3/library/importlib.html

Deprecated since version 3.4: The recommended API for loading a module is [exec_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) (and [create_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.create_module)). Loaders should implement it instead of [load_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module). The import machinery takes care of all the other responsibilities of [load_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module) when [exec_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) is implemented.

I don't think we should use deprecated methods as sooner than later those will be removed in new versions of import.. maybe exec_module() could do.

Awesome, I'll start a draft PR with a rough outline for what I envisaged, and you can tag whoever else should look it over, and we can discuss the details there. I haven't PR'd into this repo before, so the first commits might not be up to scratch but will just serve as an initial outline. The contributing docs look detailed, so hopefully, it'll be straightforward to get it up to standard once you're/we're happy with the code functionality