Closed alex-rakowski closed 11 months ago
Hello, thanks for reporting this, I just noticed it.
I think this is stupid of Colab as what is the point to install Cupy on a machine without a GPU (it is like putting F1 tires on a small car ;) ). Luckily we have a simple work around:
Do:
! pip install pylops
import pylops
and
import os
os.environ['CUPY_PYLOPS'] = '0'
and then everything will work file (see here for more details https://pylops.readthedocs.io/en/latest/gpu.html)
I agree it seems like a weird environment choice, but I have collaborators who have an HPC environment similar to this, so I think it goes beyond just Colab. I'm not sure what the advantages are, but presumably, there must be some.
I worked out the fix of using environment variables, and it makes sense when using pylops
directly, but when importing a package with pylops as a dependency, knowing pylops
was the cause of the error was non-obvious.
The proposed fix I suggested would safeguard against this a little bit more than find_spec
does. This would also help with environments with cupy
installed but with incorrect drivers/cuda etc.
Sorry, my phone showed me just part of your message so I didn't realize you had proposed a solution. For sure, you have identified a problem that we never thought about before, and we should make our code able to run as dependency of other projects in environments like Colab :)
Alright, a couple of questions:
pylops
on colab, as the error message doesn't seem to point to any line in pylops but just cupy... which is very strange. I would assume it happens here https://github.com/PyLops/pylops/blob/cc52ae8dacbb7310f8c54f8fb8666f9e689f66c9/pylops/utils/backend.py#L35, but maybe in your collaborators HPC environment they can see in the error message where this occurs and confirm this is the case?cupy_enabled
unchanged, and add a function called cupy_import
where the try/except catch is placed, then in the backend.py file do what is done in for example kirchhoff.py here jit_message = deps.numba_import("the kirchhoff module")
. What do you think?Alright, a couple of questions:
- I am still a bit puzzled where the error arises when importing
pylops
on colab, as the error message doesn't seem to point to any line in pylops but just cupy... which is very strange. I would assume it happens here https://github.com/PyLops/pylops/blob/cc52ae8dacbb7310f8c54f8fb8666f9e689f66c9/pylops/utils/backend.py#L35 , but maybe in your collaborators HPC environment they can see in the error message where this occurs and confirm this is the case?
The error message raised is not very helpful which is one of the reasons it took so long to patch, I assumed it was cupy related and didnt think to check pylops
. I can follow up on the error message raised on HPC, but we found a workaround for their problem for them so not sure if they'll get back to us with the error messages.
I found the error in colab with the following:
import traceback
try:
import pylops
except Exception:
print(traceback.format_exc())
Which outputs:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/cupy/__init__.py", line 18, in <module>
from cupy import _core # NOQA
File "/usr/local/lib/python3.10/dist-packages/cupy/_core/__init__.py", line 3, in <module>
from cupy._core import core # NOQA
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<ipython-input-3-94acb765c275>", line 4, in <cell line: 3>
import pylops
File "/usr/local/lib/python3.10/dist-packages/pylops/__init__.py", line 49, in <module>
from .linearoperator import *
File "/usr/local/lib/python3.10/dist-packages/pylops/linearoperator.py", line 34, in <module>
from pylops.optimization.basic import cgls
File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/basic.py", line 9, in <module>
from pylops.optimization.cls_basic import CG, CGLS, LSQR
File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/cls_basic.py", line 12, in <module>
from pylops.optimization.basesolver import Solver
File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/basesolver.py", line 8, in <module>
from pylops.optimization.callback import Callbacks
File "/usr/local/lib/python3.10/dist-packages/pylops/optimization/callback.py", line 8, in <module>
from pylops.utils.metrics import mae, mse, psnr, snr
File "/usr/local/lib/python3.10/dist-packages/pylops/utils/__init__.py", line 2, in <module>
from .backend import *
File "/usr/local/lib/python3.10/dist-packages/pylops/utils/backend.py", line 35, in <module>
import cupy as cp
File "/usr/local/lib/python3.10/dist-packages/cupy/__init__.py", line 20, in <module>
raise ImportError(f'''
ImportError:
================================================================
Failed to import CuPy.
If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.
On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.
Check the Installation Guide for details:
https://docs.cupy.dev/en/latest/install.html
Original error:
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
================================================================
- I guess your suggestion would go here (https://github.com/PyLops/pylops/blob/cc52ae8dacbb7310f8c54f8fb8666f9e689f66c9/pylops/utils/deps.py#L18 ) and probably we should do the same for cusignal? even better we can follow the same pattern as all the other optional dependencies. Have the
cupy_enabled
unchanged, and add a function calledcupy_import
where the try/except catch is placed, then in the backend.py file do what is done in for example kirchhoff.py herejit_message = deps.numba_import("the kirchhoff module")
. What do you think?
This is where I was thinking of making the edit, as it looks like this is where it propagates the cupy
state to the rest of the package. I would think this would work for cusignal
and other optional dependencies. It could be done with importlib.load_module
so it could be done programmatically rather than manually writing try/except
block for each optional dependency, and importlib
can scrape optional dependencies from package metadata which might help to avoid maintaining requirements/optional dependencies in multiple locations.
Thanks for the tip about traceback
, I wasn't familiar with it. And indeed, the error happens where I expected it to happen, but it is strange the standard error stops at cupy and doesn't propagate all the way to where the error originated.
I suggest you try to make a PR with your suggested solution (using importlib.load_module
) and we can take it from there. I would likely ask another main dev for his opinion and once we all agree propagate whatever solution we choose to all other optional dependencies so we don't start creating multiple strategies ;)
One additional point: reading about importlib.load_module
I see the following comment in https://docs.python.org/3/library/importlib.html
Deprecated since version 3.4: The recommended API for loading a module is [exec_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) (and [create_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.create_module)). Loaders should implement it instead of [load_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module). The import machinery takes care of all the other responsibilities of [load_module()]
(https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module) when [exec_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) is implemented.
I don't think we should use deprecated methods as sooner than later those will be removed in new versions of import.. maybe exec_module()
could do.
One additional point: reading about
importlib.load_module
I see the following comment in https://docs.python.org/3/library/importlib.htmlDeprecated since version 3.4: The recommended API for loading a module is [exec_module()] (https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) (and [create_module()] (https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.create_module)). Loaders should implement it instead of [load_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module). The import machinery takes care of all the other responsibilities of [load_module()] (https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module) when [exec_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) is implemented.
I don't think we should use deprecated methods as sooner than later those will be removed in new versions of import.. maybe
exec_module()
could do.
good catch, I should have typed importlib.import_module
which replicates import
more closely and isn't deprecated
One additional point: reading about
importlib.load_module
I see the following comment in https://docs.python.org/3/library/importlib.htmlDeprecated since version 3.4: The recommended API for loading a module is [exec_module()] (https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) (and [create_module()] (https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.create_module)). Loaders should implement it instead of [load_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module). The import machinery takes care of all the other responsibilities of [load_module()] (https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module) when [exec_module()](https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.exec_module) is implemented.
I don't think we should use deprecated methods as sooner than later those will be removed in new versions of import.. maybe
exec_module()
could do.
Awesome, I'll start a draft PR with a rough outline for what I envisaged, and you can tag whoever else should look it over, and we can discuss the details there. I haven't PR'd into this repo before, so the first commits might not be up to scratch but will just serve as an initial outline. The contributing docs look detailed, so hopefully, it'll be straightforward to get it up to standard once you're/we're happy with the code functionality
Importing
pylops
in the default session of Colab with a CPU runtime leads to anImportError
. Colab now hascupy-cuda11x==11.0.0
installed in the default environment. Consequently your test fails to pick up that trying to import cupy will lead to anImportError
This can be solved in colab by adding:
This isn't the most robust and wouldn't help any packages using
pylops
as a dependency.I was going to PR a change along the lines of, but it would be great to get some feedback first
Steps to reproduce:
In a new notebook on google colab, with a new CPU runtime: