state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

AMD ROCm Autotrain failed due to ImportError: libc10_cuda.so: cannot open shared object file: No such file or directory #544

Open unclemusclez opened 1 month ago

unclemusclez commented 1 month ago

Environment: AMD ROCm 6.1.3 on WSL2, official PyTorch 2.1.2. See https://rocm.blogs.amd.com/artificial-intelligence/mamba/README.html

ROCm users on WSL2 have to use this version of Python. It is the only officially supported one, and it is the only one that works with everything, as far as it can.

ERROR    | 2024-08-27 22:02:47 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1659, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/models/mamba2/modeling_mamba2.py", line 42, in <module>
    if is_mamba_2_ssm_available():
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 408, in is_mamba_2_ssm_available
    import mamba_ssm
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/mamba_ssm/__init__.py", line 3, in <module>
    from mamba_ssm.ops.selective_scan_interface import selective_scan_fn, mamba_inner_fn
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/mamba_ssm/ops/selective_scan_interface.py", line 16, in <module>
    import selective_scan_cuda
ImportError: libc10_cuda.so: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/autotrain/trainers/common.py", line 117, in wrapper
    return func(*args, **kwargs)
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/autotrain/trainers/sent_transformers/__main__.py", line 158, in train
    model = SentenceTransformer(
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 299, in __init__
    modules = self._load_auto_model(
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 1324, in _load_auto_model
    transformer_model = Transformer(
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py", line 54, in __init__
    self._load_model(model_name_or_path, config, cache_dir, **model_args)
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py", line 85, in _load_model
    self.auto_model = AutoModel.from_pretrained(
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    model_class = _get_model_class(config, cls._model_mapping)
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 384, in _get_model_class
    supported_models = model_mapping[type(config)]
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 735, in __getitem__
    return self._load_attr_from_module(model_type, model_name)
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 749, in _load_attr_from_module
    return getattribute_from_module(self._modules[module_name], attr)
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 693, in getattribute_from_module
    if hasattr(module, attr):
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1649, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/musclez/autotrain/.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1661, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.mamba2.modeling_mamba2 because of the following error (look up to see its traceback):
libc10_cuda.so: cannot open shared object file: No such file or directory

ERROR    | 2024-08-27 22:02:47 | autotrain.trainers.common:wrapper:121 - Failed to import transformers.models.mamba2.modeling_mamba2 because of the following error (look up to see its traceback):
libc10_cuda.so: cannot open shared object file: No such file or directory
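
For anyone trying to reproduce this, here is a rough diagnostic sketch, not a fix. The traceback shows `selective_scan_cuda` asking for `libc10_cuda.so`, a library that a ROCm build of PyTorch would normally not ship (it is expected to provide `libc10_hip.so` instead). The helper names `describe_torch_build` and `diagnose` are hypothetical, and the assumption that the shared libraries live in a `lib` directory next to `torch/__init__.py` may not hold for every install.

```python
import importlib.util
from pathlib import Path


def describe_torch_build():
    """Collect facts about the installed torch build (hypothetical helper)."""
    spec = importlib.util.find_spec("torch")
    if spec is None or spec.origin is None:
        return {"installed": False}
    try:
        import torch
    except (ImportError, OSError):
        # torch is present on disk but cannot be loaded in this environment.
        return {"installed": False}

    # Assumption: bundled shared libraries sit in <site-packages>/torch/lib.
    lib_dir = Path(spec.origin).parent / "lib"
    return {
        "installed": True,
        "torch": torch.__version__,
        "hip": getattr(torch.version, "hip", None),    # set on ROCm builds
        "cuda": getattr(torch.version, "cuda", None),  # set on CUDA builds
        "has_libc10_cuda": (lib_dir / "libc10_cuda.so").exists(),
        "has_libc10_hip": (lib_dir / "libc10_hip.so").exists(),
    }


def diagnose(info):
    """Turn the collected facts into a one-line hint (hypothetical helper)."""
    if not info.get("installed"):
        return "torch is not importable in this environment"
    if info.get("hip") and not info.get("has_libc10_cuda"):
        return ("ROCm build of torch without libc10_cuda.so: an extension "
                "that needs it was probably built against CUDA")
    if info.get("cuda"):
        return "CUDA build of torch"
    return "CPU-only or unrecognized build of torch"
```

Running `print(diagnose(describe_torch_build()))` inside the failing venv should show whether the installed `mamba_ssm` extension and the installed torch disagree about the GPU backend. If they do, rebuilding `mamba_ssm` from source against the ROCm torch is the likely next step, though I have not verified that on WSL2.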