open-mmlab / mmagic

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
https://mmagic.readthedocs.io/en/latest/
Apache License 2.0
6.94k stars 1.06k forks

[Bug] #1993

Open mycomedico opened 1 year ago

mycomedico commented 1 year ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmagic

Environment

My GPU is an AMD Radeon 6600, so I needed to install PyTorch with ROCm in order to use it. I installed it in a conda environment with pip:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2

The rest I installed as per mmagic installation instructions:

conda install cudatoolkit=11.3 -c pytorch
mim install 'mmcv>=2.0.0'
mim install 'mmengine'
mim install 'mmagic'
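
Note that the steps above combine a ROCm build of PyTorch with cudatoolkit, which is a CUDA package. One quick way to see which backend the installed PyTorch wheel was built for is to read torch.version.hip and torch.version.cuda. The following is a minimal sketch, not part of the original report; the helper name describe_backend is hypothetical.

```python
def describe_backend(cuda_version, hip_version):
    """Classify a PyTorch build from its torch.version.cuda / torch.version.hip values."""
    if hip_version:
        return f"ROCm/HIP build (HIP {hip_version})"
    if cuda_version:
        return f"CUDA build (CUDA {cuda_version})"
    return "CPU-only build"


try:
    import torch
except (ImportError, OSError) as exc:
    # A missing shared library (such as libc10_hip.so) surfaces here.
    print("could not import torch:", exc)
else:
    # torch.version.hip is None on CUDA and CPU builds; torch.version.cuda is None on ROCm builds.
    hip = getattr(torch.version, "hip", None)
    print(torch.__version__, "->", describe_backend(torch.version.cuda, hip))
    # ROCm builds reuse the torch.cuda namespace, so this also reports AMD GPUs.
    print("GPU visible:", torch.cuda.is_available())
```

If this reports a CUDA or CPU-only build, a later install step may have replaced the ROCm wheel, which would be worth ruling out before debugging further.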

The first time I tried running the quick run script from the command line and got the error:

python: can't open file 'demo/mmagic_inference_demo.py'

So I then tried running it from inside a Python shell, which worked even though I got a bunch of errors. Here is a screenshot of my terminal:

Screenshot from 2023-08-21 16-38-28

Now, when I try to run it again from either the shell or the command line, it won't work at all. Attached is a picture of my terminal errors:

Screenshot from 2023-08-21 17-18-33

Reproduces the problem - code sample

from mmagic.apis import MMagicInferencer

Reproduces the problem - command or script

from mmagic.apis import MMagicInferencer

Reproduces the problem - error message

/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/archs/__init__.py:45: UserWarning: Diffusion Models are not registered as expect. If you want to use diffusion models, please install diffusers>=0.12.0.
  warnings.warn('Diffusion Models are not registered as expect. '
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/__init__.py", line 2, in <module>
    from .mmagic_inferencer import MMagicInferencer
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/mmagic_inferencer.py", line 10, in <module>
    from .inferencers import Inferencers
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/inferencers/__init__.py", line 15, in <module>
    from .translation_inferencer import TranslationInferencer
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/inferencers/translation_inferencer.py", line 12, in <module>
    from mmagic.models.base_models import BaseTranslationModel
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/__init__.py", line 6, in <module>
    from .editors import *  # noqa: F401, F403
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/editors/__init__.py", line 5, in <module>
    from .basicvsr_plusplus_net import BasicVSRPlusPlusNet
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/editors/basicvsr_plusplus_net/__init__.py", line 2, in <module>
    from .basicvsr_plusplus_net import BasicVSRPlusPlusNet
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/editors/basicvsr_plusplus_net/basicvsr_plusplus_net.py", line 5, in <module>
    from mmcv.ops import ModulatedDeformConv2d, modulated_deform_conv2d
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
    from .active_rotated_filter import active_rotated_filter
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmcv/ops/active_rotated_filter.py", line 10, in <module>
    ext_module = ext_loader.load_ext(
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libc10_hip.so: cannot open shared object file: No such file or directory
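
The ImportError above means the dynamic loader could not locate libc10_hip.so. In ROCm wheels of PyTorch that library is normally expected inside the installed torch package directory (an assumption worth checking on your own install). The sketch below searches the torch package for the file without importing torch; the helper name find_shared_lib is hypothetical.

```python
import importlib.util
from pathlib import Path


def find_shared_lib(name, search_dirs):
    """Return the paths under search_dirs (searched recursively) whose file name equals `name`."""
    hits = []
    for d in search_dirs:
        root = Path(d)
        if root.is_dir():
            hits.extend(sorted(root.rglob(name)))
    return hits


# Locate the installed torch package without importing it (importing is what fails here).
spec = importlib.util.find_spec("torch")
if spec is not None and spec.submodule_search_locations:
    torch_dir = list(spec.submodule_search_locations)[0]
    found = find_shared_lib("libc10_hip.so", [torch_dir])
    print(found or "libc10_hip.so not found under " + torch_dir)
else:
    print("torch is not installed in this environment")
```

An empty result would suggest that the torch package currently installed is not a ROCm build, even if one was installed earlier.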

Additional information

No response

LeoXing1996 commented 1 year ago

Hey @mycomedico, I found a similar question on Stack Overflow.

Can you try running import torch in your Python terminal and let me know if any errors occur?

mycomedico commented 1 year ago

I found that Stack Overflow thread before I posted this and tried it lol. It made no difference.

LeoXing1996 commented 1 year ago

Does the same error occur when you import torch?

mycomedico commented 1 year ago

yes

Screenshot from 2023-08-21 20-31-38

LeoXing1996 commented 1 year ago

OK, this seems to be a problem in MMagic. I'll debug and fix it as soon as possible.

mycomedico commented 1 year ago

Let me know however I can help, e.g. if you need to remote in or something. Going to bed now but I will be available any evening, cheers.

LeoXing1996 commented 1 year ago

@mycomedico, I made some potentially relevant fixes in #1995. You can pull the latest main branch to see whether the errors still occur. If there is still an error, you can try building MMCV from source by following this url.

mycomedico commented 1 year ago

I won't be able to test this till tomorrow. I was testing mmagic today on my laptop instead of desktop and one thing I realized was that installing accelerate via pip install accelerate caused me some issues, so I am thinking that installing accelerate on my desktop as per the warning messages I received is part of the problem. Anyways, I will test tomorrow and let you know. thanks

mycomedico commented 1 year ago

OK, I just got a chance to do a fresh install of mmagic, which went well as I didn't see any errors, but I'm still not able to run the quick run example.

In a Python shell, the first import statement is accepted, but the second command, which creates an object, fails:

>>> sd_inferencer = MMagicInferencer(model_name='stable_diffusion')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/mmagic_inferencer.py", line 140, in __init__
    self.inferencer = Inferencers(
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/inferencers/__init__.py", line 80, in __init__
    self.inferencer = Text2ImageInferencer(
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/inferencers/base_mmagic_inferencer.py", line 57, in __init__
    register_all_modules()
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/utils/setup_env.py", line 25, in register_all_modules
    import mmagic.evaluation  # noqa: F401,F403
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/evaluation/__init__.py", line 2, in <module>
    from .evaluator import Evaluator
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/evaluation/evaluator.py", line 12, in <module>
    from .metrics.base_gen_metric import GenMetric
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/evaluation/metrics/__init__.py", line 12, in <module>
    from .niqe import NIQE, niqe
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/evaluation/metrics/niqe.py", line 12, in <module>
    from mmagic.datasets.transforms import MATLABLikeResize
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/datasets/transforms/__init__.py", line 2, in <module>
    from .albu_function import AlbuCorruptFunction, PairedAlbuTransForms
  File "/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/datasets/transforms/albu_function.py", line 4, in <module>
    import albumentations as albu
ModuleNotFoundError: No module named 'albumentations'

pip install albumentations seems to solve this
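
Several messages in this thread point at Python packages that were not installed (albumentations here, diffusers and accelerate in the warnings). A small sketch for checking them up front, using only the standard library; the helper name missing_modules is hypothetical and the package list is just the names mentioned in this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Packages named in the messages in this thread; adjust the list as needed.
print(missing_modules(["albumentations", "diffusers", "accelerate"]))
```

This only checks that a module can be located, not that it imports cleanly or has a compatible version.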

mycomedico commented 1 year ago

OK, it seems to work, but I did get some warnings, as follows:


/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/archs/wrapper.py:149: FutureWarning: Accessing config attribute `block_out_channels` directly via 'AutoencoderKL' object attribute is deprecated. Please access 'block_out_channels' over 'AutoencoderKL's config object instead, e.g. 'unet.config.block_out_channels'.
  return getattr(self.model, name)
08/24 14:37:17 - mmengine - INFO - Set UNet dtype to 'torch.float32' in the eval mode.
08/24 14:37:17 - mmengine - WARNING - Failed to search registry with scope "mmagic" in the "function" registry tree. As a workaround, the current "function" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmagic" is a correct scope, or whether the registry is initialized.
>>> text_prompts = 'A panda is having dinner at KFC'
>>> result_out_dir = 'output/sd_res.png'
>>> sd_inferencer.infer(text=text_prompts, result_out_dir=result_out_dir)
/home/rubing/miniconda3/envs/mmagic/lib/python3.8/site-packages/mmagic/models/archs/wrapper.py:149: FutureWarning: Accessing config attribute `in_channels` directly via 'UNet2DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet2DConditionModel's config object instead, e.g. 'unet.config.in_channels'.

Also, I still get the warning about the accelerate library, but I haven't installed it since it seems to break things.

Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

mycomedico commented 1 year ago

So, everything was working well for inference with a Stable Diffusion model, but when I tried a more challenging task, video super-resolution, I received a HIP memory error. I read a lot of forum threads and it seems the best solution for an AMD GPU encountering this error is to run a Docker image of PyTorch specifically configured for ROCm, AMD's alternative to CUDA. I am now running the Docker image but having problems installing mmagic. One big error I get when trying to install mmcv suggests that my GCC compiler is not appropriate. For example, when building wheels for mmcv I get this error: #error You need C++17 to compile PyTorch

I don't understand why I'm getting this error, because I upgraded to the latest GCC. Any advice?

mycomedico commented 1 year ago

A lot of the errors in the log look like this as well:

 [6/136] c++ -MMD -MF /tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/build/temp.linux-x86_64-3.8/mmcv/ops/csrc/pytorch/cpu/pixel_group.o.d -pthread -B /opt/conda/envs/mmagic/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DMMCV_WITH_HIP -DMMCV_WITH_CUDA -I/tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/pytorch -I/tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/common -I/tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/common/cuda -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/THC -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/THH -I/opt/rocm/include -I/opt/conda/envs/mmagic/include/python3.8 -c -c /tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/pytorch/cpu/pixel_group.cpp -o /tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/build/temp.linux-x86_64-3.8/mmcv/ops/csrc/pytorch/cpu/pixel_group.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0
      c++ -MMD -MF /tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/build/temp.linux-x86_64-3.8/mmcv/ops/csrc/pytorch/cpu/pixel_group.o.d -pthread -B /opt/conda/envs/mmagic/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DMMCV_WITH_HIP -DMMCV_WITH_CUDA -I/tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/pytorch -I/tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/common -I/tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/common/cuda -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/THC -I/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/THH -I/opt/rocm/include -I/opt/conda/envs/mmagic/include/python3.8 -c -c /tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/mmcv/ops/csrc/pytorch/cpu/pixel_group.cpp -o /tmp/pip-install-9a530d23/mmcv_f424a3acedeb4c34a250d6bed29713e9/build/temp.linux-x86_64-3.8/mmcv/ops/csrc/pytorch/cpu/pixel_group.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/c10/util/irange.h:50:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:12: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/ivalue_inl.h:1062:16: warning: structured bindings only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/Dispatch.h:56:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:8: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:108:10: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:214:10: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:574:10: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’
      /opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/include/ATen/core/dispatch/Dispatcher.h:593:6: warning: ‘if constexpr’ only available with ‘-std=c++17’ or ‘-std=gnu++17’

mycomedico commented 1 year ago

I was just able to install mmagic by installing only mmcv-lite and then using pip instead of mim to install mmagic, but unfortunately it doesn't seem to work without the full mmcv. This is what I get when trying to import from mmagic.apis:

(mmagic) root@60de6dc11764:~# python
Python 3.8.17 (default, Jul 5 2023, 21:04:15)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from mmagic.apis import MMagicInferencer
/root/.local/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /root/.local/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
  warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/__init__.py", line 2, in <module>
    from .inferencers.inference_functions import init_model
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/inferencers/__init__.py", line 15, in <module>
    from .translation_inferencer import TranslationInferencer
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/apis/inferencers/translation_inferencer.py", line 12, in <module>
    from mmagic.models.base_models import BaseTranslationModel
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/models/__init__.py", line 6, in <module>
    from .editors import *  # noqa: F401, F403
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/models/editors/__init__.py", line 5, in <module>
    from .basicvsr_plusplus_net import BasicVSRPlusPlusNet
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/models/editors/basicvsr_plusplus_net/__init__.py", line 2, in <module>
    from .basicvsr_plusplus_net import BasicVSRPlusPlusNet
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmagic/models/editors/basicvsr_plusplus_net/basicvsr_plusplus_net.py", line 5, in <module>
    from mmcv.ops import ModulatedDeformConv2d, modulated_deform_conv2d
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
    from .active_rotated_filter import active_rotated_filter
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmcv/ops/active_rotated_filter.py", line 10, in <module>
    ext_module = ext_loader.load_ext(
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "/opt/conda/envs/mmagic/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'mmcv._ext'

mycomedico commented 1 year ago

I just found this! https://www.linkedin.com/pulse/running-ml-inference-amd-gpu-rocm-part-ii-luxoft-serbia

These guys had exactly the same issues that I'm having and had to circumvent the use of mmcv by porting the model code to another computer with an NVIDIA GPU first. Any idea when mmcv will offer support for ROCm?

mycomedico commented 1 year ago

I just tried compiling mmcv from source using the following options:

MMCV_WITH_OPS=1 ROCM_HOME=/opt/rocm-5.6.0 python3 setup.py install 2> setuperr.txt

This failed as follows:

Successfully preprocessed all matching files.
/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/easy_install.py:156: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
Emitting ninja build file /root/mmcv/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Using envvar MAX_JOBS (32) as the number of workers...
Traceback (most recent call last):
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/opt/conda/envs/mmagic/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '32']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "setup.py", line 437, in <module>
    setup(
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/__init__.py", line 155, in setup
    return distutils.core.setup(**attrs)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 148, in setup
    return run_commands(dist)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 163, in run_commands
    dist.run_commands()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
    self.run_command(cmd)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/install.py", line 74, in run
    self.do_egg_install()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/install.py", line 116, in do_egg_install
    self.run_command('bdist_egg')
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 164, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
    self.run_command(cmdname)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/command/install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
    build_ext.build_extensions(self)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/opt/conda/envs/mmagic/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

mycomedico commented 1 year ago

It seems part of the problem here is that these modules require C++17 to compile, but the build is trying to compile with C++14 instead. So I edited setup.py to force the use of C++17; unfortunately it did not work and I'm still getting the same errors :( Should I post all of this in the mmcv repository instead?
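
The compile command quoted earlier in this thread passes -std=c++14, while the warnings from the PyTorch headers ask for -std=c++17. To confirm whether an edit to setup.py actually changed the flag, one option is to count the -std flags in the saved build output. This is a rough sketch; the function name summarize_build_log and the sample text are illustrative only.

```python
import re
from collections import Counter

# Matches flags such as -std=c++14 or -std=gnu++17.
STD_FLAG = re.compile(r"-std=(?:c|gnu)\+\+\w+")


def summarize_build_log(text):
    """Count -std flags on command lines and C++17-related diagnostics in a compiler log."""
    flags, cxx17_diagnostics = Counter(), 0
    for line in text.splitlines():
        if "warning:" in line or "error:" in line:
            # Diagnostics mention -std=c++17 too, so they are counted separately.
            if "c++17" in line.lower():
                cxx17_diagnostics += 1
        else:
            flags.update(STD_FLAG.findall(line))
    return flags, cxx17_diagnostics


sample = """\
c++ -O3 -c pixel_group.cpp -o pixel_group.o -std=c++14
irange.h:50:8: warning: 'if constexpr' only available with '-std=c++17' or '-std=gnu++17'
"""
flags, hits = summarize_build_log(sample)
print(dict(flags), hits)  # {'-std=c++14': 1} 1
```

If the summary still shows only -std=c++14 after the edit, the flag is probably being set somewhere other than the place that was changed.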

LeoXing1996 commented 1 year ago

Sorry for the late response. Can you try adding the path of a compiler that supports C++17 to your environment variables instead of changing setup.py?

mycomedico commented 1 year ago

I'm sorry, I'm not exactly sure how to do that. Is that something I set in my bash shell? Can you tell me explicitly what to do? Thanks!

LeoXing1996 commented 1 year ago

For example, if you want to add gcc-8.5.0 to your PATH, you can add the following line to your .bashrc:

export PATH=${GCC_HOME}/bin:$PATH
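
The export above assumes that GCC_HOME has already been set to the directory where that GCC is installed; the thread does not say where that is. To preview which compiler would be picked up once the directory is prepended to PATH, something like the following sketch can help. The helper name resolve_with_prefix is hypothetical.

```python
import os
import shutil


def resolve_with_prefix(tool, prefix_dir, base_path=None):
    """Return the executable `tool` that would run if prefix_dir were prepended to PATH."""
    base = os.environ.get("PATH", "") if base_path is None else base_path
    return shutil.which(tool, path=prefix_dir + os.pathsep + base)


gcc_home = os.environ.get("GCC_HOME")  # set by the user; this sketch does not guess a location
if gcc_home:
    # The build log in this thread invokes `c++`, so that name is checked as well.
    for tool in ("gcc", "g++", "c++"):
        print(tool, "->", resolve_with_prefix(tool, os.path.join(gcc_home, "bin")))
else:
    print("GCC_HOME is not set; currently on PATH:", shutil.which("gcc"), shutil.which("c++"))
```
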
mycomedico commented 1 year ago

I tried what you suggested and now get a new permissions error. I tried to overcome it by running the build with sudo, but in that case mmcv did not seem to be available to my user account. Here is the permissions error I now get when trying to compile mmcv as a regular user:

Total number of replaced kernel launches: 182
running install
/usr/local/lib/python3.10/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/setuptools/command/easy_install.py:156: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
error: can't create or remove files in install directory

The following error occurred while trying to add or remove files in the installation directory:

[Errno 13] Permission denied: '/usr/local/lib/python3.10/dist-packages/test-easy-install-31226.write-test'

The installation directory you specified (via --install-dir, --prefix, or the distutils default setting) was:

/usr/local/lib/python3.10/dist-packages/

Perhaps your account does not have write access to this directory? If the installation directory is a system-owned directory, you may need to sign in as the administrator or "root" account. If you do not have administrative access to this machine, you may wish to choose a different installation directory, preferably one that is listed in your PYTHONPATH environment variable.

For information on other options, you may wish to consult the documentation at:

https://setuptools.pypa.io/en/latest/deprecated/easy_install.html

Please make the appropriate changes for your system and try again.
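
The Errno 13 above says the current user cannot write to /usr/local/lib/python3.10/dist-packages. That path belongs to a system Python 3.10, whereas the earlier logs in this thread came from a conda environment on Python 3.8, so it may be worth checking which interpreter the build is running under. A minimal sketch that prints the interpreter, its default install directory, and whether that directory is writable; the helper name is_writable_dir is hypothetical.

```python
import os
import site
import sys
import sysconfig


def is_writable_dir(path):
    """True if `path` is an existing directory the current user may write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


print("interpreter:", sys.executable)
target = sysconfig.get_paths()["purelib"]  # default site-packages of this interpreter
print(target, "writable:", is_writable_dir(target))
print("per-user site-packages:", site.getusersitepackages())
```

If the default directory is not writable, installing into an activated conda or virtual environment, or using pip's --user option, are common ways to avoid writing to a system directory.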