Open zRzRzRzRzRzRzR opened 2 years ago
Hi! How do you install MMCV 2.0.0rc1? We have not provided a pre-built package for ROCm, so you need to compile MMCV 2.0 from source.
Thank you for your reply. I found the mmcv 2.0.0rc1 source code on the "Release" page, released on Aug 31, 2022, and tried the method from issue #1394:
(venv) /media/zr/Data/MMLAB_2.0/mmcv-2.0.0rc1 MMCV_WITH_OPS=1 pip install -e .
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Obtaining file:///media/zr/Data/MMLAB_2.0/mmcv-2.0.0rc1
Preparing metadata (setup.py) ... done
Requirement already satisfied: addict in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (2.4.0)
Requirement already satisfied: mmengine in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (0.1.0)
Requirement already satisfied: numpy in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (1.23.3)
Requirement already satisfied: packaging in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (21.3)
Requirement already satisfied: Pillow in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (9.2.0)
Requirement already satisfied: pyyaml in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (6.0)
Requirement already satisfied: yapf in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmcv==2.0.0rc1) (0.32.0)
Requirement already satisfied: matplotlib in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmengine->mmcv==2.0.0rc1) (3.6.0)
Requirement already satisfied: opencv-python>=3 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmengine->mmcv==2.0.0rc1) (4.6.0.66)
Requirement already satisfied: termcolor in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from mmengine->mmcv==2.0.0rc1) (2.0.1)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from packaging->mmcv==2.0.0rc1) (3.0.9)
Requirement already satisfied: fonttools>=4.22.0 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from matplotlib->mmengine->mmcv==2.0.0rc1) (4.37.4)
Requirement already satisfied: kiwisolver>=1.0.1 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from matplotlib->mmengine->mmcv==2.0.0rc1) (1.4.4)
Requirement already satisfied: python-dateutil>=2.7 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from matplotlib->mmengine->mmcv==2.0.0rc1) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from matplotlib->mmengine->mmcv==2.0.0rc1) (0.11.0)
Requirement already satisfied: contourpy>=1.0.1 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from matplotlib->mmengine->mmcv==2.0.0rc1) (1.0.5)
Requirement already satisfied: six>=1.5 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib->mmengine->mmcv==2.0.0rc1) (1.16.0)
Installing collected packages: mmcv
Running setup.py develop for mmcv
error: subprocess-exited-with-error
× python setup.py develop did not run successfully.
│ exit code: 1
╰─> [1064 lines of output]
If I use an older version of mmcv, it will break the mmyolo module, which requires mmcv 2.0.0rc1. Release 1.6.2 (latest) has the same problem. Compiling from source does not solve it. The error is below.
/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/torch/include/c10/util/complex.h:8:10: fatal error: 'thrust/complex.h' file not found
#include <thrust/complex.h>
^~~~~~~~~~~~~~~~~~
26 warnings and 1 error generated when compiling for gfx1030.
error: command '/opt/rocm-5.2.1/bin/hipcc' failed with exit code 1
What should I do?
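The fatal error above says that hipcc could not locate `thrust/complex.h`. On ROCm this header is normally shipped by the rocThrust package; that is an assumption about this setup, not something confirmed in the thread. A minimal sketch for checking whether the header exists anywhere under the ROCm root follows. `find_header` is a hypothetical helper, and `/opt/rocm-5.2.1` is the path taken from the log.

```python
from pathlib import Path


def find_header(root, relative_header="thrust/complex.h"):
    """Return every file under `root` whose path ends with `relative_header`."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    wanted = Path(relative_header).parts  # e.g. ("thrust", "complex.h")
    matches = []
    # Search by file name first, then keep only hits in the right directory.
    for candidate in root_path.rglob(wanted[-1]):
        if candidate.is_file() and candidate.parts[-len(wanted):] == wanted:
            matches.append(candidate)
    return sorted(matches)


if __name__ == "__main__":
    # Path assumed from the hipcc error message in this thread.
    hits = find_header("/opt/rocm-5.2.1")
    if hits:
        for hit in hits:
            print("found:", hit)
    else:
        print("thrust/complex.h not found under /opt/rocm-5.2.1")
```

If nothing is found, the header is probably not installed for this ROCm version, and no setting of `ROCM_HOME` alone would make the compile succeed.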
Hi, sorry for my late reply, you need to compile mmcv like this:
MMCV_WITH_OPS=1 ROCM_HOME=/opt/rocm-4.0.0 python3 setup.py install
where ROCM_HOME is the local path to your ROCm environment.
Following the method you provided does not solve the problem. cuda:0 cannot be found in either the virtual or the physical environment. My guess is that something goes wrong when the CUDA operator in the _ext module is called. The error traceback is the same as in the previous problem:
RuntimeError: nms_impl: implementation for device cuda:0 not found.
Exception raised from Dispatch at /tmp/mmcv/mmcv/ops/csrc/common/pytorch_device_registry.hpp:122 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f2684e43ab2 in /home/zr/.local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5b (0x7f2684e4014b in /home/zr/.local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: nms_impl(at::Tensor, at::Tensor, float, int) + 0xa97 (0x7f25534e2847 in /home/zr/.local/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
frame #3: nms(at::Tensor, at::Tensor, float, int) + 0x4f (0x7f25534e2fcf in /home/zr/.local/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x12515b (0x7f255352515b in /home/zr/.local/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x11224f (0x7f255351224f in /home/zr/.local/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
<omitting python frames>
frame #10: THPFunction_apply(_object*, _object*) + 0xb57 (0x7f2700d24937 in /home/zr/.local/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #57: <unknown function> + 0x29d90 (0x7f2723c29d90 in /lib/x86_64-linux-gnu/libc.so.6)
frame #58: __libc_start_main + 0x80 (0x7f2723c29e40 in /lib/x86_64-linux-gnu/libc.so.6)
I can't downgrade ROCm to a lower version. I'm using ROCm 5.2.1, so I can't tell whether this is a version problem.
The compilation output is shown below, and I think it completed successfully.
sudo MMCV_WITH_OPS=1 ROCM_HOME=/opt/rocm-5.2.1 python3 setup.py install
[sudo] password for zr:
Skip building ext ops due to the absence of torch.
running install
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
/usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.16.0-unknown is an invalid version and will not be supported in a future release
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.1build1 is an invalid version and will not be supported in a future release
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.1.43ubuntu1 is an invalid version and will not be supported in a future release
warnings.warn(
running bdist_egg
running egg_info
writing mmcv.egg-info/PKG-INFO
writing dependency_links to mmcv.egg-info/dependency_links.txt
writing requirements to mmcv.egg-info/requires.txt
writing top-level names to mmcv.egg-info/top_level.txt
reading manifest file 'mmcv.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
...
Using /usr/local/lib/python3.10/dist-packages/cycler-0.11.0-py3.10.egg
Searching for contourpy==1.0.5
Best match: contourpy 1.0.5
Processing contourpy-1.0.5-py3.10-linux-x86_64.egg
contourpy 1.0.5 is already the active version in easy-install.pth
Using /usr/local/lib/python3.10/dist-packages/contourpy-1.0.5-py3.10-linux-x86_64.egg
Finished processing dependencies for mmcv==2.0.0rc1
The mmcv Python package is also found in the local environment:
pip list | grep mmcv
mmcv 2.0.0rc1 /home/zr/.local/lib/python3.10/site-packages
It seems the build of the ext ops was skipped because of "the absence of torch".
Maybe it's the Python version or something. I was able to compile PyTorch in my local environment and it works fine. This still confuses me.
~ pip list | grep torch
torch 1.12.1+rocm5.1.1
torchaudio 0.12.1+rocm5.1.1
torchvision 0.13.1+rocm5.1.1
~ python
Python 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.get_device_properties(torch.device('cuda:0'))
_CudaDeviceProperties(name='AMD Radeon RX 6800 XT', major=10, minor=3, total_memory=16368MB, multi_processor_count=36)
>>> torch.cuda.is_available()
True
My guess is that the cause is my torch package, which I installed with pip from the official website. Next I'll try building torch from source to see whether that fixes the problem. Thank you for your kind help.
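One plausible reading of the log above, offered as a guess rather than a confirmed diagnosis: the build ran under `sudo`, whose interpreter uses `/usr/lib/python3/dist-packages`, while torch lives in the user's site-packages, so `setup.py` could not import it. A minimal sketch for checking which interpreter is running and whether it can see torch (`torch_visibility` is a hypothetical helper name):

```python
import importlib.util
import sys


def torch_visibility():
    """Report which interpreter is running and whether it can import torch."""
    # find_spec locates the package without actually importing it.
    spec = importlib.util.find_spec("torch")
    return {
        "executable": sys.executable,
        "torch_found": spec is not None,
        "torch_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in torch_visibility().items():
        print(f"{key}: {value}")
```

Saved as a script and run once as the normal user and once with `sudo python3`, this shows whether the two invocations see the same torch. If the `sudo` run reports `torch_found: False`, building without `sudo` inside the environment that holds torch is worth trying.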
Have you ever solved this problem? I can't get past it either.
I have the same problem as the author:
error: #include <thrust/complex.h>
^~~~~~
26 warnings and 1 error generated when compiling for gfx1030.
Hi, have you installed ROCm?
Describe the Issue
I created a new environment when setting up MMYOLO and configured it according to the documentation. When I run the demo program with the device set to CUDA, the following error is reported.
1 What command, code, or script did you run?
Environment
TorchVision: 0.13.1+rocm5.1.1
OpenCV: 4.6.0
MMEngine: 0.1.0
MMCV: 2.0.0rc1
MMDetection: 3.0.0rc1
MMYOLO: 0.1.1+
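A version list like the one above can be gathered with the standard library. The sketch below assumes the distributions are installed under the names `torch`, `torchvision`, `opencv-python`, `mmengine`, `mmcv`, `mmdet` and `mmyolo`. Those names are assumptions, since a local build may register a different name, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to the local install.
    names = ["torch", "torchvision", "opencv-python",
             "mmengine", "mmcv", "mmdet", "mmyolo"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it with the same interpreter that runs the demo matters here, because this thread involves both a virtual environment and a user-level site-packages directory.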
Traceback (most recent call last):
File "/media/zr/Data/MMLAB_2.0/mmyolo/demo/image_demo.py", line 61, in
main(args)
File "/media/zr/Data/MMLAB_2.0/mmyolo/demo/image_demo.py", line 43, in main
result = inference_detector(model, args.img)
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmdet/apis/inference.py", line 152, in inference_detector
results = model.test_step(data_)[0]
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmengine/model/base_model/base_model.py", line 145, in test_step
return self._run_forward(data, mode='predict') # type: ignore
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmengine/model/base_model/base_model.py", line 298, in _run_forward
results = self(data, mode=mode)
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmdet/models/detectors/base.py", line 94, in forward
return self.predict(inputs, data_samples)
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmdet/models/detectors/single_stage.py", line 110, in predict
results_list = self.bbox_head.predict(
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmdet/models/dense_heads/base_dense_head.py", line 196, in predict
predictions = self.predict_by_feat(
File "/media/zr/Data/MMLAB_2.0/mmyolo/mmyolo/models/dense_heads/yolov5_head.py", line 406, in predict_by_feat
results = self._bbox_post_process(
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmdet/models/dense_heads/base_dense_head.py", line 478, in _bbox_post_process
det_bboxes, keep_idxs = batched_nms(bboxes, results.scores,
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/ops/nms.py", line 334, in batched_nms
dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmengine/utils/misc.py", line 351, in new_func
output = old_func(*args, **kwargs)
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/ops/nms.py", line 159, in nms
inds = NMSop.apply(boxes, scores, iou_threshold, offset, score_threshold,
File "/media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/ops/nms.py", line 27, in forward
inds = ext_module.nms(
RuntimeError: nms_impl: implementation for device cuda:0 not found.
Exception raised from Dispatch at /tmp/mmcv/mmcv/ops/csrc/common/pytorch_device_registry.hpp:122 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fbae3043ab2 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5b (0x7fbae304014b in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: nms_impl(at::Tensor, at::Tensor, float, int) + 0xa97 (0x7fb9b64e2847 in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
frame #3: nms(at::Tensor, at::Tensor, float, int) + 0x4f (0x7fb9b64e2fcf in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x12515b (0x7fb9b652515b in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x11224f (0x7fb9b651224f in /media/zr/Data/MMLAB_2.0/venv/lib/python3.10/site-packages/mmcv/_ext.cpython-310-x86_64-linux-gnu.so)
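The error `nms_impl: implementation for device cuda:0 not found` suggests that the installed `_ext` module has no GPU kernels registered, which would fit an extension built without GPU support. This is an interpretation, not something the thread confirms. A minimal sketch that asks MMCV how its ops were compiled, using `get_compiler_version` and `get_compiling_cuda_version` from `mmcv.ops`; the wrapper `mmcv_ops_report` is a hypothetical name:

```python
def mmcv_ops_report():
    """Return build information for MMCV's compiled ops, or the import error."""
    try:
        # Importing mmcv.ops fails if mmcv, torch or the _ext module is missing.
        from mmcv.ops import get_compiler_version, get_compiling_cuda_version
    except ImportError as exc:
        return {"ops_available": False, "detail": str(exc)}
    return {
        "ops_available": True,
        "compiler": get_compiler_version(),
        "compiled_for": get_compiling_cuda_version(),
    }


if __name__ == "__main__":
    for key, value in mmcv_ops_report().items():
        print(f"{key}: {value}")
```

If the report shows that the ops are missing or were not compiled for a GPU toolkit, rebuilding MMCV with `MMCV_WITH_OPS=1` under an interpreter that can import torch is the step to revisit before debugging MMYOLO itself.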