Hi @zxx0809 ,
In my environment, I use the versions listed in README.md. For mmcv, you should install with:
TORCH_CUDA_ARCH_LIST="{COMCAP}" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" CUDA_HOME=$(dirname $(dirname $(which nvcc))) LD_LIBRARY_PATH=$(dirname $(dirname $(which nvcc)))/lib MMCV_WITH_OPS=1 FORCE_CUDA=1 python -m pip install git+https://github.com/open-mmlab/mmcv.git@4f65f91db6502d990ce2ee5de0337441fb69dd10
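Assuming `{COMCAP}` stands for the GPU's CUDA compute capability (the usual kind of value for `TORCH_CUDA_ARCH_LIST`, e.g. `8.6`), a minimal sketch for looking it up with PyTorch might be the following. The helper names `format_capability` and `detect_comcap` are made up for illustration and are not part of this repository.

```python
from typing import Optional, Tuple


def format_capability(capability: Tuple[int, int]) -> str:
    """Turn a (major, minor) pair such as (8, 6) into the string "8.6"."""
    major, minor = capability
    return f"{major}.{minor}"


def detect_comcap() -> Optional[str]:
    """Return the compute capability of GPU 0, or None if it cannot be queried."""
    try:
        import torch  # imported lazily so format_capability works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # torch.cuda.get_device_capability returns a (major, minor) tuple.
    return format_capability(torch.cuda.get_device_capability(0))
```

Calling `detect_comcap()` in the target environment should give the string to substitute for `{COMCAP}`, provided the assumption about the placeholder holds.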
If you want to use a newer version of mmcv, please update the versions of other packages (e.g., mmdet, mmpretrain).
You can use other versions of CUDA.
Please let me know if you have any other questions.
But when I ran this command:
TORCH_CUDA_ARCH_LIST="{COMCAP}" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" CUDA_HOME=$(dirname $(dirname $(which nvcc))) LD_LIBRARY_PATH=$(dirname $(dirname $(which nvcc)))/lib MMCV_WITH_OPS=1 FORCE_CUDA=1 python -m pip install git+https://github.com/open-mmlab/mmcv.git@4f65f91db6502d990ce2ee5de0337441fb69dd10
it reported the following error:
ERROR: Failed building wheel for mmcv
Running setup.py clean for mmcv
Failed to build mmcv
ERROR: Could not build wheels for mmcv, which is required to install pyproject.toml-based projects
Could you post more details from the error output? If you want a simpler installation (no compilation), you can try the versions in this file.
I suspect your error is caused by a missing nvcc compiler.
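One way to test that guess is to check whether `nvcc` is on `PATH` and what CUDA root the install command would derive from it. The sketch below mirrors the `$(dirname $(dirname $(which nvcc)))` expression from the command; `cuda_home_from_nvcc` and `find_cuda_home` are hypothetical helper names.

```python
import os
import shutil
from typing import Optional


def cuda_home_from_nvcc(nvcc_path: str) -> str:
    """Mirror `$(dirname $(dirname $(which nvcc)))` from the install command."""
    return os.path.dirname(os.path.dirname(nvcc_path))


def find_cuda_home() -> Optional[str]:
    """Return the CUDA root implied by nvcc on PATH, or None if nvcc is missing."""
    nvcc_path = shutil.which("nvcc")
    if nvcc_path is None:
        return None
    return cuda_home_from_nvcc(nvcc_path)
```

If `find_cuda_home()` returns `None`, the `CUDA_HOME` and `LD_LIBRARY_PATH` substitutions in the command cannot resolve to a real CUDA directory, which would be consistent with the wheel build failing.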
Hi @zxx0809 ,
Please let me know if you are still having problems installing the environment. However, because environments differ, I may not be able to solve every installation problem.
Thank you for your generous reply. I have resolved the environment issues and successfully run the project. Here is my environment configuration: CUDA 11.8, Torch 2.0.1, and mmcv 2.0.1.
Great!
Also, make sure your Python version is up to date. I had the same error because my Python version was 3.6. Updating to 3.10 fixed the error.
According to your README, you installed CUDA 12.1, but according to this page, https://mmcv.readthedocs.io/en/latest/get_started/installation.html#install-with-pip , I should install PyTorch 2.1.0 and mmcv 2.1.0. That combination does not meet the requirement "Please install mmcv>=2.0.0, <2.1.0." Can you please tell me the correct environment configuration? Thank you!
When I execute "bash tools/dist.sh test seg/configs/sam2clip/sam_vith_dump.py 1", I get this error.
Traceback (most recent call last):
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/lazy.py", line 68, in build
    module = importlib.import_module(self._module)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/workspace/ovsam/seg/models/detectors/__init__.py", line 1, in <module>
    from .sam2clip_distill import BackboneDistillation
  File "/workspace/ovsam/seg/models/detectors/sam2clip_distill.py", line 6, in <module>
    from mmdet.models.detectors.base import ForwardResults
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/__init__.py", line 3, in <module>
    from .data_preprocessors import *  # noqa: F401,F403
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/data_preprocessors/__init__.py", line 6, in <module>
    from .reid_data_preprocessor import ReIDDataPreprocessor
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/data_preprocessors/reid_data_preprocessor.py", line 13, in <module>
    import mmpretrain
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmpretrain/__init__.py", line 18, in <module>
    and mmcv_version < digit_version(mmcv_maximum_version)), \
AssertionError: MMCV==2.1.0 is used but incompatible. Please install mmcv>=2.0.0, <2.1.0.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/workspace/ovsam/tools/test.py", line 177, in <module>
    main()
  File "/workspace/ovsam/tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 445, in from_cfg
    runner = cls(
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 412, in __init__
    self.model = self.build_model(model)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 819, in build_model
    model = MODELS.build(model)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 96, in build_from_cfg
    obj_type = args.pop('type')
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/config.py", line 182, in pop
    return self.build_lazy(super().pop(key, default))
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/config.py", line 215, in build_lazy
    value = value.build()
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/lazy.py", line 70, in build
    raise type(e)(f'Failed to import {self._module} '
AssertionError: Failed to import seg.models.detectors in seg/configs/sam2clip/sam_vith_dump.py, line 5 for MMCV==2.1.0 is used but incompatible. Please install mmcv>=2.0.0, <2.1.0.
[2024-07-05 10:13:04,386] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 40539) of binary: /root/miniconda3/envs/ovsam/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/ovsam/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.1.0', 'console_scripts', 'torchrun')())
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
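The AssertionError in the traceback is raised from `mmpretrain/__init__.py`, which asks for `mmcv>=2.0.0, <2.1.0`. A minimal sketch of the same range check, useful for verifying an environment before launching a job, is below. It is not mmpretrain's code: `parse_version`, `mmcv_in_range`, and `installed_mmcv` are hypothetical helpers, the parser only understands plain numeric versions such as `2.0.1`, and the bounds are copied from the error message.

```python
from importlib import metadata
from typing import Optional, Tuple

# Bounds copied from the error message: mmcv>=2.0.0, <2.1.0
MMCV_MIN = (2, 0, 0)
MMCV_MAX = (2, 1, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse leading numeric components, e.g. "2.0.1+cu118" -> (2, 0, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as "2.0" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def mmcv_in_range(version_text: str) -> bool:
    """True if the version satisfies MMCV_MIN <= version < MMCV_MAX."""
    return MMCV_MIN <= parse_version(version_text) < MMCV_MAX


def installed_mmcv() -> Optional[str]:
    """Return the installed mmcv version string, or None if it is not installed."""
    try:
        return metadata.version("mmcv")
    except metadata.PackageNotFoundError:
        return None
```

With these bounds, `mmcv_in_range("2.0.1")` is true, matching the configuration reported as working earlier in the thread, while `mmcv_in_range("2.1.0")` is false, matching the failure shown in the traceback.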