NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

RuntimeError: apex.optimizers.FusedAdam requires cuda extensions #1193

Open life97 opened 3 years ago

life97 commented 3 years ago

My environment is Windows Server 2016, torch 1.8.1, torchvision 0.9.1, CUDA 10.2. apex installs successfully, but running the project code (NVlabs/imaginaire) reports an error:

Initialize net_G and net_D weights using type: orthogonal gain: 1
net_G parameter count: 30,258,966
net_D parameter count: 32,322,498
Traceback (most recent call last):
  File "H:\19xyy\project\imaginaire-master\train.py", line 100, in <module>
    main()
  File "H:\19xyy\project\imaginaire-master\train.py", line 60, in main
    get_model_optimizer_and_scheduler(cfg, seed=args.seed)
  File "H:\19xyy\project\imaginaire-master\imaginaire\utils\trainer.py", line 115, in get_model_optimizer_and_scheduler
    opt_G = get_optimizer(cfg.gen_opt, net_G)
  File "H:\19xyy\project\imaginaire-master\imaginaire\utils\trainer.py", line 257, in get_optimizer
    return get_optimizer_for_params(cfg_opt, params)
  File "H:\19xyy\project\imaginaire-master\imaginaire\utils\trainer.py", line 274, in get_optimizer_for_params
    opt = FusedAdam(params,
  File "G:\Anaconda3\envs\xyy_imagenaire\lib\site-packages\apex\optimizers\fused_adam.py", line 80, in __init__
    raise RuntimeError('apex.optimizers.FusedAdam requires cuda extensions')
RuntimeError: apex.optimizers.FusedAdam requires cuda extensions

nvcc -V and print(torch.version.cuda) report the same CUDA version, so I don't know why this error is raised. Are there any suggestions for getting the code to run correctly? Looking forward to your reply, thank you very much!
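The version comparison described above can be scripted. This is a minimal sketch, not part of apex: the helper names `parse_nvcc_release` and `same_cuda_release` are made up for illustration, and it assumes `nvcc -V` prints a line of the form "Cuda compilation tools, release X.Y, ...". Note that a CPU-only torch build reports `torch.version.cuda` as `None`, which this sketch treats as a mismatch.

```python
import re
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' CUDA release found in `nvcc -V` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def same_cuda_release(nvcc_output: str, torch_cuda: Optional[str]) -> bool:
    """True when nvcc and torch.version.cuda agree on major.minor.

    torch_cuda is the value of torch.version.cuda (None on CPU-only builds).
    """
    release = parse_nvcc_release(nvcc_output)
    if release is None or torch_cuda is None:
        return False
    return release.split(".")[:2] == torch_cuda.split(".")[:2]
```

To use it, pass the text printed by `nvcc -V` (for example captured with `subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout`) together with `torch.version.cuda`. As this thread shows, matching versions alone do not guarantee that the CUDA extensions were actually built.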

Dawn-bin commented 3 years ago

Hi, I have hit the same problem. Has this happened before? It's my first time using this optimizer. My environment is Ubuntu 18.04, torch 1.8.0, CUDA 11.1.

life97 commented 3 years ago

@Dawn-bin Sorry, I haven't solved this problem yet. I was using imaginaire mainly because I wanted to run the MUNIT model, and since I could already run the official MUNIT code, I didn't pursue this problem further.

suxin1412 commented 2 years ago

@Dawn-bin Hi, I have hit the same problem. Have you solved it?

Dawn-bin commented 2 years ago

@suxin1412 Yes. It seems that apex was installed without its CUDA extensions. You can try reinstalling apex with the CUDA extensions enabled, following the README. Hope it works.

kongyuzhuo commented 2 years ago

Hi, I have hit the same problem. Have you solved it?

Chiang97912 commented 1 year ago

This happens because apex cannot import amp_C. You can see the check in the file "G:\Anaconda3\envs\xyy_imagenaire\lib\site-packages\apex\optimizers\fused_adam.py", and you can verify it in a Python shell:

import torch
import amp_C  # must import torch before import amp_C
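The two-line check above can be wrapped so that the underlying failure reason is printed rather than hidden. This is a minimal sketch; `try_import` and `check_apex_extension` are illustrative names, not apex APIs.

```python
import importlib


def try_import(module_name: str):
    """Import a module by name; return (True, "") or (False, reason)."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # e.g. ImportError, or OSError from a missing shared library
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


def check_apex_extension() -> str:
    """Import torch first, then amp_C, and report the first failure."""
    # torch must come first (as noted above) so its shared libraries are loaded.
    for name in ("torch", "amp_C"):
        ok, reason = try_import(name)
        if not ok:
            return f"cannot import {name}: {reason}"
    return "amp_C imported successfully"
```

Calling `print(check_apex_extension())` in the environment that raises the RuntimeError should show whether amp_C is missing entirely or fails to load because of a library problem.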

You may get an error like: libstdc++.so.6: version 'GLIBCXX_3.4.20' not found. If so, you can try the following commands:

conda install libgcc
export LD_LIBRARY_PATH=/path/to/anaconda/envs/myenv/lib:$LD_LIBRARY_PATH
cd /path/to/anaconda/envs/myenv/lib
ln -s libstdc++.so.6.0.30 libstdc++.so.6

You can also add export LD_LIBRARY_PATH=/path/to/anaconda/envs/myenv/lib:$LD_LIBRARY_PATH to your ~/.bashrc file so that the setting persists across shells.
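Before relinking, it can help to confirm which GLIBCXX versions a given libstdc++ file actually provides. The sketch below scans the binary for version strings, roughly what `strings libstdc++.so.6 | grep GLIBCXX` would show. The function names are hypothetical, and the path is a placeholder in the same spirit as the commands above.

```python
import re
from pathlib import Path


def glibcxx_versions_in(data: bytes):
    """Return sorted GLIBCXX version tuples found in raw library bytes."""
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", data))
    return sorted(tuple(int(part) for part in v.decode().split(".")) for v in found)


def library_provides(lib_path: str, required: str = "3.4.20") -> bool:
    """True if the library file at lib_path advertises the required GLIBCXX version."""
    needed = tuple(int(part) for part in required.split("."))
    return needed in glibcxx_versions_in(Path(lib_path).read_bytes())
```

For example, `library_provides("/path/to/anaconda/envs/myenv/lib/libstdc++.so.6")` indicates whether the copy inside the conda environment is new enough for the version named in the error message.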

huang-zeyu commented 1 year ago

Same error, not solved yet. Environment: Ubuntu 20.04 (WSL2), Python 3.9, CUDA 11.6, cuDNN 8.5.0, torch 1.12.1, installed following the README. By the way, importing torch and then amp_C also failed. Hope someone can fix it or provide a solution.

GuangmingChan commented 1 year ago

I have also experienced this error: I had successfully installed Apex in a certain environment before, but when I switched to a different environment and tried to reinstall Apex, it appeared to install successfully, but when running the code, it always gave the error "RuntimeError: apex.optimizers.FusedAdam requires cuda extensions". Later, I deleted the Apex folder downloaded from GitHub, downloaded it again, and reinstalled Apex. In the end, it was successfully executed.

ShoufaChen commented 1 year ago

I solved this problem by building with

pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./

rather than

pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

My pip version is 22.3.1.

nshah-sfoundation commented 1 year ago

I installed apex with the command below, but I still get the error RuntimeError: apex.optimizers.FusedAdam requires cuda extensions. Environment: Linux 5.15.120.2, CUDA 11.8, pip 23, torch cu118.

pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./

filipesmg commented 1 year ago

I get the same issue using pip 22.0.4 and the command given in the README:

# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

I noticed that even with the command above, --cpp_ext and --cuda_ext are not in the sys.argv that reaches setup.py (and that seems to be what is checked):

['(...)/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py', 'dist_info', '--egg-base', '/tmp/pip-modern-metadata-pvaz06q9']
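To illustrate why this matters, here is a minimal sketch of a membership test on the argument list, assuming (as suggested above) that the build decides which extensions to compile by looking for the flags in sys.argv. `requested_extensions` is a hypothetical helper, not a function from apex's setup.py.

```python
def requested_extensions(argv):
    """Return the extension flags that a sys.argv membership check would find."""
    known_flags = ("--cpp_ext", "--cuda_ext")
    return [flag for flag in known_flags if flag in argv]


# The argument list reported above contains neither flag, so under this
# assumption no extension would be requested and the install would be
# Python-only even though pip reports success.
observed_argv = [
    "(...)/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py",
    "dist_info",
    "--egg-base",
    "/tmp/pip-modern-metadata-pvaz06q9",
]
print(requested_extensions(observed_argv))
```

Adding a temporary `print(sys.argv)` near the top of setup.py and reading the verbose (`-v`) pip output is one way to see which arguments actually arrive for a given pip version.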

Tolga-Karahan commented 1 year ago

Has anyone solved this issue?

frankielp commented 1 year ago

This solution worked in my case: https://github.com/NVIDIA/apex/issues/1204#issuecomment-1659884672

Tolga-Karahan commented 1 year ago

@frankielp Thanks. I tried it but got another error: ninja: error: '/app/csrc/amp_C_frontend.cpp', needed by '/app/build/temp.linux-x86_64-cpython-310/csrc/amp_C_frontend.o', missing and no known rule to make it. I'll create an issue for that.

WuHongyuQXWX commented 1 year ago

@Chiang97912 Your suggestion above (checking whether amp_C can be imported, then fixing the libstdc++.so.6 link and LD_LIBRARY_PATH) finally solved my problem. You are brilliant, bro! You're awesome!

Flame-circle commented 7 months ago

@ShoufaChen Regarding your suggestion above to pass --cpp_ext and --cuda_ext through --config-settings "--build-option=..." rather than --global-option: thank you very much, it is helpful.

mkerin commented 5 months ago

@ShoufaChen Your suggestion above (building with --config-settings "--build-option=..." rather than --global-option) didn't work for me on pip 24.0. Instead I had to use https://github.com/NVIDIA/apex/issues/1204#issuecomment-1659884672