NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

Win11 + Visual Studio 2022, installed successfully. #1809

Open aswordok opened 5 months ago

aswordok commented 5 months ago

Environment:
OS: Windows 11
Python: 3.11.8
CUDA Ver: 11.8
torch Ver: 2.3.0+cu118
torchvision Ver: 0.18.0+cu118
torchaudio Ver: 2.3.0+cu118
cuDNN Ver: 8700
Visual Studio 2022, installed with the "Desktop development with C++" workload
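For anyone comparing their own setup against the versions listed above, here is a minimal sketch that collects the same details from Python. The function name `collect_build_environment` and the dictionary keys are made up for illustration; the `torch` attributes used are the standard version fields, and the script tolerates a missing `torch` install.

```python
import platform
import shutil


def collect_build_environment():
    """Gather OS, Python, torch, CUDA and cuDNN details (hypothetical helper)."""
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        # Location of nvcc if it is reachable through PATH, else None.
        "nvcc_on_path": shutil.which("nvcc"),
        "torch": None,
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; report what we have.
        return info
    info["torch"] = torch.__version__
    # CUDA version torch was built against (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    # cuDNN version as an integer such as 8700 (None if unavailable).
    info["cudnn"] = torch.backends.cudnn.version()
    return info


for key, value in collect_build_environment().items():
    print(f"{key}: {value}")
```

Run it with the same interpreter that will build apex (here, the embedded `python.exe`), so the reported versions match the build environment.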

Reference: https://github.com/NVIDIA/apex/issues/835#issuecomment-646112354

Install:
cd D:\+AI\ComfyUI\ComfyUI_windows_portable\python_embeded
git clone https://github.com/NVIDIA/apex
cd apex
git checkout 2ec84eb
..\python -m pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

Result:
Successfully installed apex-0.1

Error:
raw_output = subprocess.check_output([cuda_dir + "/bin/nvcc", "-V"], universal_newlines=True)
FileNotFoundError: [WinError 2] The system cannot find the file specified.

Solved: Note that installing PyTorch automatically installs its own CUDA and cuDNN packages and hijacks the calls, but the same versions of CUDA and cuDNN still have to be installed system-wide.
set | findstr CUDA
Note that CUDA_DIR must be uppercase. set only takes effect in the current window; setx writes to the environment variables, so the value is still there after opening a new window.
set CUDA_DIR="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"
setx /m CUDA_DIR "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"

Verify:
..\python.exe -m pip list | findstr apex
..\python -c "from apex import amp"

Error: No module named 'torch._six'

Solved: Run the following in cmd (it only works when run on the same drive letter). On Windows, the sed command is available if Git is installed; add C:\Program Files\Git\usr\bin to the system Path variable.
sed -i 's/from torch._six import container_abcs/import collections.abc as container_abcs/' "D:\+AI\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\apex\amp\_amp_state.py"
sed -i 's/from torch._six import string_classes/string_classes = str/' "D:\+AI\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\apex\amp\_initialize.py"

Verify again:
..\python -c "import apex;import importlib;global fused_layer_norm_cuda;fused_layer_norm_cuda = importlib.import_module('fused_layer_norm_cuda')"
This test code was distilled from how ComfyUI uses apex. If it prints no error, the install succeeded.

Error: No module named 'fused_layer_norm_cuda'

Solved: The cause is that apex was installed without the CUDA extensions. --cuda_ext must be passed when installing.

After testing, ComfyUI runs normally.
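If sed is not available, the two `torch._six` replacements from the post can be made with a short Python script instead. This is only a sketch: the helper names `patch_text` and `patch_apex_amp` are invented for illustration, and it assumes you pass in the path to the installed `apex\amp` directory yourself.

```python
from pathlib import Path

# The two substitutions from the sed commands in the post,
# keyed by the file inside apex\amp that each one applies to.
PATCHES = {
    "_amp_state.py": (
        "from torch._six import container_abcs",
        "import collections.abc as container_abcs",
    ),
    "_initialize.py": (
        "from torch._six import string_classes",
        "string_classes = str",
    ),
}


def patch_text(text, old, new):
    """Return text with every occurrence of old replaced by new."""
    return text.replace(old, new)


def patch_apex_amp(amp_dir):
    """Apply PATCHES to the files under amp_dir; return names of changed files."""
    changed = []
    for name, (old, new) in PATCHES.items():
        path = Path(amp_dir) / name
        if not path.is_file():
            continue  # file not present in this apex version; skip it
        original = path.read_text(encoding="utf-8")
        patched = patch_text(original, old, new)
        if patched != original:
            path.write_text(patched, encoding="utf-8")
            changed.append(name)
    return changed
```

To use it, call `patch_apex_amp(r"...\python_embeded\Lib\site-packages\apex\amp")` with your own install path. Running it a second time changes nothing, because the old import lines are already gone.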

doctorpangloss commented 5 months ago

https://github.com/NVIDIA/apex/issues/1792

You have to follow my instructions from the issue linked above, and also add fused layer norm to your build options. Right now you will not succeed on Windows in using the nodes that say they are for Linux. Use WSL.

aswordok commented 4 months ago

@doctorpangloss WSL is not necessary. If you get the 'fused_layer_norm_cuda' error, just build with --cuda_ext.
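A quick way to tell whether the CUDA extensions were built is to check that the compiled module can be located. This is a rough sketch: `missing_extensions` is a made-up helper name, and the only module name taken from this thread is `fused_layer_norm_cuda`. It checks that the module can be found, not that it loads without error.

```python
import importlib.util


def missing_extensions(names=("fused_layer_norm_cuda",)):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_extensions()
if missing:
    print("Not found (rebuild apex with --cuda_ext?):", ", ".join(missing))
else:
    print("All checked extension modules were found.")
```

Run it with the same interpreter ComfyUI uses, otherwise it will search a different site-packages directory.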