microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0
35.34k stars 4.1k forks

[BUG] whl does not get created following the instructions on Windows 11 #3196

Closed acube3 closed 1 year ago

acube3 commented 1 year ago

build_win.bat
Administrative permissions required. Detecting permissions...
Success: Administrative permissions confirmed.
NOTE: Redirects are currently not supported in Windows or MacOs.
DS_BUILD_OPS=1
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] please install triton==1.0.0 if you want to use sparse attention
Install Ops={'async_io': False, 'cpu_adagrad': 1, 'cpu_adam': 1, 'fused_adam': 1, 'fused_lamb': 1, 'quantizer': 1, 'random_ltd': 1, 'sparse_attn': False, 'spatial_inference': 1, 'transformer': 1, 'stochastic_transformer': 1, 'transformer_inference': 1, 'utils': 1}
version=0.9.0+f662bfcd, git_hash=f662bfcd, git_branch=master
install_requires=['hjson', 'ninja', 'numpy', 'packaging>=20.0', 'psutil', 'py-cpuinfo', 'pydantic', 'torch', 'tqdm']
compatible_ops={'async_io': False, 'cpu_adagrad': True, 'cpu_adam': True, 'fused_adam': True, 'fused_lamb': True, 'quantizer': True, 'random_ltd': True, 'sparse_attn': False, 'spatial_inference': True, 'transformer': True, 'stochastic_transformer': True, 'transformer_inference': True, 'utils': True}
ext_modules=[<setuptools.extension.Extension('deepspeed.ops.adagrad.cpu_adagrad_op') at 0x2c2fb8d69a0>, <setuptools.extension.Extension('deepspeed.ops.adam.cpu_adam_op') at 0x2c2dcd708b0>, <setuptools.extension.Extension('deepspeed.ops.adam.fused_adam_op') at 0x2c2fe9d76a0>, <setuptools.extension.Extension('deepspeed.ops.lamb.fused_lamb_op') at 0x2c2fe9d74f0>, <setuptools.extension.Extension('deepspeed.ops.quantizer.quantizer_op') at 0x2c2fe9d7490>, <setuptools.extension.Extension('deepspeed.ops.random_ltd_op') at 0x2c2fe9d7760>, <setuptools.extension.Extension('deepspeed.ops.spatial.spatial_inference_op') at 0x2c2fe9d7820>, <setuptools.extension.Extension('deepspeed.ops.transformer.transformer_op') at 0x2c2fe9d7520>, <setuptools.extension.Extension('deepspeed.ops.transformer.stochastic_transformer_op') at 0x2c2fe9d7910>, <setuptools.extension.Extension('deepspeed.ops.transformer.inference.transformer_inference_op') at 0x2c2fe9d7880>, <setuptools.extension.Extension('deepspeed.ops.utils_op') at 0x2c2fe9d7670>]
running bdist_wheel
running build
running build_py
copying deepspeed\git_version_info_installed.py -> build\lib.win-amd64-cpython-38\deepspeed
running egg_info
writing deepspeed.egg-info\PKG-INFO
writing dependency_links to deepspeed.egg-info\dependency_links.txt
writing entry points to deepspeed.egg-info\entry_points.txt
writing requirements to deepspeed.egg-info\requires.txt
writing top-level names to deepspeed.egg-info\top_level.txt
reading manifest file 'deepspeed.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST_win.in'
warning: no previously-included files matching '*.cpp' found under directory 'deepspeed\ops\csrc'
warning: no previously-included files matching '*.h' found under directory 'deepspeed\ops\csrc'
warning: no previously-included files matching '*.cu' found under directory 'deepspeed\ops\csrc'
warning: no previously-included files matching '*.cuh' found under directory 'deepspeed\ops\csrc'
warning: no previously-included files matching '*.cc' found under directory 'deepspeed\ops\csrc'
no previously-included directories found matching 'op_builder'
no previously-included directories found matching 'accelerator'
adding license file 'LICENSE'
writing manifest file 'deepspeed.egg-info\SOURCES.txt'
running build_ext
building 'deepspeed.ops.quantizer.quantizer_op' extension
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc" -c csrc/quantization/dequantize.cu -o build\temp.win-amd64-cpython-38\Release\csrc/quantization/dequantize.obj -Icsrc/includes -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\TH -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -IC:\Users\amolamb\Anaconda3\envs\asuka\include -IC:\Users\amolamb\Anaconda3\envs\asuka\Include "-Ic:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\cppwinrt" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --

. . . . . . . . . . . . . . .
C:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\pybind11\detail/common.h(108): warning C4005: 'HAVE_SNPRINTF': macro redefinition
C:\Users\amolamb\Anaconda3\envs\asuka\include\pyerrors.h(315): note: see previous definition of 'HAVE_SNPRINTF'
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc" -c csrc/quantization/quantize.cu -o build\temp.win-amd64-cpython-38\Release\csrc/quantization/quantize.obj -Icsrc/includes -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\TH -IC:\Users\amolamb\Anaconda3\envs\asuka\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -IC:\Users\amolamb\Anaconda3\envs\asuka\include -IC:\Users\amolamb\Anaconda3\envs\asuka\Include "-Ic:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\cppwinrt" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -O3 -allow-unsupported-compiler --use_fast_math -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=quantizer_op -D_GLIBCXX_USE_CXX11_ABI=0 --use-local-env
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
csrc/includes\memory_access_utils.h(82): error: identifier "uint32_t" is undefined

csrc/includes\memory_access_utils.h(359): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(359): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(359): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(363): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(363): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(371): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(371): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(371): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(383): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(383): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(395): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(395): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(395): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(399): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(399): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(409): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(409): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(409): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(421): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(421): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(434): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(434): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(434): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(438): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(438): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(448): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(448): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(448): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(460): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(460): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(472): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(472): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(472): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(476): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(476): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(484): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(484): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(484): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(496): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(496): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(508): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(508): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(508): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(512): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(512): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(522): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(522): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(522): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(534): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(534): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(547): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(547): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(547): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(551): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(551): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(561): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(561): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(561): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(573): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(573): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(703): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(703): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(703): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(709): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(709): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(717): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(717): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(717): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(731): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(731): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(836): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(836): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(840): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(840): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(840): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(849): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(849): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(853): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(853): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(853): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(862): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(862): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(866): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(866): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(866): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(908): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(908): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(914): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(914): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(914): error: "int32_t" is not a type name

91 errors detected in the compilation of "csrc/quantization/quantize.cu".
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe' failed with exit code 2

loadams commented 1 year ago

Hi @acube3 - it looks like one of your errors is from trying to install aio, which is not supported on Windows. Can you share the commands you're running to get to this point, and describe your environment?

abc20220327 commented 1 year ago

I have the same problem as you, @acube3. Did you resolve it?

loadams commented 1 year ago

@abc20220327 and @acube3, can you please confirm that you have the VS2019 C++ x64/x86 build tools, CUDA, and torch installed, and which versions? Once you've done that, can you run the following to confirm they're installed and found by your Python instance?

python -c "import torch; print('torch:', torch.__version__, torch)"
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

I was just able to follow the steps listed on a Windows 11 machine and get it to build.
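As a rough sketch of the same check in script form, the two one-liners can be combined with a lookup for the compilers on `PATH`. The function name `check_build_environment` and the report keys are made up for illustration and are not part of DeepSpeed; the only assumption about the toolchain is that the MSVC compiler is invoked as `cl` and the CUDA compiler as `nvcc`, as in the log.

```python
import importlib.util
import shutil


def check_build_environment():
    """Collect basic facts about the toolchain the build appears to rely on.

    Hypothetical helper for illustration; the keys below are not a DeepSpeed API.
    """
    report = {
        # cl is the MSVC compiler driver, nvcc the CUDA compiler driver.
        "cl_on_path": shutil.which("cl") is not None,
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "torch_cuda_version": None,
        "cuda_available": None,
    }
    if report["torch_installed"]:
        import torch

        report["torch_version"] = torch.__version__
        # CUDA version torch was built against (None for CPU-only builds).
        report["torch_cuda_version"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_build_environment().items():
        print(f"{key}: {value}")
```

If `cl_on_path` or `nvcc_on_path` comes back `False`, the shell the build is launched from probably does not have the compiler environment set up, which is worth ruling out before looking at the compile errors themselves.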

xfuzq707 commented 1 year ago

I have the same problem. This is my torch setup:

`
PS F:\DeepSpeed-0.9.1> python -c "import torch; print('torch:', torch.__version__, torch)"
torch: 1.13.0+cu116 <module 'torch' from 'F:\ProgramData\Anaconda3\lib\site-packages\torch\__init__.py'>
PS F:\DeepSpeed-0.9.1> python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
CUDA available: True
`

loadams commented 1 year ago

@xfuzq707 - thanks, those should be fine. Can you confirm the commands you are running, and the full error message?

shaolongcai commented 1 year ago

Me too.

/*
(base) PS E:\deepspeed> python -c "import torch; print('torch:', torch.__version__, torch)"
torch: 2.0.0+cu117 <module 'torch' from 'E:\Anaconda3\lib\site-packages\torch\__init__.py'>
(base) PS E:\deepspeed> python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
CUDA available: True
*/

Everything is set up, but I have the same problem: cannot open input file 'aio.lib'.

loadams commented 1 year ago

@shaolongcai - aio isn't supported on Windows. If you are using the build_win.bat script, this is set explicitly. How are you building on Windows? If you want or need aio functionality, you'd be better off using WSL on your Windows machine if you can.

xfuzq707 commented 1 year ago

> @xfuzq707 - thanks, those should be fine. Can you confirm the commands you are running, and the full error message?

Yes, here is the full output:

`

PS F:\DeepSpeed-0.9.1> .\build_win.bat
Administrative permissions required. Detecting permissions...
Success: Administrative permissions confirmed.
DS_BUILD_OPS=1
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
The system cannot find the file specified.
[WARNING] cpu_adagrad requires the 'lscpu' command, but it does not exist!
[WARNING] cpu_adagrad attempted to query 'lscpu' after failing to use py-cpuinfo to detect the CPU architecture. 'lscpu' does not appear to exist on your system, will fall back to use -march=native and non-vectorized execution.
The system cannot find the file specified.
[WARNING] cpu_adagrad requires the 'lscpu' command, but it does not exist!
[WARNING] cpu_adagrad attempted to query 'lscpu' after failing to use py-cpuinfo to detect the CPU architecture. 'lscpu' does not appear to exist on your system, will fall back to use -march=native and non-vectorized execution.
The system cannot find the file specified.
[WARNING] cpu_adam requires the 'lscpu' command, but it does not exist!
[WARNING] cpu_adam attempted to query 'lscpu' after failing to use py-cpuinfo to detect the CPU architecture. 'lscpu' does not appear to exist on your system, will fall back to use -march=native and non-vectorized execution.
The system cannot find the file specified.
[WARNING] cpu_adam requires the 'lscpu' command, but it does not exist!
[WARNING] cpu_adam attempted to query 'lscpu' after failing to use py-cpuinfo to detect the CPU architecture. 'lscpu' does not appear to exist on your system, will fall back to use -march=native and non-vectorized execution.
[WARNING] please install triton==1.0.0 if you want to use sparse attention
Install Ops={'async_io': False, 'cpu_adagrad': 1, 'cpu_adam': 1, 'fused_adam': 1, 'fused_lamb': 1, 'quantizer': 1, 'random_ltd': 1, 'sparse_attn': False, 'spatial_inference': 1, 'transformer': 1, 'stochastic_transformer': 1, 'transformer_inference': 1, 'utils': 1}
fatal: not a git repository (or any of the parent directories): .git
version=0.9.1+unknown, git_hash=unknown, git_branch=unknown
install_requires=['hjson', 'ninja', 'numpy', 'packaging>=20.0', 'psutil', 'py-cpuinfo', 'pydantic<2.0.0', 'torch', 'tqdm']

........

quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
quantize.cu
csrc/includes\memory_access_utils.h(82): error: identifier "uint32_t" is undefined

csrc/includes\memory_access_utils.h(359): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(359): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(359): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(363): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(363): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(371): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(371): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(371): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(383): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(383): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(395): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(395): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(395): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(399): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(399): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(409): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(409): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(409): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(421): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(421): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(434): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(434): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(434): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(438): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(438): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(448): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(448): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(448): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(460): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(460): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(472): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(472): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(472): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(476): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(476): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(484): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(484): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(484): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(496): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(496): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(508): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(508): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(508): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(512): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(512): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(522): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(522): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(522): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(534): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(534): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(547): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(547): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(547): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(551): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(551): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(561): error: identifier "int16_t" is undefined

csrc/includes\memory_access_utils.h(561): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(561): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(573): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(573): error: "int16_t" is not a type name

csrc/includes\memory_access_utils.h(703): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(703): error: identifier "data" is undefined

csrc/includes\memory_access_utils.h(703): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(709): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(836): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(836): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(840): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(840): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(840): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(849): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(849): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(853): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(853): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(853): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(862): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(862): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(866): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(866): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(866): error: "int32_t" is not a type name

csrc/includes\memory_access_utils.h(908): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(908): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(914): error: identifier "int32_t" is undefined

csrc/includes\memory_access_utils.h(914): error: identifier "dst_cast" is undefined

csrc/includes\memory_access_utils.h(914): error: "int32_t" is not a type name

91 errors detected in the compilation of "csrc/quantization/quantize.cu".
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe' failed with exit status 2
PS F:\DeepSpeed-0.9.1> python -c "import torch; print('torch:', torch.__version__, torch)"
torch: 1.13.0+cu116 <module 'torch' from 'F:\ProgramData\Anaconda3\lib\site-packages\torch\__init__.py'>

`

loadams commented 1 year ago

@xfuzq707 - do you have the Visual C++ build tools installed?

loadams commented 1 year ago

Closing this issue as it's been a week and it's unclear which users are still hitting specific issues. Please re-open as needed or create a new, specific issue here.

ain-soph commented 1 year ago

I was searching for another issue and came across this one by coincidence.
From the log, it seems cl is installed, but several identifiers are not defined:

error: "int16_t" is not a type name

I guess a potential fix might be to install the appropriate Windows 10 SDK through the Visual Studio Installer for Build Tools, since it contains those header files and the corresponding libraries.
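One way to test this guess, sketched under assumptions: `int16_t`, `int32_t`, and `uint32_t` are declared in `stdint.h` (or `cstdint` in C++), so a quick check is whether any include directory visible to the compiler actually contains that header. The helper names below are hypothetical, and reading the `INCLUDE` environment variable assumes the script is run from a Visual Studio developer prompt where that variable is populated; otherwise the directories can be copied by hand from the `-I` flags in the log.

```python
import os
from pathlib import Path


def find_header(header_name, include_dirs):
    """Return the full path of header_name in every directory that contains it."""
    hits = []
    for directory in include_dirs:
        candidate = Path(directory) / header_name
        if candidate.is_file():
            hits.append(str(candidate))
    return hits


def include_dirs_from_env(variable="INCLUDE"):
    """Split a path-list environment variable into directories (empty if unset).

    Assumption: the MSVC developer prompt exports its include path in INCLUDE.
    """
    raw = os.environ.get(variable, "")
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    found = find_header("stdint.h", include_dirs_from_env())
    if found:
        print("stdint.h found in:")
        for path in found:
            print("  " + path)
    else:
        print("stdint.h not found in any directory listed in INCLUDE")
```

An empty result would support the missing-SDK or missing-toolset theory. A non-empty result suggests the header exists but is not being included before `memory_access_utils.h` uses the fixed-width types, which would point at the source rather than the installation.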

xyc0123456789 commented 12 months ago

After configuring Visual Studio, torch, and CUDA, I got the build to compile by adding the following environment variables to the settings at the beginning of build_win.bat:

set DS_BUILD_EVOFORMER_ATTN=0
set DS_BUILD_OPS=0
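For anyone driving the build from Python rather than editing the batch file, a minimal sketch of the same idea follows. It assumes the wheel is produced by `python setup.py bdist_wheel` (the log above shows `running bdist_wheel`); the names `BUILD_FLAGS`, `make_build_env`, and `build_wheel` are hypothetical. Setting `DS_BUILD_OPS=0` skips pre-compiling the ops, so this sidesteps the `quantize.cu` errors at build time rather than fixing them.

```python
import os
import subprocess
import sys

# The two variables from the comment above; "0" asks the build not to
# pre-compile the corresponding ops.
BUILD_FLAGS = {
    "DS_BUILD_EVOFORMER_ATTN": "0",
    "DS_BUILD_OPS": "0",
}


def make_build_env(base_env=None, flags=None):
    """Return a copy of the environment with the build flags applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(BUILD_FLAGS if flags is None else flags)
    return env


def build_wheel(source_dir):
    """Run the wheel build in source_dir with op pre-compilation turned off.

    Assumption: source_dir is a DeepSpeed checkout containing setup.py.
    """
    return subprocess.run(
        [sys.executable, "setup.py", "bdist_wheel"],
        cwd=source_dir,
        env=make_build_env(),
        check=True,
    )


if __name__ == "__main__":
    # Pass the path to the DeepSpeed checkout as the first argument.
    build_wheel(sys.argv[1])
```

Copying the environment instead of mutating `os.environ` keeps the flags scoped to the build subprocess, so other commands run from the same Python session are unaffected.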