Closed RedHeartSecretMan closed 1 month ago
I'm also encountering this issue. I checked the Makefile syntax but found no problems, which is strange.
Same problem here, any progress? I tried modifying the Makefile to run the failing 'nvcc ***' command without '-o ggml/src/ggml-cuda.o'. That error went away, but it then ran into other problems, such as spaces in the include directories specified with -I, which nvcc cannot handle.
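The spaces problem can be reproduced with plain shell word splitting, no nvcc needed. A minimal illustration (the path below is just the default CUDA install location from the logs in this thread):

```shell
# Unquoted expansion: the shell splits the path at each space, so a
# compiler would receive several unrelated arguments instead of one -I.
CUDA_INC="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include"
set -- -I$CUDA_INC
echo "unquoted: $# arguments"   # 5 arguments

# Quoted expansion keeps it as a single argument.
set -- "-I$CUDA_INC"
echo "quoted: $# arguments"     # 1 argument
```

This is why installing the CUDA toolkit (or the build tree) under a path with no spaces is a common workaround when the build system does not quote its include paths.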
It seems to me that building llama.cpp on Windows with CUDA needs a different environment than a CPU-only build.
As far as I know, CUDA under windows is only supported with MSVC.
Yes, I assume that means I can only compile with CUDA from the Visual Studio IDE? I did successfully build llama.cpp with CUDA using VS.
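For anyone else landing here, the Visual Studio route is typically driven through CMake rather than the Makefile. A rough sketch (assumes CMake and the CUDA toolkit's Visual Studio integration are installed, and is run from a Developer/x64 Native Tools prompt; not a tested recipe for this exact setup):

```shell
# Sketch: build llama.cpp with CUDA via CMake + MSVC instead of MinGW make.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```

CMake generates a Visual Studio solution here, so nvcc is invoked with MSVC as the host compiler, which sidesteps the MinGW/nvcc incompatibility discussed above.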
This issue was closed because it has been inactive for 14 days since being marked as stale.
What happened?
Description: I am attempting to compile the llama.cpp project with CUDA support enabled (GGML_CUDA=1) on a Windows system using MinGW. I have set the CUDA_DOCKER_ARCH environment variable as per the requirements, but I am encountering a compilation error related to nvcc.
Steps to Reproduce:
- Set the CUDA_DOCKER_ARCH environment variable:
export CUDA_DOCKER_ARCH=compute_89
- Run the make command:
make GGML_CUDA=1
Expected Behavior: The project should compile successfully with CUDA support enabled.
Details about my terminal environment: W:\Code\CCpp\FunnyProject\llama.cpp ❯ w64devkit ~ $ printenv SCOOP=J:/Software/Scoop CONDA_PROMPT_MODIFIER=False HTTPS_PROXY=http://127.0.0.1:7890 PROGRAMFILESX86=C:/Program Files (x86) USER=WangHao LOGONSERVER=//WANGHAOITX PROGRAMFILES=C:/Program Files ALLUSERSPROFILE=C:/ProgramData POSH_THEMES_PATH=C:/Users/WangHao/AppData/Local/Programs/oh-my-posh/themes PROGRAMW6432=C:/Program Files POWERLINE_COMMAND=oh-my-posh WT_PROFILE_ID={574e775e-4f2a-5b96-ac1e-a2962a402336} SHLVL=1 HOME=C:/Users/WangHao CONDA_SHLVL=1 POSH_GIT_ENABLED=False SYSTEMDRIVE=C: ProgramFiles(x86)=C:\Program Files (x86) POWERSHELL_DISTRIBUTION_CHANNEL=MSI:Windows 10 Pro PROCESSOR_IDENTIFIER=AMD64 Family 25 Model 97 Stepping 2, AuthenticAMD SSL_CERT_FILE=J:/Software/Scoop/apps/miniconda/current/envs/tensor/Library/ssl/cacert.pem PROCESSOR_REVISION=6102 PUBLIC=C:/Users/Public W64DEVKIT=1.23.0 _CONDA_ROOT=J:/Software/Scoop/apps/miniconda/current USERDOMAIN=WANGHAOITX POSH_AZURE_ENABLED=False PROCESSOR_ARCHITECTURE=AMD64 PSMODULEPATH=C:/Users/WangHao/Documents/PowerShell/Modules;C:/Program Files/PowerShell/Modules;c:/program files/powershell/7/Modules;C:/Program Files/WindowsPowerShell/Modules;C:/Windows/system32/WindowsPowerShell/v1.0/Modules W64DEVKIT_HOME=J:/Software/MinGW LOGNAME=WangHao COMMONPROGRAMFILESX86=C:/Program Files (x86)/Common Files TEMP=C:/Users/WangHao/AppData/Local/Temp COMMONPROGRAMFILES=C:/Program Files/Common Files USERNAME=WangHao COMMONPROGRAMW6432=C:/Program Files/Common Files LOCALAPPDATA=C:/Users/WangHao/AppData/Local POSH_SHELL_VERSION=7.4.2 SESSIONNAME=Console WINDIR=C:/Windows 
PATH=J:/Software/MinGW/bin;J:/Software/Scoop/apps/openjdk22/current/bin;J:/Software/Scoop/apps/miniconda/current/envs/tensor;J:/Software/Scoop/apps/miniconda/current/envs/tensor/Library/mingw-w64/bin;J:/Software/Scoop/apps/miniconda/current/envs/tensor/Library/usr/bin;J:/Software/Scoop/apps/miniconda/current/envs/tensor/Library/bin;J:/Software/Scoop/apps/miniconda/current/envs/tensor/Scripts;J:/Software/Scoop/apps/miniconda/current/envs/tensor/bin;J:/Software/Scoop/apps/miniconda/current/condabin;C:/Program Files/PowerShell/7;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/bin;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/libnvvp;C:/Windows/system32;C:/Windows;C:/Windows/System32/Wbem;C:/Windows/System32/WindowsPowerShell/v1.0;C:/Windows/System32/OpenSSH;C:/Program Files (x86)/NVIDIA Corporation/PhysX/Common;C:/Program Files/NVIDIA Corporation/Nsight Compute 2023.3.0;C:/Program Files/dotnet;C:/Program Files/PowerShell/7;J:/Software/Scoop/shims;C:/Users/WangHao/AppData/Local/Microsoft/WindowsApps;J:/Software/VSCode/bin;C:/Users/WangHao/AppData/Local/Programs/oh-my-posh/bin;C:/Users/WangHao/.dotnet/tools;J:/Software/LLVM;J:/Software/LLVM/bin;J:/Software/MinGW;J:/Software/MinGW/bin;J:/Software/RuntimeLibrary/libtorch/2.4.0/build;J:/Software/RuntimeLibrary/openblas/0.3.27/build;J:/Software/RuntimeLibrary/opencv/4.8.0/build SCOOP_GLOBAL=J:/Software/Scoop OS=Windows_NT WT_SESSION=68881e97-3db7-4fea-a4d4-75c276bd15ec NUMBER_OF_PROCESSORS=16 POSH_CURSOR_LINE=4 USERPROFILE=C:/Users/WangHao TMP=C:/Users/WangHao/AppData/Local/Temp APPDATA=C:/Users/WangHao/AppData/Roaming CONDA_PYTHON_EXE=J:/Software/Scoop/apps/miniconda/current/python.exe SHELL=/bin/sh PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.CPL CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files CONDA_DEFAULT_ENV=tensor BB_GLOBBING=0 ONEDRIVECONSUMER=C:/Users/WangHao/OneDrive PROGRAMDATA=C:/ProgramData SYSTEMROOT=C:\Windows USERDOMAIN_ROAMINGPROFILE=WANGHAOITX 
_CONDA_EXE=J:/Software/Scoop/apps/miniconda/current/Scripts/conda.exe __CONDA_OPENSLL_CERT_FILE_SET=1 NVTOOLSEXT_PATH=C:/Program Files/NVIDIA Corporation/NvToolsExt/ POSH_THEME=C:/Users/WangHao/AppData/Local/Programs/oh-my-posh/themes/peru.omp.json HOMEDRIVE=C: JAVA_HOME=J:/Software/Scoop/apps/openjdk22/current POSH_CURSOR_COLUMN=1 PWD=C:/Users/WangHao COMPUTERNAME=WANGHAOITX COMSPEC=C:\Windows\system32\cmd.exe CONDA_EXE=J:/Software/Scoop/apps/miniconda/current/Scripts/conda.exe CUDA_PATH_V12_3=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3 GIT_INSTALL_ROOT=J:/Software/Scoop/apps/git/current HOMEPATH=/Users/WangHao HTTP_PROXY=http://127.0.0.1:7890 ONEDRIVE=C:/Users/WangHao/OneDrive POSH_INSTALLER=winget CONDA_PREFIX=J:/Software/Scoop/apps/miniconda/current/envs/tensor DRIVERDATA=C:/Windows/System32/Drivers/DriverData CUDA_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3 PROCESSOR_LEVEL=25 POSH_PID=144612 WSLENV=WT_SESSION:WT_PROFILE_ID:
Name and Version
llama.cpp e54c35e4fb5777c76316a50671640e6e144c9538
Operating System: Windows 10
Compiler: GCC 14.1.0
CUDA Version: 12.3
GPU: NVIDIA GeForce RTX 4080
What operating system are you seeing the problem on?
Windows
Relevant log output
The compilation process fails with the following error message:

W:/Code/CCpp/FunnyProject/llama.cpp $ make GGML_CUDA=1
I ccache not found. Consider installing it for faster compilation.
I llama.cpp build info:
I UNAME_S:   Windows_NT
I UNAME_P:   unknown
I UNAME_M:   x86_64
I CFLAGS:    -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -DNDEBUG -D_WIN32_WINNT=0x602 -DGGML_USE_OPENMP -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/targets/x86_64-linux/include -DGGML_CUDA_USE_GRAPHS -std=c11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -march=native -mtune=native -Xassembler -muse-unaligned-vector-move -fopenmp -Wdouble-promotion
I CXXFLAGS:  -std=c++11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Xassembler -muse-unaligned-vector-move -fopenmp -march=native -mtune=native -Wno-array-bounds -Wno-format-truncation -Wextra-semi -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -DNDEBUG -D_WIN32_WINNT=0x602 -DGGML_USE_OPENMP -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/targets/x86_64-linux/include -DGGML_CUDA_USE_GRAPHS
I NVCCFLAGS: -std=c++11 -O3 -g -use_fast_math --forward-unknown-to-host-compiler -Wno-deprecated-gpu-targets -arch=compute_89 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128
I LDFLAGS:   -lcuda -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -LC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/lib64 -L/usr/lib64 -LC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/targets/x86_64-linux/lib -LC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/lib64/stubs -L/usr/lib/wsl/lib
I CC:        cc (GCC) 14.1.0
I CXX:       c++ (GCC) 14.1.0
I NVCC:      Build cuda_12.3.r12.3/compiler.33281558_0
grep: unknown option -- P
BusyBox v1.37.0.git-5301-gda71f7c57 (2024-05-08 15:37:43 UTC)
Usage: grep [-HhnlLoqvsrRiwFE] [-m N] [-A|B|C N] { PATTERN | -e PATTERN... | -f FILE... } [FILE]...
Search for PATTERN in FILEs (or stdin)
    -H          Add 'filename:' prefix
    -h          Do not add 'filename:' prefix
    -n          Add 'line_no:' prefix
    -l          Show only names of files that match
    -L          Show only names of files that don't match
    -c          Show only count of matching lines
    -o          Show only the matching part of line
    -q          Quiet. Return 0 if PATTERN is found, 1 otherwise
    -v          Select non-matching lines
    -s          Suppress open and read errors
    -r          Recurse
    -R          Recurse and dereference symlinks
    -i          Ignore case
    -w          Match whole words only
    -x          Match whole lines only
    -F          PATTERN is a literal (not regexp)
    -E          PATTERN is an extended regexp
    -m N        Match up to N times per file
    -A N        Print N lines of trailing context
    -B N        Print N lines of leading context
    -C N        Same as '-A N -B N'
    -e PTRN     Pattern to match
    -f FILE     Read pattern from file
nvcc -std=c++11 -O3 -g -use_fast_math --forward-unknown-to-host-compiler -Wno-deprecated-gpu-targets -arch=compute_89 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -DNDEBUG -D_WIN32_WINNT=0x602 -DGGML_USE_OPENMP -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/targets/x86_64-linux/include -DGGML_CUDA_USE_GRAPHS -Xcompiler "-std=c++11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Xassembler -muse-unaligned-vector-move -fopenmp -Wno-array-bounds -Wno-pedantic" -c ggml/src/ggml-cuda.cu -o ggml/src/ggml-cuda.o
nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
make: *** [Makefile:745: ggml/src/ggml-cuda.o] Error 1
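For what it's worth, the "single input file" fatal error is consistent with the unquoted include paths being split at their spaces: everything after `-IC:/Program` reaches nvcc as separate arguments, and bare words like `Files/NVIDIA` look like additional input files next to `ggml-cuda.cu`, which is illegal together with `-o`. A plain-shell illustration of how such a command line tokenizes (no nvcc needed; abbreviated to one `-I` flag):

```shell
# How an unquoted path with spaces tokenizes on a compiler command line.
cmd='-IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include -c ggml/src/ggml-cuda.cu -o ggml/src/ggml-cuda.o'
set -- $cmd
# One argument per line: "Files/NVIDIA", "GPU", ... end up as
# free-standing arguments, i.e. extra "input files" from nvcc's view.
for arg in "$@"; do printf '%s\n' "$arg"; done
```

This matches the symptom the earlier commenter saw: dropping `-o` silences this particular check, but the mangled `-I` arguments remain.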
Any improvements?
I later gave up compiling on Windows and switched to WSL (Windows Subsystem for Linux) for compilation, as I didn't have time to dig into the issues and there weren't enough reference materials to work from.
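For completeness, the WSL route looks roughly like this (a sketch, assuming Ubuntu under WSL2 with NVIDIA's CUDA-on-WSL toolkit already installed; package names and paths may differ on other setups):

```shell
# Sketch: build llama.cpp with CUDA inside WSL instead of MinGW.
sudo apt install build-essential       # host compiler for nvcc
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
export CUDA_DOCKER_ARCH=compute_89     # match the GPU (RTX 4080 is sm_89)
make GGML_CUDA=1 -j"$(nproc)"
```

Under Linux the CUDA toolkit lives in space-free paths like /usr/local/cuda, so the Makefile's unquoted `-I`/`-L` flags work as intended.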