karpathy / llm.c

LLM training in simple, raw C/CUDA
MIT License

`make` fails to autodetect GPU compute capability #387

Open aaakulchyk opened 5 months ago

aaakulchyk commented 5 months ago

Running make (e.g., make test_gpt2) on my PC outputs the following:

make: __nvcc_device_query: No such file or directory
"Detected GPU compute capability: "
---------------------------------------------
→ cuDNN is manually disabled by default, run make with `USE_CUDNN=1` to try to enable
✓ OpenMP found
✓ OpenMPI found, OK to train with multiple GPUs
✓ nvcc found, including GPU/CUDA support
---------------------------------------------
cc -Ofast -Wno-unused-result -Wno-ignored-pragmas -Wno-unknown-attributes -march=native -fopenmp -DOMP   test_gpt2.c -lm -lgomp -o test_gpt2

This happens even though my PC has an RTX 4090 and, as can be seen, nvcc is found. I have already found a solution that relies on nvidia-smi rather than __nvcc_device_query (which looks suspiciously like an intentionally hidden or temporary file), and the problem is gone. With this change, make stops complaining about __nvcc_device_query:

---------------------------------------------
→ cuDNN is manually disabled by default, run make with `USE_CUDNN=1` to try to enable
✓ OpenMP found
✓ OpenMPI found, OK to train with multiple GPUs
✓ nvcc found, including GPU/CUDA support
---------------------------------------------
cc -Ofast -Wno-unused-result -Wno-ignored-pragmas -Wno-unknown-attributes -march=native -fopenmp -DOMP   test_gpt2.c -lm -lgomp -o test_gpt2
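
For reference, the nvidia-smi approach can be sketched roughly as follows. This is not the exact change from this issue, only a minimal sketch that assumes a driver recent enough for nvidia-smi to support the `compute_cap` query field. The function names `normalize_compute_cap` and `detect_compute_cap` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn lines such as "8.9" into "89" and keep the lowest
# value, so that a multi-GPU machine targets its least capable card.
normalize_compute_cap() {
    sed 's/\.//g' | sort -n | head -n 1
}

# Hypothetical helper: ask the driver for the compute capability of each GPU.
# Assumes nvidia-smi is on PATH and supports the compute_cap query field.
detect_compute_cap() {
    nvidia-smi --query-gpu=compute_cap --format=csv,noheader | normalize_compute_cap
}

# Only query the driver when nvidia-smi is actually available.
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "Detected GPU compute capability: $(detect_compute_cap)"
fi
```

In a Makefile, the same pipeline could be wrapped in `$(shell ...)` and the result passed to nvcc. An RTX 4090 has compute capability 8.9, so the normalized value there would be 89.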

rosslwheeler commented 5 months ago

@akulchik - what toolkit version are you using and what OS?

rosslwheeler commented 5 months ago

I just checked an older CUDA 11.7 SDK and the file is there. I think your installation might have a problem. I do like your change, but I'm not sure if it's urgent unless it's critical that we support the older SDKs with the auto-detect. Can you try either reinstalling or using the latest 12.4.1 SDK? The file is supposed to be installed with nvcc.
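
A quick way to check whether the installation is incomplete is to look for the helper next to nvcc. This is a minimal sketch, assuming `__nvcc_device_query` sits in the same directory as the `nvcc` binary. The name `has_device_query` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory contains an executable
# __nvcc_device_query.
has_device_query() {
    [ -x "$1/__nvcc_device_query" ]
}

# Locate nvcc on PATH; the assumption is that the helper is installed beside it.
nvcc_path=$(command -v nvcc 2>/dev/null || true)
if [ -z "$nvcc_path" ]; then
    echo "nvcc not found on PATH"
elif has_device_query "$(dirname "$nvcc_path")"; then
    echo "__nvcc_device_query found in $(dirname "$nvcc_path")"
else
    echo "__nvcc_device_query missing from $(dirname "$nvcc_path")"
fi
```

If nvcc on PATH is a symlink or a wrapper script, the reported directory may not be the real toolkit directory, so a "missing" result should be double-checked against the actual install location.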

gordicaleksa commented 4 months ago

Hey @akulchik, are you still having problems with this?