Runtime errors related to ninja_build

diana273 commented 2 years ago

I tried to run your tutorial from bootstrap/mnist/train.ipynb, but it crashes with runtime errors related to ninja_build (see details below). With "device = torch.device('cpu')" it works fine, but it fails with "device = torch.device('cuda')". I have all the requirements properly installed. While troubleshooting the build.ninja file I discovered that the x86 VS linker is used instead of the 64-bit one.

Do you have any idea why these errors are occurring?

Traceback (most recent call last):
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\utils\cpp_extension.py", line 1667, in _run_ninja_build
    subprocess.run(
  File "C:\Users\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/github/intel-py38-lava011/lava-dl/tutorials/lava/lib/dl/bootstrap/mnist/mnist_train_example.py", line 116, in <module>
    output = net.forward(input, mode)
  File "C:/Users/github/intel-py38-lava011/lava-dl/tutorials/lava/lib/dl/bootstrap/mnist/mnist_train_example.py", line 53, in forward
    x = block(x, mode=m)
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\bootstrap\block\cuba.py", line 36, in forward
    return AbstractBlock.forward(self, x, mode)
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\bootstrap\block\base.py", line 124, in forward
    x = self._forward_snn(x, sample=True)
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\bootstrap\block\base.py", line 89, in _forward_snn
    x = self.neuron(z)
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\neuron\cuba.py", line 433, in forward
    _, voltage = self.dynamics(input)
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\neuron\cuba.py", line 351, in dynamics
    current = leaky_integrator.dynamics(
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\neuron\dynamics\leaky_integrator.py", line 95, in dynamics
    output = Accelerated.leaky_integrator.dynamics(
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\utils\utils.py", line 14, in __get__
    return staticmethod(self.fget).__get__(None, owner)()
  File "C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\neuron\dynamics\leaky_integrator.py", line 40, in leaky_integrator
    Accelerated.module = load(
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\utils\cpp_extension.py", line 1079, in load
    return _jit_compile(
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\utils\cpp_extension.py", line 1292, in _jit_compile
    _write_ninja_file_and_build_library(
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\utils\cpp_extension.py", line 1404, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "C:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\utils\cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'dynamics': [1/2] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --generate-dependencies-with-compile --dependency-output leaky_integrator.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=dynamics -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\TH -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\AppData\Local\Programs\Python\Python38\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_50,code=compute_50 -gencode=arch=compute_50,code=sm_50 -c C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\neuron\dynamics\leaky_integrator.cu -o leaky_integrator.cuda.o
FAILED: leaky_integrator.cuda.o
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --generate-dependencies-with-compile --dependency-output leaky_integrator.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=dynamics -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\TH -IC:\Users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\AppData\Local\Programs\Python\Python38\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_50,code=compute_50 -gencode=arch=compute_50,code=sm_50 -c C:\Users\github\intel-py38-lava011\lava-dl\src\lava\lib\dl\slayer\neuron\dynamics\leaky_integrator.cu -o leaky_integrator.cuda.o
C:/Users/github/intel-py38-lava011/python38_venv1/lib/site-packages/torch/include\c10/macros/Macros.h(189): warning C4067: unexpected tokens following preprocessor directive - expected a newline
c:\users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\pybind11\detail/common.h(108): warning C4005: 'HAVE_SNPRINTF': macro redefinition
c:\users\appdata\local\programs\python\python38\include\pyerrors.h(315): note: see previous definition of 'HAVE_SNPRINTF'
C:/Users/github/intel-py38-lava011/python38_venv1/lib/site-packages/torch/include\c10/macros/Macros.h(189): warning C4067: unexpected tokens following preprocessor directive - expected a newline
c:\users\github\intel-py38-lava011\python38_venv1\lib\site-packages\torch\include\pybind11\detail/common.h(108): warning C4005: 'HAVE_SNPRINTF': macro redefinition
c:\users\appdata\local\programs\python\python38\include\pyerrors.h(315): note: see previous definition of 'HAVE_SNPRINTF'
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime.h(184): error: invalid redeclaration of type name "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new.h(66): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new.h(71): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new.h(77): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new.h(82): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new.h(184): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new.h(199): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new_debug.h(23): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\vcruntime_new_debug.h(31): error: first parameter of allocation function must be of type "size_t"

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\type_traits(168): error: class template "std::_Is_function" has already been defined

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\type_traits(212): error: class template "std::_Is_memfunptr" has already been defined

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include\type_traits(1849): error: class template "std::result_of" has already been defined

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/common_functions.h(117): error: first parameter of allocation function must be of type "size_t"

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/common_functions.h(118): error: first parameter of allocation function must be of type "size_t"

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/common_functions.h(240): error: first parameter of allocation function must be of type "size_t"

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/common_functions.h(241): error: first parameter of allocation function must be of type "size_t"

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(104): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(105): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(109): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(110): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(111): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(112): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(113): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(114): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(115): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(116): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(117): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(118): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(119): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(120): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(122): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(123): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(124): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(125): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(126): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(127): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(128): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(129): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(130): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(131): error: asm operand type size(8) does not match type/size implied by constraint 'r'

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\sm_32_intrinsics.hpp(132): error: asm operand type size(8) does not match type/size implied by constraint 'r'

Error limit reached.
100 errors detected in the compilation of "C:/Users/AppData/Local/Temp/tmpxft_00002444_00000000-10_leaky_integrator.cpp1.ii".
Compilation terminated.
leaky_integrator.cu
ninja: build stopped: subcommand failed.
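The report above mentions that the x86 toolchain is picked up instead of the 64-bit one. A quick way to check which cl.exe is first on PATH is sketched below. It assumes the usual Visual Studio 2017+ layout, where cl.exe lives under bin\Host<arch>\<target arch>\; the function name msvc_target_arch and the example path in the usage note are hypothetical, not taken from the log.

```python
import shutil
from pathlib import PureWindowsPath


def msvc_target_arch(cl_path):
    """Return the target-architecture folder ('x64', 'x86', ...) of a cl.exe path.

    Assumes the layout ...\\bin\\Host<arch>\\<target>\\cl.exe and returns None
    when the path does not look like that.
    """
    parts = [p.lower() for p in PureWindowsPath(cl_path).parts]
    looks_like_msvc = (
        len(parts) >= 4
        and parts[-1] == "cl.exe"
        and parts[-3].startswith("host")
        and parts[-4] == "bin"
    )
    return parts[-2] if looks_like_msvc else None


if __name__ == "__main__":
    found = shutil.which("cl")  # None when cl.exe is not on PATH
    if found is None:
        print("cl.exe is not on PATH")
    else:
        print(found, "->", msvc_target_arch(found))
```

For a hypothetical path such as C:\...\MSVC\14.16.27023\bin\Hostx86\x86\cl.exe this reports x86; a 64-bit CUDA build would be expected to need the x64 target instead.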

mgkwill commented 2 years ago

Hi @diana273, thanks for reporting this issue!

It looks like @bamsumit is looking into it.

daevem commented 1 year ago

Hi, I was just wondering if this has been resolved. I have been using lava-dl 0.2.0 and noticed that this issue arose especially when I was trying to use two different types of neuron models. I then found that all the models were being JIT-compiled and loaded under the same name (dynamics), which caused conflicts. I changed the names (e.g., cuba_dynamics or alif_dynamics) so that they differ for every neuron, and I have stopped receiving this kind of error, but I'm unsure if this is OK for other use cases.

EDIT: Manually setting the path to the correct version of Visual Studio was also part of the solution to the problem for me.
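The renaming workaround described above could look roughly like the following. This is a minimal sketch, not the lava-dl implementation: extension_name and load_dynamics are hypothetical helpers, and the only real API used is torch.utils.cpp_extension.load with its name and sources arguments.

```python
def extension_name(neuron_kind, base="dynamics"):
    """Build a per-neuron extension name such as 'cuba_dynamics'."""
    if not neuron_kind.isidentifier():
        raise ValueError(f"not a valid module name prefix: {neuron_kind!r}")
    return f"{neuron_kind}_{base}"


def load_dynamics(neuron_kind, sources):
    """JIT-compile the given sources under a name unique to the neuron type."""
    # Imported lazily so the naming helper works without PyTorch installed.
    from torch.utils.cpp_extension import load

    return load(name=extension_name(neuron_kind), sources=sources)
```

With distinct names, each neuron model gets its own build directory and module name, so two models loaded in the same process no longer compete for the name dynamics.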

bamsumit commented 1 year ago

Hi @daevem, the issue for Windows has not been resolved. I don't see the problem with the shared name dynamics in a Linux environment.

import torch
from lava.lib.dl import slayer
device = torch.device('cuda')
cuba = slayer.neuron.cuba.Neuron(threshold=10, current_decay=0.5, voltage_decay=0.5).to(device)
alif = slayer.neuron.alif.Neuron(threshold=10, current_decay=0.5, voltage_decay=0.5, threshold_step=1, threshold_decay=0, refractory_decay=0).to(device)
x = torch.rand([1, 10, 100]).to(device)
alif(cuba(x))

It is fine to change the name of dynamics to a unique name if that works in Windows. If you have made progress toward fixing the Windows issue, the steps would be useful for other Windows users.

ParsaOmidi commented 1 year ago

The same error keeps appearing on my Linux machines (I tested it on 2 different machines). Are there any solutions to this problem? The code runs fine on the CPU, but slowly. My environment is:

Ubuntu 20.04, python 3.9, PyTorch 2, cudnn 11.7

RuntimeError: Error building extension 'dynamics': [1/1] c++ leaky_integrator.cuda.o -shared -L/home/rescue/anaconda3/envs/lavaEnv/lib/python3.9/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/home/rescue/anaconda3/lib64 -lcudart -o dynamics.so
FAILED: dynamics.so
c++ leaky_integrator.cuda.o -shared -L/home/rescue/anaconda3/envs/lavaEnv/lib/python3.9/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/home/rescue/anaconda3/lib64 -lcudart -o dynamics.so
/usr/bin/ld: cannot find -lcudart
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
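The linker message "cannot find -lcudart" means that no libcudart was found in the directories passed with -L. A small sketch for checking candidate directories follows; find_cudart is a hypothetical helper, and /usr/local/cuda/lib64 in the usage note is only an assumed common install location.

```python
from pathlib import Path


def find_cudart(lib_dirs):
    """Return all libcudart.so* files found directly inside the given directories."""
    hits = []
    for directory in lib_dirs:
        path = Path(directory)
        if path.is_dir():
            # Matches libcudart.so as well as versioned names like libcudart.so.11.0
            hits.extend(sorted(str(f) for f in path.glob("libcudart.so*")))
    return hits
```

For the log above, one would call find_cudart(["/home/rescue/anaconda3/lib64", "/usr/local/cuda/lib64"]). An empty list for the first directory would be consistent with the error, since -lcudart needs a file named libcudart.so (or libcudart.a) in one of the -L directories.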

ParsaOmidi commented 1 year ago

On a Windows machine, I received a similar error. As I tracked down the error, I found this in the file cpp_extension.py:

cl_paths = subprocess.check_output(['where', 'cl']).decode(*SUBPROCESS_DECODE_ARGS).split('\r\n')

This call fails when cl.exe cannot be found on the PATH. To resolve the issue, I installed a new version of MSVC C++ and located the directory in MSVC that contains cl.exe. I then added that directory to my system PATH via "Environment Variables". After rebooting the computer, everything worked perfectly. Hope this helps someone.
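Before starting a CUDA run, the same lookup can be done up front from Python. The sketch below uses shutil.which, which searches PATH much like the where command; missing_tools and its default tool list are illustrative choices, not part of PyTorch or lava-dl.

```python
import shutil


def missing_tools(tools=("cl", "ninja", "nvcc")):
    """Return the subset of build tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All build tools found")
```

On Linux the host compiler is not cl, so a call such as missing_tools(("c++", "ninja", "nvcc")) would be the closer equivalent.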

bamsumit commented 1 year ago

Hi @ParsaOmidi, the issue has not been resolved for Windows. It's a PyTorch issue with compiling extensions on Windows. For your Linux system, it looks like the CUDA library needed for compilation (cudart) is missing. Most likely you either do not have the CUDA compiler, nvcc, installed or the path is not configured correctly.

Try:

nvcc --version

to check if it is installed. If it is not installed, please install it first. Make sure the CUDA version for nvcc matches your NVIDIA runtime CUDA version and your PyTorch CUDA version.
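The version comparison can be scripted. The following sketch parses the "release X.Y" field from nvcc --version output and compares its major version with a PyTorch CUDA version string such as the one in torch.version.cuda. Both function names are hypothetical, and comparing only the major version is an assumption about what matters most, not a documented rule.

```python
import re


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def same_major(nvcc_release, torch_cuda):
    """Compare the nvcc major version with a string like '11.7'.

    torch_cuda may be None (CPU-only PyTorch build), which counts as a mismatch.
    """
    if nvcc_release is None or not torch_cuda:
        return False
    return nvcc_release[0] == int(torch_cuda.split(".")[0])
```

In practice the text would come from subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout and the second argument from torch.version.cuda.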

If it still does not work after installing nvcc, you might need to export some environment variables like this:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
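The two export lines prepend the CUDA directories and add a separator only when the variable already has a value. The same logic in Python, for instance to build an environment for a subprocess, might look like the sketch below. The helper names are hypothetical, and /usr/local/cuda is simply the location used in the exports above.

```python
def prepend(new_dir, existing, sep=":"):
    """Mirror the shell idiom DIR${VAR:+:${VAR}}: no trailing separator if VAR is empty."""
    return new_dir + (sep + existing if existing else "")


def with_cuda_paths(env, cuda_root="/usr/local/cuda"):
    """Return a copy of env with the CUDA bin and lib64 directories prepended."""
    updated = dict(env)
    updated["PATH"] = prepend(f"{cuda_root}/bin", env.get("PATH", ""))
    updated["LD_LIBRARY_PATH"] = prepend(
        f"{cuda_root}/lib64", env.get("LD_LIBRARY_PATH", "")
    )
    return updated
```

The result of with_cuda_paths(os.environ) can be passed as env= to subprocess.run when launching the training script from another Python process.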

R-Gaurav commented 6 months ago

I was facing a similar issue on Ubuntu 22.04, and as per @bamsumit's comment above, I installed the cuda toolkit to resolve my issue and it helped. However, there are some nuances to installing nvcc for different GPUs which I would like to share here; hopefully useful to someone else too!

TLDR:

Check the Compute Capability of your NVIDIA GPU and install a version of the NVIDIA CUDA toolkit (i.e., nvcc) that supports your GPU to get rid of the compilation issues. You can find the supported mappings here.

More details with my trials/failures/successes below.

I have three machines: one with an RTX 2080, another with an RTX A2000, and the last one with an RTX 4060 Ti. All three machines are running Ubuntu 22.04, and nvidia-smi on them reports the following output.

RTX 2080 and RTX 4060 Ti:

NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2

RTX A2000:

NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2

I describe each installation case below.

RTX A2000 -- Compute Capability: 8.6

The installation through the lava_dl-0.5.0 binary went smoothly until I faced a ninja build issue (similar to the one reported here) when I ran my lava network. I checked nvcc --version and it produced:

Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit

Upon installing nvidia-cuda-toolkit via the above command my ninja build issue was resolved; and nvcc --version produced:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

Note that the CUDA version of nvcc above is 11.5, which supports Compute Capability 8.6. However, I could not find a cuda symlink or directory under /usr/local/, and as far as I remember I did not set the env vars either (unlike what @bamsumit suggested above).

RTX 4060 Ti -- Compute Capability: 8.9

Installation through the lava_dl-0.5.0 binary was a breeze until I hit a ninja build issue again. nvcc --version produced the same "not found" message as above, suggesting that I install nvidia-cuda-toolkit. I did so, but to my surprise it did not resolve the issue, even though nvcc --version then worked and reported the same CUDA version, 11.5. After some extensive searching I understood that the CUDA version of nvidia-cuda-toolkit (i.e., of nvcc) must support the GPU's Compute Capability, as per the mapping linked above. Therefore I installed the latest 12.2 CUDA toolkit from here, which supports Compute Capability 8.9. Note that I installed only the toolkit and not the driver; the installer prompts you with options to do so. This time I did find the cuda symlink pointing to cuda-12.2 in /usr/local/, as used above by @bamsumit. After setting the suggested env vars, PATH and LD_LIBRARY_PATH, I got past the build issue and my network compiled successfully. nvcc --version now reports:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

Note that cuda version above is 12.2, which is supported on 8.9.
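The A2000 and 4060 Ti cases can be summarized as a lookup from Compute Capability to the earliest CUDA toolkit that can target it. The table below is an assumption based on NVIDIA's release notes as I understand them (8.0 from CUDA 11.0, 8.6 from 11.1, 8.9 from 11.8) and should be verified against the mapping linked above; MIN_TOOLKIT and toolkit_supports are hypothetical names, and only the Ampere and Ada entries relevant to these two cases are listed.

```python
# Earliest CUDA toolkit release assumed to support each Compute Capability.
# Illustrative and incomplete; check NVIDIA's documentation before relying on it.
MIN_TOOLKIT = {
    (8, 0): (11, 0),
    (8, 6): (11, 1),
    (8, 9): (11, 8),
}


def toolkit_supports(capability, toolkit):
    """Return True/False if the table knows the capability, None otherwise."""
    minimum = MIN_TOOLKIT.get(tuple(capability))
    if minimum is None:
        return None  # unknown capability: no claim either way
    return tuple(toolkit) >= minimum
```

The capability tuple can be obtained with torch.cuda.get_device_capability(). Under these assumptions, the apt-installed 11.5 toolkit passes for the A2000 (8.6) and fails for the 4060 Ti (8.9), which matches the behaviour described above.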

RTX 2080 -- Compute Capability: 7.5

Here too, sudo apt install nvidia-cuda-toolkit installed the 11.5 CUDA version of nvcc, and it didn't work. For Compute Capability 7.5, as I read the mapping, the max supported CUDA toolkit version is 10.2, and although I installed it following the same process as for the RTX 4060 Ti, the compilation of my lava network still failed with the following error:

.
.
.
nvcc fatal   : Value 'c++17' is not defined for option 'std'
ninja: build stopped: subcommand failed.

which I suppose has got something to do with setting up the CMake properly as described here, although I haven't investigated it further yet.
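One possible explanation for this last error, independent of CMake: the build passes -std=c++17 to nvcc, and to my knowledge nvcc only accepts that value from CUDA 11.0 onward, so a 10.2 toolkit would reject it. The sketch below encodes that assumption; the table and the function name nvcc_accepts_std are hypothetical and the version thresholds should be checked against NVIDIA's release notes.

```python
# Earliest CUDA release assumed to accept each -std value (illustrative only).
MIN_RELEASE_FOR_STD = {
    "c++14": (9, 0),
    "c++17": (11, 0),
}


def nvcc_accepts_std(release, std):
    """Return whether an nvcc release, e.g. (10, 2), is expected to accept -std=<std>."""
    minimum = MIN_RELEASE_FOR_STD.get(std)
    if minimum is None:
        return None  # standard not covered by this table
    return tuple(release) >= minimum
```

Under this assumption nvcc_accepts_std((10, 2), "c++17") is False, while the 12.2 toolkit used for the RTX 4060 Ti passes.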

NOTE: If you have mistakenly installed nvidia-cuda-toolkit via apt install, and need to install the cuda toolkit directly from NVIDIA's website, do purge your current nvidia-cuda-toolkit install before proceeding ahead; I followed the steps here -- I executed only sudo apt-get purge nvidia-cuda-toolkit (I suppose --auto-remove purges the nvidia drivers too which I did not want).

Hope this detailed info helps you!

lava-nc / lava-dl

Runtime errors related to ninja_build #14