ZaidQureshi / bam

BSD 2-Clause "Simplified" License
126 stars, 32 forks

failed to install cuda on ubuntu20.04.3, need expert help #41

Open gaowayne opened 2 hours ago

gaowayne commented 2 hours ago

Describe the bug

Installing CUDA fails on Ubuntu 20.04.3 with kernel version 5.8.

root@salab-hpedl380g11-01:~/wayne/cuda# apt-get install cuda-toolkit
Reading package lists... Done
Building dependency tree       
Reading state information... Done
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
 libnvidia-decode-560 : Depends: libnvidia-compute-560 (= 560.35.03-0ubuntu1) but 560.35.03-0ubuntu0~gpu20.04.4 is to be installed
 nvidia-driver-560-open : Depends: libnvidia-compute-560 (= 560.35.03-0ubuntu1) but 560.35.03-0ubuntu0~gpu20.04.4 is to be installed
                          Recommends: libnvidia-compute-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-decode-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-encode-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-fbc1-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-gl-560:i386 (= 560.35.03-0ubuntu1)
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).
root@salab-hpedl380g11-01:~/wayne/cuda# apt-get --fix-broken install cuda-toolkit
Reading package lists... Done
Building dependency tree       
Reading state information... Done
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
 libnvidia-decode-560 : Depends: libnvidia-compute-560 (= 560.35.03-0ubuntu1) but 560.35.03-0ubuntu0~gpu20.04.4 is to be installed
 nvidia-driver-560-open : Depends: libnvidia-compute-560 (= 560.35.03-0ubuntu1) but 560.35.03-0ubuntu0~gpu20.04.4 is to be installed
                          Recommends: libnvidia-compute-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-decode-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-encode-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-fbc1-560:i386 (= 560.35.03-0ubuntu1)
                          Recommends: libnvidia-gl-560:i386 (= 560.35.03-0ubuntu1)
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).
root@salab-hpedl380g11-01:~/wayne/cuda# 
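A note on reading the output above: both unmet dependencies name the same package, libnvidia-compute-560, in two different builds of the same driver version (0ubuntu1 is required, 0ubuntu0~gpu20.04.4 is the install candidate). That pattern usually means two package sources both provide the 560 driver, although the log alone does not say which sources they are. The sketch below is a hypothetical helper, not part of bam or apt, that condenses such an error into one line per conflicting package.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bam or apt): reads apt's
# "unmet dependencies" output on stdin and prints one line per
# dependency whose required version differs from the install candidate.
list_version_conflicts() {
  sed -nE 's/^ *([^ ]+) : Depends: ([^ ]+) \(= ([^)]+)\) but ([^ ]+) is to be installed.*/\2 required=\3 candidate=\4/p' \
    | sort -u
}

# Possible usage (assumes apt-get is available; -s only simulates the install):
#   apt-get -s install cuda-toolkit 2>&1 | list_version_conflicts
# To see which repository offers each version of the package:
#   apt-cache policy libnvidia-compute-560
```

If apt-cache policy shows the two versions coming from different repositories, pinning or disabling one of them is a common way to let apt pick a consistent set of packages.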

To Reproduce

Steps to reproduce the behavior:

  1. Install Ubuntu 20.04.3.
  2. Make sure the kernel is 5.8.
  3. Install the kernel driver; this works fine.
  4. Try to install the CUDA driver, then build the GIDS code.
  5. See the failure above.

Expected behavior

CUDA installs successfully.

gaowayne commented 2 hours ago

@msharmavikram Hello, after I ran apt with --fix-broken, the error above went away. libnvm now builds, but make benchmarks fails with a CUDA build error:

[100%] Linking CXX shared library lib/libnvm.so
[100%] Built target libnvm
root@salab-hpedl380g11-01:~/wayne/bam/bam/build# make benchmarks
[ 29%] Built target libnvm
[ 32%] Building NVCC (Device) object benchmarks/vectoradd/CMakeFiles/vectoradd-benchmark-module.dir/vectoradd-benchmark-module_generated_main.cu.o
nvcc fatal   : Unsupported gpu architecture 'compute_80'
CMake Error at vectoradd-benchmark-module_generated_main.cu.o.cmake:220 (message):
  Error generating
  /root/wayne/bam/bam/build/benchmarks/vectoradd/CMakeFiles/vectoradd-benchmark-module.dir//./vectoradd-benchmark-module_generated_main.cu.o

make[3]: *** [benchmarks/vectoradd/CMakeFiles/vectoradd-benchmark-module.dir/build.make:65: benchmarks/vectoradd/CMakeFiles/vectoradd-benchmark-module.dir/vectoradd-benchmark-module_generated_main.cu.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:1109: benchmarks/vectoradd/CMakeFiles/vectoradd-benchmark-module.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:361: CMakeFiles/benchmarks.dir/rule] Error 2
make: *** [Makefile:164: benchmarks] Error 2
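
The "Unsupported gpu architecture 'compute_80'" message comes from nvcc itself. Support for compute_80 was introduced in CUDA 11.0, so the error suggests the nvcc that CMake picked up is older than that, possibly a different install from the toolkit that was just set up. The sketch below is one way to check this; the function names are hypothetical helpers, and it assumes nvcc --version prints its usual "release X.Y" line.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking whether an nvcc is new enough
# to accept -arch=compute_80 (introduced in CUDA 11.0).

# Extract "major.minor" from `nvcc --version` output on stdin.
nvcc_release() {
  sed -nE 's/.*release ([0-9]+)\.([0-9]+).*/\1.\2/p'
}

# Succeed (exit status 0) if the given release is 11.0 or newer.
supports_compute_80() {
  local major="${1%%.*}"
  [ "$major" -ge 11 ]
}

# Possible usage (assumes nvcc is on PATH):
#   which nvcc
#   rel="$(nvcc --version | nvcc_release)"
#   if supports_compute_80 "$rel"; then echo "nvcc $rel: ok"; else echo "nvcc $rel: too old"; fi
```

If an older nvcc turns out to be first on PATH, pointing the build at the newer toolkit may help. The "Building NVCC (Device) object" lines look like CMake's FindCUDA module, which reads CUDA_TOOLKIT_ROOT_DIR, so re-running cmake with -DCUDA_TOOLKIT_ROOT_DIR set to the newer install (often /usr/local/cuda, but that path is an assumption here) in a clean build directory is worth trying.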