SPECFEM / specfem2d

SPECFEM2D simulates forward and adjoint seismic wave propagation in two-dimensional acoustic, (an)elastic, poroelastic or coupled acoustic-(an)elastic-poroelastic media, with Convolution PML absorbing conditions.

https://specfem.org

GNU General Public License v3.0

GPU configuration: The problem is maybe -arch sm_13 instead of -arch sm_11 in the Makefile, please doublecheck #1199

Open trapprb8 opened 9 months ago

trapprb8 commented 9 months ago

When I start a simulation in gpu mode I get the following error message:

Error in setConst_hprime_xx: invalid device symbol
The problem is maybe -arch sm_13 instead of -arch sm_11 in the Makefile, please doublecheck

I am trying to configure with a Quadro P4000, which should be Pascal architecture, so I guess cuda8 should be used in the configuration (according to the overview in the Makefile, see below)?

I used the following code:

$ ./configure FC=gfortran CC=gcc --with-mpi MPIFC=mpif90 USE_BUNDLED_SCOTCH=1 --with-cuda=cuda8 CUDA_LIB=/usr/local/cuda/lib64
$ make

Overview in makefile:

# CUDA architecture / code version
# Fermi   (not supported): -gencode=arch=compute_10,code=sm_10
# Tesla   (Tesla C2050, GeForce GTX 480): -gencode=arch=compute_20,code=sm_20
# Tesla   (cuda4, K10, Geforce GTX 650, GT 650m): -gencode=arch=compute_30,code=sm_30
# Kepler  (cuda5, K20) : -gencode=arch=compute_35,code=sm_35
# Kepler  (cuda6.5, K80): -gencode=arch=compute_37,code=sm_37
# Maxwell (cuda6.5+/cuda7, Quadro K2200): -gencode=arch=compute_50,code=sm_50
# Pascal  (cuda8,P100, GeForce GTX 1080, Titan): -gencode=arch=compute_60,code=sm_60
# Volta   (cuda9, V100): -gencode=arch=compute_70,code=sm_70
# Turing  (cuda10, T4, GeForce RTX 2080): -gencode=arch=compute_75,code=sm_75
# Ampere  (cuda11, A100, GeForce RTX 3080): -gencode=arch=compute_80,code=sm_80
# Hopper  (cuda12, H100): -gencode=arch=compute_90,code=sm_90

danielpeter commented 9 months ago

the Quadro P4000 has CUDA compute capability 6.1. that means you will likely have to modify the Makefile a bit after configuration and instead of

-gencode=arch=compute_60,code=sm_60

use:

-gencode=arch=compute_61,code=sm_61
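For reference, the flag above follows directly from the compute capability: drop the dot and use the digits for both `compute_` and `sm_`. A minimal sketch of that mapping (the function name `gencode_flag` is made up for illustration and is not part of SPECFEM2D):

```shell
#!/bin/sh
# Hypothetical helper (not part of SPECFEM2D): turn a compute capability
# such as "6.1" into the matching nvcc -gencode flag.
gencode_flag() {
  cc=$(printf '%s' "$1" | tr -d '.')   # "6.1" -> "61"
  printf '%s\n' "-gencode=arch=compute_${cc},code=sm_${cc}"
}

# Quadro P4000 reports compute capability 6.1
gencode_flag 6.1
```

For the P4000 this prints the same flag as suggested above.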

trapprb8 commented 9 months ago

Thank you for your answer! :) Unfortunately, that didn't work yet, the error stays the same. What I did now was:

in Makefile.in:

GENCODE_60 = -gencode=arch=compute_61,code=\"sm_61,compute_61\"

in Makefile:

GENCODE_60 = -gencode=arch=compute_61,code=\"sm_61,compute_61\"
GENCODE = $(GENCODE_60) $(FC_DEFINE)GPU_DEVICE_Pascal

(this last line stays the same, I just wanted to show it for completeness)

and ran:

$ ./configure FC=gfortran CC=gcc --with-mpi MPIFC=mpif90 USE_BUNDLED_SCOTCH=1 --with-cuda=cuda8 CUDA_LIB=/usr/local/cuda/lib64
$ make

danielpeter commented 9 months ago

great, thanks for the quick feedback!

note that the Makefile gets created by running the ./configure script. so, you would only need to modify either the Makefile.in before running the configuration, or the Makefile after running the configuration.
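As a rough sketch of that edit, the snippet below rewrites the `GENCODE_60` definition with `sed`. It works on a throwaway temp file rather than the real Makefile.in or Makefile, and it assumes the unmodified line uses `compute_60` / `sm_60` in the same pattern as the modified line quoted above; `patch_gencode` is a made-up name.

```shell
#!/bin/sh
# Sketch only: a temp file stands in for Makefile.in (before ./configure)
# or Makefile (after ./configure). The original line content is an assumption.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
GENCODE_60 = -gencode=arch=compute_60,code=\"sm_60,compute_60\"
GENCODE = $(GENCODE_60) $(FC_DEFINE)GPU_DEVICE_Pascal
EOF

# Hypothetical helper: switch the GENCODE_60 definition from 6.0 to 6.1.
# Only the line that defines GENCODE_60 is touched; a .bak copy is kept.
patch_gencode() {
  sed -i.bak \
    -e '/^GENCODE_60[[:space:]]*=/s/compute_60/compute_61/g' \
    -e '/^GENCODE_60[[:space:]]*=/s/sm_60/sm_61/g' "$1"
}

patch_gencode "$tmp"
grep '^GENCODE' "$tmp"      # show both lines after the edit
rm -f "$tmp" "$tmp.bak"
```

Pointing `patch_gencode` at only one of the two real files should be enough, depending on whether ./configure is run afterwards.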

trapprb8 commented 9 months ago

Hi Daniel, thanks again! :) I also did this, however it does not work. Still the same error. We are only talking about the Makefile.in and Makefile in the main directory, right? I uploaded the two files:

Makefile.txt Makefile.in.txt

danielpeter commented 9 months ago

yes, the GPU architecture is specified only in the main Makefiles in the root directory, Makefile.in and the generated one Makefile.

can you be more specific about what did not work: the compilation even with the modifications you described, or the modification of only one of the Makefiles? that is, do you still get the error

Error in setConst_hprime_xx: invalid device symbol

even with the modification

GENCODE_60 = -gencode=arch=compute_61,code=\"sm_61,compute_61\"

in these Makefiles? if so, then what are your CUDA toolkit and CUDA driver versions?

trapprb8 commented 9 months ago

Exactly, the error is the same as before:

Error in setConst_hprime_xx: invalid device symbol
The problem is maybe -arch sm_13 instead of -arch sm_11 in the Makefile, please doublecheck

The CUDA version is 11.8; nvcc --version gives me:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
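To pull just the toolkit release out of that output (handy when comparing against the driver), something like the following should do. It is a small sketch; `nvcc_release` is a made-up name, and the sample line is copied from the output above so the snippet runs without a CUDA install.

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` text on stdin and print the
# toolkit release number, e.g. "11.8".
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Sample line taken from the nvcc output quoted in this thread.
sample='Cuda compilation tools, release 11.8, V11.8.89'
printf '%s\n' "$sample" | nvcc_release

# With a toolkit installed, the live equivalent would be:
#   nvcc --version | nvcc_release
```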

danielpeter commented 9 months ago

could you also add the output of the command nvidia-smi to see the driver version on your system?

trapprb8 commented 9 months ago

This output is:


+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P4000        On   | 00000000:05:00.0  On |                  N/A |
| 46%   30C    P0    28W / 105W |    240MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2980      G   /usr/lib/xorg/Xorg                192MiB |
|    0   N/A  N/A      3511      G   cinnamon                           30MiB |
|    0   N/A  N/A      4673      G   /usr/lib/firefox/firefox           13MiB |
+-----------------------------------------------------------------------------+
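The two numbers that matter in this table are the driver version and the CUDA version the driver supports. A minimal sketch for extracting them from the banner line (`smi_versions` is a made-up name; the sample line is copied from the table above so the snippet runs without a GPU):

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi output on stdin and print the driver
# version and the highest CUDA version that driver supports.
smi_versions() {
  sed -n 's/.*Driver Version: *\([0-9.][0-9.]*\) .*CUDA Version: *\([0-9.][0-9.]*\).*/driver=\1 cuda=\2/p'
}

# Banner line taken from the nvidia-smi output quoted in this thread.
sample='| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |'
printf '%s\n' "$sample" | smi_versions

# On a machine with the driver installed:
#   nvidia-smi | smi_versions
# or, skipping the parsing:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
```

Here the driver reports CUDA 11.8, which matches the toolkit release shown by nvcc earlier in the thread.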

danielpeter commented 9 months ago

tricky... according to the toolkit documentation, that driver version looks okay for CUDA 11.8 and it should support the compute capability 6.1. unfortunately, I can't reproduce it as I don't have access to such a GPU card. the code works however on most older and newer cards, so I would expect this to be a driver version and CUDA toolkit issue.

to double check the compute capability of your card, could you compile and run the little helper tool in the utils/GPU_tools/ folder on your system:

cd ~/<specfem-directory>/utils/GPU_tools/
nvcc --gpu-architecture=sm_60 -o check_cuda_device check_cuda_device.cu
./check_cuda_device

the tool will provide an info output with the compute capability listed.

in the past CIG-seismo forum somebody was able to run the code on a Quadro P6000, I think with a CUDA 9.1 version. you could try to downgrade CUDA driver & runtime version to see if this solves the issue.

trapprb8 commented 9 months ago

Hi dear,

here is the output of the helper tool:

found number of CUDA devices = 1

GPU device id: 0

Device Name = Quadro P4000

memory:
totalGlobalMem (in MB, dividing by powers of 1024): 8116.562500
totalGlobalMem (in GB, dividing by powers of 1024): 7.926331

totalGlobalMem (in MB, dividing by powers of 1000): 8510.833008
totalGlobalMem (in GB, dividing by powers of 1000): 8.510833

sharedMemPerBlock (in bytes): 49152

blocks:
Maximum number of registers per block: 65536
Maximum number of threads per block: 1024
Maximum size of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535

features:
Compute capability of the device = 6.1
multiProcessorCount: 14
canMapHostMemory: TRUE
deviceOverlap: TRUE

0: GPU memory usage (dividing by powers of 1024): used = 319.625000 MB, free = 7796.937500 MB, total = 8116.562500 MB

0: GPU memory usage (dividing by powers of 1000): used = 335.151104 MB, free = 8175.681536 MB, total = 8510.832640 MB

number of total devices: 1

Ok.. Maybe I will try to downgrade the Cuda Toolkit then!

Zzhe0315-RA commented 1 week ago

Tricky...Is this problem solved?

Zzhe0315-RA commented 1 week ago

I have the same problem as you and tried the same approach to deal with it, but I failed.

trapprb8 commented 1 week ago

Hey:) No, unfortunately it was not solved..