Problems with the new GPU double precision

bergolho commented 3 years ago

Hello Sachetto,

When I updated my repository to the latest version with GPU double precision, I was not able to compile the Bondarenko model. I started receiving the following error:

~/Github/MonoAlg3D_C/src/models_library/bondarenko/bondarenko_2004_GPU.cu(266): error: calling a host function("std::pow<double, float> ") from a device function("RHS_gpu") is not allowed

~/Github/MonoAlg3D_C/src/models_library/bondarenko/bondarenko_2004_GPU.cu(266): error: identifier "std::pow<double, float> " is undefined in device code

To work around this issue, I edited the 'build.sh' file and commented out the CFLAGS for double precision. After this change, I was able to build everything. However, when I ran the "purkinje_with_fibrosis.ini" example I received several warnings with the message:

Solving EDO 1 times before solving PDE
Starting simulation
t = 0.00000, Iterations = 6, Error Norm = 2.642679e-17, Number of Tissue Cells:132322, Tissue CG Iterations time: 6207 us
            , Iterations = 12, Error Norm = 4.517128e-17, Number of Purkinje Cells:582, Purkinje CG Iterations time: 642 us, Total Iteration time: 485641 us
Accepting solution with error > 1.021493 
Accepting solution with error > 1.059086 
Accepting solution with error > 1.100347 
Accepting solution with error > 1.142385 
Accepting solution with error > 1.188148 
Accepting solution with error > 1.237055

Then, after some iterations, the solver returned NaN for the Purkinje solution:

t = 130.00000, Iterations = 10, Error Norm = 1.558992e-17, Number of Tissue Cells:132322, Tissue CG Iterations time: 7125 us
            , Iterations = 12, Error Norm = 6.437341e-17, Number of Purkinje Cells:582, Purkinje CG Iterations time: 717 us, Total Iteration time: 389427 us
t = 132.00000, Iterations = 10, Error Norm = 1.469444e-17, Number of Tissue Cells:132322, Tissue CG Iterations time: 6988 us
            , Iterations = 0, Error Norm = nan, Number of Purkinje Cells:582, Purkinje CG Iterations time: 97 us, Total Iteration time: 385947 us
t = 134.00000, Iterations = 10, Error Norm = 1.300754e-17, Number of Tissue Cells:132322, Tissue CG Iterations time: 7097 us
            , Iterations = 0, Error Norm = nan, Number of Purkinje Cells:582, Purkinje CG Iterations time: 96 us, Total Iteration time: 394784 us
t = 136.00000, Iterations = 10, Error Norm = 1.414033e-17, Number of Tissue Cells:132322, Tissue CG Iterations time: 7262 us
            , Iterations = 0, Error Norm = nan, Number of Purkinje Cells:582, Purkinje CG Iterations time: 91 us, Total Iteration time: 384568 us
t = 138.00000, Iterations = 10, Error Norm = 1.212957e-17, Number of Tissue Cells:132322, Tissue CG Iterations time: 7005 us
            , Iterations = 0, Error Norm = nan, Number of Purkinje Cells:582, Purkinje CG Iterations time: 89 us, Total Iteration time: 379773 us

I don't know if this is an error in my version of CUDA or if I need to add something new to the Purkinje cellular models, because I see that there are some changes in the structure of the cellular models from now on. Can you help me with this problem?

rsachetto commented 3 years ago

Hi Lucas. This is a problem with both your CUDA version and your GPU. Older GPUs don't support double-precision calculations. I changed the code to compile both the CPU and GPU versions of the models with the same precision, and Bondarenko fails with single precision. Please send me your CUDA version so I can change the code to allow different precisions with old versions.
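
For readers following the thread, the idea of compiling every model with one shared precision can be sketched roughly as below. This is only an illustration under stated assumptions, not the repository's actual header: the flag name CELL_MODEL_REAL_DOUBLE is taken from the build log quoted later in this thread, while `cell_real`, `CELL_POW` and `hill_term` are hypothetical names.

```cpp
#include <math.h>

// Assumption: a single compile-time flag selects the floating-point type
// used by every cell model. Build with -DCELL_MODEL_REAL_DOUBLE for double.
#ifdef CELL_MODEL_REAL_DOUBLE
typedef double cell_real;                 // hypothetical alias
#define CELL_POW(x, y) pow((x), (y))      // double-precision power
#else
typedef float cell_real;                  // hypothetical alias
#define CELL_POW(x, y) powf((x), (y))     // single-precision power
#endif

// Hypothetical model term: every operand has the selected type, so no call
// ever mixes float and double arguments.
static cell_real hill_term(cell_real c, cell_real k) {
    const cell_real n = (cell_real)3.0;
    return CELL_POW(c, n) / (CELL_POW(c, n) + CELL_POW(k, n));
}
```

With this layout, switching precision is a matter of one compiler flag, and a model that is numerically unstable in single precision can be rebuilt in double without touching its equations.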

bergolho commented 3 years ago

My current CUDA version is this one:

berg@localhost:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

rsachetto commented 3 years ago

@bergolho, could you please try this new version?

bergolho commented 3 years ago

@rsachetto I was able to build all the models, but now compilation stops with an error in the 'ten_tusscher' model:

[INFO] COMPILING OBJECT /home/berg/Github/MonoAlg3D_C/build_release/libten_tusscher_2006/objs/ten_tusscher_2006_RS_CPU.c.o
/usr/local/cuda/bin/gcc -DCELL_MODEL_REAL_DOUBLE -O3 -fopenmp -std=gnu99 -fno-strict-aliasing  -Wall -Wno-stringop-truncation -Wno-unused-function -Wno-char-subscripts -Wno-unused-result -Wno-switch -Werror=implicit-function-declaration -DCOMPILE_CUDA -I/usr/local/cuda/include -DCOMPILE_GUI  -fPIC -c /home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_CPU.c -o /home/berg/Github/MonoAlg3D_C/build_release/libten_tusscher_2006/objs/ten_tusscher_2006_RS_CPU.c.o
[INFO] COMPILING OBJECT /home/berg/Github/MonoAlg3D_C/build_release/libten_tusscher_2006/objs/ten_tusscher_2006_RS_GPU.cu.o
/opt/cuda/bin/nvcc /home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_GPU.cu -c  -o /home/berg/Github/MonoAlg3D_C/build_release/libten_tusscher_2006/objs/ten_tusscher_2006_RS_GPU.cu.o -ccbin /usr/local/cuda/bin/gcc -m64 -Xcompiler \"-DCELL_MODEL_REAL_DOUBLE\",\"-O3\",\"-fopenmp\",\"-fno-strict-aliasing\",\"-Wall\",\"-Wno-stringop-truncation\",\"-Wno-unused-function\",\"-Wno-char-subscripts\",\"-Wno-unused-result\",\"-Wno-switch\",\"-DCOMPILE_CUDA\",\"-I/usr/local/cuda/include\",\"-DCOMPILE_GUI\",\"-fPIC\", -DNVCC -I/usr/local/cuda/include
/home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_GPU.cu(237): error: calling a __host__ function("std::pow<double, float> ") from a __device__ function("RHS_gpu") is not allowed

/home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_GPU.cu(237): error: identifier "std::pow<double, float> " is undefined in device code

/home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_GPU.cu(240): error: calling a __host__ function("std::pow<double, float> ") from a __device__ function("RHS_gpu") is not allowed

/home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_GPU.cu(240): error: identifier "std::pow<double, float> " is undefined in device code

/home/berg/Github/MonoAlg3D_C/src/models_library/ten_tusscher/ten_tusscher_2006_RS_GPU.cu(250): error: calling a __host__ function("std::pow<double, float> ") from a __device__ function("RHS_gpu") is not allowed

rsachetto commented 3 years ago

Some old models were not using the macros to configure the precision. Please try the latest commit.
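
The `std::pow<double, float>` in the errors quoted above suggests a `pow` call whose two arguments have different floating-point types. One plausible reading, offered as an assumption rather than a description of the actual commit: in C++ such a mixed call resolves to a promoting standard-library overload, which is host-only, whereas CUDA's math library provides device versions such as `pow(double, double)` and `powf(float, float)`. The host-side sketch below shows the type resolution and one way to avoid the mix with a single precision alias; `model_real`, `gate_exponent` and `gate_power` are hypothetical names.

```cpp
#include <cmath>
#include <type_traits>

// A double base with a float exponent selects the promoting std::pow
// overload (reported by nvcc as std::pow<double, float>); it returns double.
static_assert(std::is_same<decltype(std::pow(1.0, 2.0f)), double>::value,
              "mixed-type std::pow promotes to double");

// Assumption: the model is built in double precision.
typedef double model_real;                // hypothetical alias

// Hypothetical constant, declared through the alias so that it always
// follows the precision chosen for the model.
static const model_real gate_exponent = (model_real)3.0;

// Both arguments now share one type, so the call matches the plain
// pow(double, double) overload instead of the promoting template.
static model_real gate_power(model_real x) {
    return std::pow(x, gate_exponent);
}

static_assert(std::is_same<decltype(std::pow((model_real)1.0, gate_exponent)),
                           model_real>::value,
              "same-type arguments keep the model precision");
```

The design point is that literals and constants written directly as `float` or `double` inside a model are what reintroduce mixed-type calls; routing them through the alias keeps every call consistent.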

bergolho commented 3 years ago

@rsachetto All models now compile successfully, and the 'purkinje_with_fibrosis.ini' example runs without errors.

rsachetto / MonoAlg3D_C

Problems with the new GPU double precision #31