CERN / TIGRE

TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox
BSD 3-Clause "New" or "Revised" License

Ax:Siddon_projection invalid device symbol #526

Closed: xxcxn closed this issue 3 months ago

xxcxn commented 3 months ago

Expected Behavior

Actual Behavior

Hello, after setting up the TIGRE environment, I get an error when I run d03_generateData.py:

../Common/CUDA/TIGRE_common.cpp (7): kernel fail
../Common/CUDA/TIGRE_common.cpp (14): Ax:Siddon_projection invalid device symbol
Process finished with exit code 1

The driver is 453.94-data-center-tesla-desktop-win10-64bit-international.

Code to reproduce the problem (If applicable)

Specifications

AnderBiguri commented 3 months ago

Hi @xxcxn, this seems to be a CUDA error. Can you tell me which GPU you have (and its compute capability), which CUDA version, which CUDA driver, and which operating system? Some of this info seems to be in the original post, but not all.

This error most commonly happens because the code has not been compiled for your compute capability. It can be either because TIGRE has a bug (that we can quickly fix) or because your compute capability is not supported.
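The details requested above (operating system, GPU, driver, CUDA toolkit) can be gathered in one go. The following is a minimal sketch, not part of TIGRE: the helper names `run_tool` and `environment_report` are hypothetical, and it assumes `nvidia-smi` and `nvcc` are on the PATH (it reports them as missing otherwise). It does not print the compute capability, which still has to be looked up for the GPU model.

```python
import platform
import shutil
import subprocess


def run_tool(command):
    """Run a command and return its output, or a note if it is unavailable."""
    if shutil.which(command[0]) is None:
        return f"{command[0]} not found on PATH"
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError) as exc:
        return f"{command[0]} failed: {exc}"
    return result.stdout.strip() or result.stderr.strip()


def environment_report():
    """Collect OS, GPU/driver, and CUDA toolkit details for a bug report."""
    return {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
        # name and driver_version are standard nvidia-smi query fields
        "gpu_and_driver": run_tool(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]
        ),
        "cuda_toolkit": run_tool(["nvcc", "--version"]),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting this output into the issue, together with whether the MATLAB or the Python version of TIGRE is being used, answers most of the questions up front.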

xxcxn commented 3 months ago

The operating system is Windows 10, the GPU is Tesla K20Xm, CUDA Version: 11.0, and the driver is NVIDIA 453.94.

AnderBiguri commented 3 months ago

Hmm, that seems to be compute capability 3.5 (please double check, I can't find documentation for Tesla K20Xm, only Tesla K20X) and therefore should be supported. Have you compiled the code on the same machine you are running it on?

xxcxn commented 3 months ago

Yes, the compute capability of Tesla K20Xm is 3.5, and I compiled the code on the same machine where I am running it.

AnderBiguri commented 3 months ago

I am confused then! Which .xml are you using to compile?

Also, can you print the cuda_version that the compilation script detects? It is set here: https://github.com/CERN/TIGRE/blob/f6b03dc906006515da82f4c60279ec505ceaa97a/MATLAB/Compile.m#L71

xxcxn commented 3 months ago

I'm compiling my code in the PyCharm IDE. Running Compile.m fails with an error related to mex:

The system cannot find the file mex_CUDA_win64.xml. Please ensure that you are in the correct current directory and check the spelling of 'mex_CUDA_win64.xml'.
Error in Compile (line 47)
mex -setup:'mex_CUDA_win64.xml'

In addition, I first set up the MATLAB runtime environment for TIGRE on the same computer and then configured Python. Will this affect the Python setup?

AnderBiguri commented 3 months ago

@xxcxn Python and MATLAB are completely different. Here I am assuming you are only working with the MATLAB one, because that is what you suggested in the original post. The two compilations are completely different, so it is better not to mix them up! Your error seems to be that either you did not rename the xml file as per the installation instructions, or you are not running Compile.m from the folder where it lives.

But if this is the case, I am confused about how you got the original error, as that requires at least some success in compilation...

xxcxn commented 3 months ago

I've found that my Windows system is Server 2016. Will this have an impact?

xxcxn commented 3 months ago

The result is as follows when I run example.py:

0: Tesla K20Xm
1: Tesla K20Xm
{'name': 'Tesla K20Xm', 'devices': [0, 1]}
../Common/CUDA/TIGRE_common.cpp (7): kernel fail
../Common/CUDA/TIGRE_common.cpp (14): Ax:Siddon_projection invalid device symbol

AnderBiguri commented 3 months ago

@xxcxn I cannot help you with MATLAB and Python at the same time if you do not specify which one you are using!!! Everything I have said so far in this issue is only valid for the MATLAB compilation, as that is what you said you have in the original post.

Have you been using Python all along?!

xxcxn commented 3 months ago

I'm really sorry, I'm trying to set up the Python environment.

AnderBiguri commented 3 months ago

@xxcxn I have updated setup.py to explicitly add cc35. Can you try compiling again after updating to the latest version of the repo?
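To illustrate why a missing architecture matters, here is a minimal sketch of how a compute capability maps to an nvcc `-gencode` flag and how a device can end up uncovered by a build. The function names and the build list are hypothetical and are not taken from TIGRE's setup.py. The exact-match check is also a simplification: it ignores PTX forward compatibility, which can let code built for an older architecture run on a newer GPU, but never the other way round.

```python
def gencode_flag(compute_capability):
    """Turn a capability such as '3.5' into an nvcc -gencode flag."""
    major, minor = compute_capability.split(".")
    arch = f"{major}{minor}"
    return f"-gencode=arch=compute_{arch},code=sm_{arch}"


def missing_capabilities(device_capabilities, compiled_capabilities):
    """Return device capabilities with no exactly matching compiled architecture."""
    compiled = set(compiled_capabilities)
    return [cc for cc in device_capabilities if cc not in compiled]


# Hypothetical build list that omits 3.5, mirroring the situation in this thread.
compiled = ["5.0", "6.0", "7.0"]
devices = ["3.5", "3.5"]  # two Tesla K20Xm cards, both compute capability 3.5

print(missing_capabilities(devices, compiled))  # ['3.5', '3.5']
print(gencode_flag("3.5"))  # -gencode=arch=compute_35,code=sm_35
```

A kernel launched on a GPU with no matching compiled code is the kind of situation in which CUDA reports errors such as "invalid device symbol".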

xxcxn commented 3 months ago

The result of the execution is:

Traceback (most recent call last):
  File "D:\TIGRE5\TIGRE\Python\setup.py", line 371, in
    sdist=sys.argv[1] == 'sdist',
IndexError: list index out of range

AnderBiguri commented 3 months ago

@xxcxn I cannot reproduce this.

Did you run pip install . (correct) or just call setup.py directly (incorrect)?
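The traceback above shows setup.py reading `sys.argv[1]`. If the script is started with no command (for example from an IDE run button), `sys.argv` holds only the script name and that index does not exist. The sketch below reproduces the failure and shows a guarded check; `is_sdist_build` is a hypothetical helper for illustration, not code from TIGRE.

```python
def is_sdist_build(argv):
    """Return True only when the first command-line argument is 'sdist'."""
    return len(argv) > 1 and argv[1] == "sdist"


# Running the script with no command gives an argv like ['setup.py'],
# so an unguarded argv[1] raises IndexError.
argv_without_command = ["setup.py"]
try:
    argv_without_command[1]
except IndexError as exc:
    print(f"unguarded access fails: {exc}")

print(is_sdist_build(["setup.py"]))           # False
print(is_sdist_build(["setup.py", "sdist"]))  # True
```

Installing with pip install . avoids the problem, because pip invokes the build with the arguments it expects.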

xxcxn commented 3 months ago

It's very strange: everything went back to normal after I redownloaded TIGRE and ran it. I'm very sorry to bother you, and thank you very much for your help!

AnderBiguri commented 3 months ago

No worries! Indeed, redownloading it (or running git pull) was required to get the updates I added.

So it works now? Let me know if you have more issues!

xxcxn commented 3 months ago

Everything is fine now, thank you very much!

AnderBiguri commented 3 months ago

Good to hear! Thanks for reporting this, issues like this make TIGRE better :) Let me know if you have more issues.