TomographicImaging / CIL

A versatile python framework for tomographic imaging
https://tomographicimaging.github.io/CIL/
Apache License 2.0

cuda error of iterative method for rather large data #1656

Open EB79 opened 10 months ago

EB79 commented 10 months ago

Hi,

I am writing to inquire about a problem we encountered while using the FISTA iterative algorithm for reconstruction. Our computational facilities, which consist of a Dell Tower Precision 7960 with 256GB RAM and a 48GB Nvidia GPU, are quite powerful. However, the large size of our projection data has led to a CUDA error.

I am wondering if there is a way to manage our data and still achieve reconstruction without resorting to any data manipulation techniques such as down-sampling or resizing. We are interested to know if your block formalism can potentially solve our problem.

To provide you with more information, I have included a link below where you can access our projections and acquisition geometry.

https://drive.google.com/drive/folders/1DJZ1N-vYJ2Slah9KAlPMrsYLLQo7B4a4?usp=sharing

Thank you for your attention to this matter.

Sincerely, Erfan

paskino commented 9 months ago

Hi @EB79 what projection operator are you using? TIGRE or ASTRA?

EB79 commented 9 months ago

I am using ASTRA

gfardell commented 9 months ago

Although it looks like a CUDA error, it can be raised by running out of RAM rather than VRAM, as CUDA may be allocating or pinning memory in RAM. The back projectors (both TIGRE and ASTRA) should be able to handle datasets larger than VRAM, so it's not normally an issue. However, the txt file shows your dataset size as 580 projections of 1290 x 995. That's not a very big dataset and I'm surprised you'd have issues with either.

The acquisition data takes ~3GB and the image data ~5GB. Some iterative algorithms need to store many times the data size during computation, so it can increase significantly, but even so I can't imagine you're running out of RAM if this is the only thing running on the server.
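For reference, the figures above can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it assumes the data is stored as float32 and that the reconstruction volume is 995 x 995 x 1290 voxels, neither of which is stated in the thread. The helper `size_gb` is made up for this illustration.

```python
# Rough memory estimate for the dataset described in the txt file.
# Assumptions (not confirmed in the thread): float32 storage and a
# 995 x 995 x 1290 reconstruction volume.
BYTES_PER_FLOAT32 = 4


def size_gb(*shape, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the size in GB (10^9 bytes) of a dense array with this shape."""
    n_values = 1
    for dim in shape:
        n_values *= dim
    return n_values * bytes_per_value / 1e9


acquisition_gb = size_gb(580, 1290, 995)  # projections x rows x columns
image_gb = size_gb(995, 995, 1290)        # assumed volume shape

print(f"acquisition data: {acquisition_gb:.2f} GB")  # 2.98 GB
print(f"image data:       {image_gb:.2f} GB")        # 5.11 GB
```

Even if an algorithm keeps ten or so copies of the image volume in memory, that is on the order of 50GB under these assumptions, well below 256GB of RAM.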

Can you monitor your RAM and GPU usage as you run the script and see if you are really hitting any limits? Otherwise, could you send us some code that recreates the problem? We don't need the data, but the code setting up and running the algorithm could give us some insight.
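One possible way to do the monitoring from Python is sketched below. It assumes the `nvidia-smi` command-line tool is on the PATH and that the third-party `psutil` package is installed; if either is missing, the corresponding value is reported as `None`. The function names are made up for this example and are not part of CIL.

```python
import shutil
import subprocess


def parse_nvidia_smi(csv_text):
    """Parse 'used, total' MiB lines (one per GPU) into a list of int tuples."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        usage.append((used, total))
    return usage


def gpu_memory_mib():
    """Return [(used, total), ...] per GPU, or None if it cannot be queried."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    try:
        return parse_nvidia_smi(result.stdout)
    except ValueError:
        # Some devices report non-numeric values such as "[N/A]".
        return None


def ram_used_gb():
    """Return RAM currently in use in GB, or None if psutil is not installed."""
    try:
        import psutil
    except ImportError:
        return None
    return psutil.virtual_memory().used / 1e9


def snapshot():
    """Collect one reading of host RAM and GPU memory usage."""
    return {"ram_used_gb": ram_used_gb(), "gpu_mib": gpu_memory_mib()}


if __name__ == "__main__":
    print(snapshot())
```

Calling `snapshot()` before the algorithm is set up and again between batches of iterations should show whether RAM or VRAM is the one approaching its limit when the error appears.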

Additionally you could try changing the projection operators to TIGRE and see if that helps - but I can't really see what would be causing the problem.
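As a rough sketch of what switching backends could look like, the helper below picks the `ProjectionOperator` from either CIL plugin. The helper name and the `backend` argument are invented for this example; it assumes both `cil.plugins.astra` and `cil.plugins.tigre` are installed, and that the image and acquisition geometries already exist in your script.

```python
def make_projection_operator(backend, image_geometry, acquisition_geometry):
    """Build a CIL ProjectionOperator from the chosen plugin ('astra' or 'tigre')."""
    # Imports are done lazily so that only the selected plugin has to be installed.
    if backend == "astra":
        from cil.plugins.astra import ProjectionOperator
    elif backend == "tigre":
        from cil.plugins.tigre import ProjectionOperator
    else:
        raise ValueError(f"unknown backend: {backend!r}")
    return ProjectionOperator(image_geometry, acquisition_geometry)
```

With hypothetical variables `ig` and `ag` holding the geometries, `A = make_projection_operator("tigre", ig, ag)` would replace the ASTRA operator passed to the algorithm. The two plugins expect different axis orders, so the acquisition data will probably also need reordering for the chosen backend (for example `data.reorder('tigre')`) before the geometries are taken from it.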

EB79 commented 9 months ago

Thank you for your attention. I've provided the code as a Jupyter Notebook, which can be found in the Google Drive link above.