CERN / TIGRE

TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox
BSD 3-Clause "New" or "Revised" License

optimisation of fdk python setup #331

Closed gfardell closed 2 years ago

gfardell commented 2 years ago

Changes to the Python FDK weights and filtering, as these steps are far slower than the backprojector.

Weights changes: The weights were being calculated once per projection even though they have no dependency on the projection, and there was an unnecessary data copy at the end.
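A minimal sketch of the idea behind the weights change, assuming a flat detector with no offsets and the usual FDK cosine weight DSD / sqrt(DSD^2 + u^2 + v^2). The function name `fdk_weights`, its parameters, and the tiny geometry are hypothetical and are not taken from the TIGRE source. The weight map is built once and broadcast over the angle axis in place, so there is no per-projection recalculation and no extra copy.

```python
import numpy as np

def fdk_weights(n_v, n_u, d_v, d_u, dsd):
    """Cosine weights for a centred flat detector (hypothetical helper)."""
    # Pixel-centre coordinates, measured from the middle of the detector.
    u = (np.arange(n_u, dtype=np.float32) - (n_u - 1) / 2) * d_u
    v = (np.arange(n_v, dtype=np.float32) - (n_v - 1) / 2) * d_v
    uu, vv = np.meshgrid(u, v)  # both have shape (n_v, n_u)
    w = dsd / np.sqrt(dsd**2 + uu**2 + vv**2)
    return w.astype(np.float32)

# Hypothetical toy data laid out as (angles, v, u).
proj = np.ones((8, 4, 6), dtype=np.float32)

# Compute the weight map once; it does not depend on the projection angle.
w = fdk_weights(n_v=4, n_u=6, d_v=1.0, d_u=1.0, dsd=100.0)

# Broadcast over the angle axis and update in place (no trailing copy).
proj *= w
```

With real geometries the detector offsets and any per-angle source distance would have to be accounted for, which this sketch deliberately leaves out.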

Filtering changes: Use scipy.fftpack instead of np.fft (you already have a dependency on scipy, so hopefully this isn't an issue). Ensure arrays stay float32 through each step. Pre-calculate the scaling factor; originally the scaling was applied to the array sequentially (5 loops over the array, I believe). Pack 2 images into one single-precision complex array (complex64 in NumPy) to perform the FFT on 2 projections at a time, rather than having the imaginary part set to 0s.
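The packing trick can be sketched as follows. Because the ramp filter is real and even in frequency, filtering `p0 + 1j * p1` yields the filtered `p0` in the real part and the filtered `p1` in the imaginary part. This is an illustration under stated assumptions, not the code from this PR: `ramp_filter`, `filter_pair`, the padding length, and the scale value 0.25 are all hypothetical, and the import falls back to `numpy.fft` if scipy is not installed.

```python
import numpy as np

try:
    from scipy.fftpack import fft, ifft  # keeps complex64 input as complex64
except ImportError:  # fallback so the sketch still runs without scipy
    from numpy.fft import fft, ifft

def ramp_filter(n_pad):
    """Plain |f| ramp in FFT ordering (placeholder, not TIGRE's filter)."""
    return np.abs(np.fft.fftfreq(n_pad)).astype(np.float32)

def filter_pair(p0, p1, scaled_filt):
    """Filter two real projections along the last axis with one complex FFT."""
    n_u = p0.shape[-1]
    n_pad = scaled_filt.shape[0]
    # Pack p0 into the real part and p1 into the imaginary part, zero-padded.
    packed = np.zeros(p0.shape[:-1] + (n_pad,), dtype=np.complex64)
    packed.real[..., :n_u] = p0
    packed.imag[..., :n_u] = p1
    spec = fft(packed, axis=-1)
    spec *= scaled_filt  # filter and scaling applied in a single pass
    out = ifft(spec, axis=-1)[..., :n_u]
    return out.real.astype(np.float32), out.imag.astype(np.float32)

rng = np.random.default_rng(0)
p0 = rng.random((4, 6)).astype(np.float32)
p1 = rng.random((4, 6)).astype(np.float32)

# Fold all constant factors into the filter once (0.25 is a made-up value).
scaled_filt = ramp_filter(16) * np.float32(0.25)

f0, f1 = filter_pair(p0, p1, scaled_filt)
```

The trick relies on the filter having a real impulse response; a filter that is not symmetric in frequency would mix the two projections.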

The original times (called via CIL on a GTX1080):
TIGRE FDK AcquistionData shape (2048, 1024, 1024) ImageData shape (1024, 1024, 1024) completed in 213.4332273s

The times with these changes:
TIGRE FDK AcquistionData shape (2048, 1024, 1024) ImageData shape (1024, 1024, 1024) completed in 90.9856166s

And for comparison:
ASTRA FDK AcquistionData shape (1024, 2048, 1024) ImageData shape (1024, 1024, 1024) completed in 212.62975440000002s

Filtering still accounts for about 50 seconds of that total, so it could be improved further by multithreading.
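One way the multithreading could look in pure Python is to split the projections into chunks along the angle axis and filter each chunk in a worker thread. This is only a sketch: `filter_chunk`, `filter_threaded`, and the worker count are hypothetical names and values, and any speed-up depends on whether the FFT backend releases the GIL, which is not verified here. `scipy.fft` functions also accept a `workers` argument that may be a simpler route.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def filter_chunk(chunk, filt):
    """Filter every row of `chunk` along the last axis via a zero-padded FFT."""
    n_u = chunk.shape[-1]
    spec = np.fft.fft(chunk, n=filt.shape[0], axis=-1)  # n= pads with zeros
    spec *= filt
    return np.fft.ifft(spec, axis=-1).real[..., :n_u].astype(np.float32)

def filter_threaded(proj, filt, n_workers=4):
    """Filter a (angles, v, u) stack, one contiguous angle range per task."""
    out = np.empty_like(proj)
    index_chunks = np.array_split(np.arange(proj.shape[0]), n_workers)

    def work(idx):
        if idx.size:  # array_split may return empty chunks
            lo, hi = idx[0], idx[-1] + 1
            out[lo:hi] = filter_chunk(proj[lo:hi], filt)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(work, index_chunks))  # list() re-raises worker errors
    return out

rng = np.random.default_rng(1)
proj = rng.random((10, 4, 6)).astype(np.float32)
filt = np.abs(np.fft.fftfreq(16)).astype(np.float32)  # placeholder ramp

filtered = filter_threaded(proj, filt, n_workers=3)
```

Each task writes to a disjoint slice of `out`, so no locking is needed.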

And a pretty picture to show it still works :D

[image]

AnderBiguri commented 2 years ago

Great! thanks a lot!

I always had in mind that at some point I was going to write CUDA filtering functions, so I have always ignored them a bit, but this is great, thanks!

There are some conflicts with the current master (I assume yours is based on TIGRE 2.0 or something from January). Mind trying to solve those? Happy to help if you have any doubts.

gfardell commented 2 years ago

> There are some conflicts with the current master (I assume yours is based on TIGRE 2.0 or something from January). Mind trying to solve those? Happy to help if you have any doubts.

Yep, just doing it now. Of course I was on 2.1 :D

I had a go with IPP and OpenMP and got the filtering time down to under 2 s for a dataset of that size. So I think we'll likely implement that in CIL and just call your backprojectors. But I thought you might as well benefit from the simple changes anyway!

gfardell commented 2 years ago

Maybe check that it runs for you first, in case I screwed up the merge. I'll build it now to be sure, but I needed it on GitHub for the conda build.

AnderBiguri commented 2 years ago

Absolutely, let me double check if it runs well on a random dataset that I have around before merging! Thanks!

AnderBiguri commented 2 years ago

Thanks @gfardell !! :)