Deconvolution as a preprocessing option

teamtomo / membrain-seg

membrane segmentation in 3D for cryo-ET

Other

47 stars 12 forks source link

Deconvolution as a preprocessing option #32

Closed rdrighetto closed 10 months ago

rdrighetto commented 10 months ago

Hi,

I implemented the deconvolution filter from Warp (a.k.a. "dimifilter" or "tom_deconv") as an option in tomo_preprocessing.

The implementation is based off https://github.com/C-CINA/focustools. While that implementation used numexpr to accelerate the large array calculations, here I switched to pure numpy to avoid adding one more dependency to membrain_seg. However, I feel the performance could certainly be improved by incorporating something like numexpr, numba, or maybe even torch (on GPU)? Suggestions are very welcome. It takes ~4 minutes on my AMD EPYC CPU to deconvolve a 928x928x464 tomogram, but a tomogram that is only slightly bigger (1024x1024x512) already takes ~8 minutes.

The code passes the pre-commit tests, but if any further changes are required please let me know.

LorenzLamm commented 10 months ago

Thanks for the PR, Ricardo! :)

The functionalities look good to me and the filtering performance is really nice. However, as mentioned in the comments also, I'm wondering whether it may be more elegant to use the focustools package as dependency and implement the filter based on this? This may be easier maintenance-wise, and I also assume that the numexpr dependency is not so bad, especially if it improves the efficiency.

Let me know what you think

rdrighetto commented 10 months ago

Regarding focustools, I don't think including it as a dependency is a good idea (please see detailed comment above).

As for numexpr, that yes is a mature package that I consider to be a low-risk dependency. Let me just do a few more tests with it to see if the performance gains are really significant here.

rdrighetto commented 10 months ago

Unfortunately, numexpr did not give meaningful performance gains over pure numpy. I might not be using it the right way, or tweaking its options is needed to obtain good speedups. This is probably overkill, so will stick to pure numpy for the time being.

LorenzLamm commented 10 months ago

Looks great :) Maybe, in a follow-up, we could rename some variables (like "S") to more meaningful names to improve readability for the code. But for now, I think we can merge