Closed benoitberanger closed 2 months ago
For `DenoiseImage`, it's a difference in defaults: the search radius is 2 in the CLI program and 3 in ANTsPy. If I call `ants.denoise_image(img, r=2, v=1)`, the difference in performance is similar to that for N4.
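To see how much of the gap comes from the search radius alone, a small timing harness could look like the sketch below. It assumes ANTsPy is installed; `time_call` and `compare_search_radii` are hypothetical helper names, and the image path in the usage comment is a placeholder, not a file from this thread.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_search_radii(img, radii=(2, 3)):
    """Time ants.denoise_image once per search radius r (needs ANTsPy)."""
    import ants  # imported lazily so time_call stays usable without ANTsPy

    timings = {}
    for r in radii:
        _, timings[r] = time_call(ants.denoise_image, img, r=r)
    return timings


# Usage (hypothetical path):
# import ants
# img = ants.image_read("t2.nii.gz")
# print(compare_search_radii(img))
```

Passing `r` explicitly keeps the comparison independent of whichever default a given ANTsPy version ships with.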
@ntustison @stnava shall we harmonize the defaults, and which one should we adopt? I'll go with the faster one (2) unless you have a preference to make ANTsPy the standard.
Yeah, I'm all for harmonizing the defaults and I'd go with what's in the original ANTs DenoiseImage.
For `DenoiseImage` in particular, I would hope you could normalize with the OG implementation `minc_anlm` from minc-toolkit-v2
Is there a usage with defaults you could paste here?
DenoiseImage was ported from Jose's original Matlab code. Minc code was not referenced at all.
Well, `minc_anlm` was written by/with Jose, given the citation has L Collins of the MNI as one of the senior authors 👍🏻
$ minc_anlm
This program implements adaptative non-local denoising algorithm published in
Jose V. Manjon, Pierrick Coupe, Luis Marti-Bonmati, D. Louis Collins, Montserrat Robles "Adaptive non-local means denoising of MR images with spatially varying noise levels" Journal of Magnetic Resonance Imaging Volume 31, Issue 1, pages 192–203, January 2010
DOI: 10.1002/jmri.22003
I profiled `n4_bias_correction` with line_profiler / kernprof: the library function call accounts for 99.4% of the execution time. So there's not a lot of work happening at the wrapper level.
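For anyone wanting to repeat this kind of check without installing line_profiler, a rough equivalent with the standard library's `cProfile` might look like this. Note that this is function-level profiling rather than the line-level profiling used above, `profile_call` is a hypothetical helper name, and the image path in the usage comment is a placeholder.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10, **kwargs):
    """Run fn under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the wrapper and the call it delegates to
    # appear near the top of the report.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Usage (assumes ANTsPy is installed; hypothetical path):
# import ants
# img = ants.image_read("t2.nii.gz")
# _, report = profile_call(ants.n4_bias_correction, img)
# print(report)
```

If nearly all of the cumulative time sits in the compiled library call, the Python wrapper itself is not the bottleneck.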
Yes, I realize that. But you were referring to a specific implementation in the context of defaults and that's why I clarified that it was Jose's original Matlab code.
Just a bit more historical context: I happened to be invited by an MNI-adjacent friend for a get-together during MICCAI 2013 in Nagoya, Japan. Fortunately, I sat right across the table from Jose and, after discussing common interests (such as our enjoyment of Luis Miguel), he realized I was "one of the ANTs guys" and he asked me if I would like to put his denoising algorithm in ANTs. I said sure and he pointed me to his Matlab code, which I eventually ported to ITK-style. After the FreeSurfer folk began using the implementation in their pipeline a couple of years ago, I asked Jose about the possibility of making it an ITK module and he was all for it.
I think there might also be optimization differences in the ITK / ANTs compilation. It seems `CMAKE_BUILD_TYPE=Release` is only set on Mac? Also, the ITK compilation has `-O2` where an ANTs superbuild has `-O3`.
> For `DenoiseImage`, it's a difference in defaults: the search radius is 2 in the CLI program and 3 in ANTsPy. If I call `ants.denoise_image(img, r=2, v=1)`, the difference in performance is similar to that for N4.
Correct, I did not notice the difference in the `r` default parameter.
Here is what I have now :
DenoiseImage | r=2 | r=3 |
---|---|---|
CLI | 76s | 179s |
Python | 248s | 602s |
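The table above separates two effects: the Python/CLI gap at a fixed radius, and the cost of going from r=2 to r=3 within each tool. A short sketch of that arithmetic, using only the numbers reported in the table (the helper names are illustrative):

```python
# Timings in seconds, copied from the table above.
timings = {
    "CLI": {2: 76, 3: 179},
    "Python": {2: 248, 3: 602},
}


def python_over_cli(timings, r):
    """How many times slower the Python wrapper is than the CLI at radius r."""
    return timings["Python"][r] / timings["CLI"][r]


def radius_cost(timings, tool):
    """How many times slower r=3 is than r=2 for the given tool."""
    return timings[tool][3] / timings[tool][2]


for r in (2, 3):
    print(f"r={r}: Python/CLI = {python_over_cli(timings, r):.2f}x")
for tool in timings:
    print(f"{tool}: r=3 / r=2 = {radius_cost(timings, tool):.2f}x")
```

At equal radius the wrapper is a bit over 3x slower than the CLI, and r=3 costs roughly 2.4x more than r=2 in both tools, so the original x8 figure is the product of the default mismatch and a separate wrapper-side slowdown.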
Thanks for testing, @benoitberanger
Would you mind trying out #705? If you have the GitHub CLI, you can do `gh pr checkout 705`
It appears to close the gap on my Mac.
Using 550400a043773ea9c97bb9b2e73caf935d0f3f98 in #705, here are the new computation times:
DenoiseImage | r=2 |
---|---|
CLI | 76s |
Python | 73s |
So it went from 248s to 73s !
Wow! Thanks for reporting this
Describe the bug
There is a massive difference in computation time between the terminal CLI and the Python wrappers for `N4BiasFieldCorrection` and `DenoiseImage`.
From the terminal: `N4BiasFieldCorrection`: 22s, `DenoiseImage`: 76s
From Python: `N4BiasFieldCorrection`: 33s, `DenoiseImage`: 602s
To reproduce
I use a classic 3D T2 at 0.8 mm isotropic resolution:
From the terminal :
From Python :
Expected behavior
Since I built both ANTs and ANTsPy from source, I would expect roughly the same computation time. x1.5 for `N4BiasFieldCorrection` is unexpected but ok; however, x8 for `DenoiseImage` is very weird.
ANTsPy installation (please complete the following information):
OS: [ Linux 4.15.0-20-generic x86_64 // Linux Mint 19 ]
python -m pip install . ]
Additional context
When running both tests, I can see in `htop` that all 16 CPUs are running at 100%, with both the terminal CLI and the Python wrappers. So it's not an obvious multi-threading problem.