robbert-harms / MDT

Microstructure Diffusion Toolbox
GNU Lesser General Public License v3.0

Singularity runs, but gives semi-random results #42

Open celstark opened 2 years ago

celstark commented 2 years ago

We've used MDT for a number of years with a lot of success, running on workstations with NVIDIA GPUs. Some time back, I posted about OpenCL processing on Intel not being multi-threaded. That problem was solved in the sense that many threads are now used, but the output is never close to the GPU numbers. Here's a screenshot showing a GPU-derived image on the left and a CPU / Singularity image on the right:

[Image: Screenshot 2022-04-27 153229]

A bit of decoding: the arrow points to a section showing the values under the cursor across multiple runs. Two NVIDIA runs on different machines give identical results of 0.5777... (good). There's also a run on one of these machines with the OpenCL device set to device 1, which was 'CPU - pthread-Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz (Portable Computing Language)'. This comes very, very close (0.578), which I'll attribute to differences in floating-point units. Now, the GPU run took <1 minute and the CPU one took 934 minutes on a 20-core machine, which was a bit insane, but at least the numbers lined up.

Moving on, though, to attempts to bundle this in Singularity. I've used the supplied script, my own script, and many variants on each. In the end, I get images that are either a constant 0.5 everywhere within the mask or images like the one on the right that show something of the brain, but whose values are way, way off -- 0.0256 in the example here. It's almost as if the values are being cast into the wrong format. We run quickly as the threading works well here, but this is clearly quite wrong. I'd love to be able to fix this, but I've thrown at it what I can think of and hit the wall.

celstark commented 2 years ago

If it helps, there are some outputs that are perfectly fine and others that are borked:

In BallStick output:
Same: AIC, AICc, BIC, Ball, LogLikelihood, OffsetGaussian.sigma, ReturnCodes, Stick0*, UsedMask, w_ball.w.std
Different: FS, S0.s0, w_ball.w, w_stick0.w

In NODDI output:
Same: Ball.d, NODDI?C.d, NODDI?C.phi, NODDI?C.theta, NODDI?C.vec0, OffsetGaussian.sigma, UsedMask, w{csf,ec,ic}.w.std
Different: AIC, AICc, BIC, LogLikelihood, NDI, NODDI?C.dperp0, NODDI_?C.kappa, ODI, ReturnCodes, S0.s0, w_csf.w, w_stat
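A same/different split like the one above could be produced automatically rather than by eye. The following is only a minimal sketch: `classify_maps` is a hypothetical helper, the maps are assumed to have already been loaded into flat lists of floats (for example from the NIfTI outputs), and the numbers are toy values rather than real MDT results.

```python
import math


def classify_maps(run_a, run_b, rel_tol=1e-5):
    """Split map names into (same, different) by comparing voxel values.

    run_a and run_b are hypothetical dicts: map name -> flat list of floats.
    Only a relative tolerance is used, because some maps (diffusivities)
    hold very small numbers that an absolute tolerance would hide.
    """
    same, different = [], []
    for name in sorted(run_a.keys() & run_b.keys()):
        a, b = run_a[name], run_b[name]
        equal = len(a) == len(b) and all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=0.0)
            for x, y in zip(a, b)
        )
        (same if equal else different).append(name)
    return same, different


# Toy values only; real maps would come from the two output directories.
gpu = {"Ball.d": [3e-9, 3e-9], "ODI": [0.5777, 0.31]}
cpu = {"Ball.d": [3e-9, 3e-9], "ODI": [0.0256, 0.5]}
same, different = classify_maps(gpu, cpu)
print("Same:", same)
print("Different:", different)
```

Loosening `rel_tol` would let small floating-point differences between devices count as "same", while the large discrepancies described here would still be flagged.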

robbert-harms commented 2 years ago

Hi Celstark,

Thanks for reporting in such detail. From your experiments, I understand that MDT works fine as long as you run it on bare metal, but as soon as you run it within Singularity, results default to 0.5?

I am also confused about why this would happen. Possible reasons I can think of are:

The fact that it returns 0.5 is not a coincidence: this is the default starting value, and if no computations are performed, this is typically what you would get as a result.
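If 0.5 is simply the untouched starting value, a quick check on any output map is to count how many in-mask voxels sit at exactly 0.5. This is a rough sketch under assumptions: `fraction_at_default` is a hypothetical helper, and the map and mask are assumed to be already flattened into plain lists.

```python
def fraction_at_default(values, mask, default=0.5):
    """Fraction of in-mask voxels whose value is exactly `default`."""
    inside = [v for v, m in zip(values, mask) if m]
    if not inside:
        return 0.0
    return sum(1 for v in inside if v == default) / len(inside)


# Toy data: four voxels inside the mask, three still at the start value.
values = [0.5, 0.5, 0.31, 0.5, 0.0]
mask = [1, 1, 1, 1, 0]
print(fraction_at_default(values, mask))
```

A fraction near 1.0 would suggest the optimizer never updated that parameter, whereas a small fraction would point to some other problem.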

I am not very experienced with Singularity, so I am afraid I can't be of much help. Perhaps you could run the shell command "clinfo" within the Singularity container and list the devices found there?

Best,

Robbert

celstark commented 1 year ago

Reviving this old issue as I've returned to hit the problem ;)

Unfortunately, neither of these ideas seems to be it. I've got trivially tweaked Singularity definitions, based on the recipes provided in the GitHub code, that build both Intel- and NVIDIA-based Singularity images. I'm processing the exact same data with the same model in both.

For example:

singularity shell --nv -B /mnt/hippocampus/starkdata1 /mnt/extradata/singularity/MDT_cuda1a.sif 

Singularity> mdt-list-devices 
Device 0:
GPU - NVIDIA GeForce RTX 3070 (NVIDIA CUDA)
Device 1:
CPU - pthread-AMD Ryzen 7 5800X 8-Core Processor (Portable Computing Language)

Singularity> time mdt-model-fit NODDI_ExVivo sub-MCV747_dwi.nii sub-MCV747_dwi.prtcl sub-MCV747_dwi_mask.nii --cl-device-ind 0 -o output_MDT_cuda1a
[2023-01-03 17:45:32,087] [INFO] [mdt.lib.processing.model_fitting] [get_model_fit] - Starting intermediate optimization for generating initialization point.
[2023-01-03 17:45:32,130] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-01-03 17:45:32,130] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model BallStick_r1
[2023-01-03 17:45:32,155] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 92 volumes.
[2023-01-03 17:45:32,155] [INFO] [mdt.utils] [estimate_noise_std] - Trying to estimate a noise std.
[2023-01-03 17:45:32,157] [INFO] [mdt.utils] [estimate_noise_std] - Estimated global noise std 271.5927429199219.
[2023-01-03 17:45:32,157] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting BallStick_r1 model
[2023-01-03 17:45:32,157] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 4 parameters we will fit are: ['S0.s0', 'w_stick0.w', 'Stick0.theta', 'Stick0.phi']
[2023-01-03 17:45:32,157] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in output_MDT_cuda1a/BallStick_r1/tmp_results.
[2023-01-03 17:45:32,200] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 44158 voxels (44158 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-01-03 17:45:32,200] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-01-03 17:45:32,200] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-01-03 17:45:32,200] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-01-03 17:45:32,201] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'GPU - NVIDIA GeForce RTX 3070 (NVIDIA CUDA)'.
[2023-01-03 17:45:32,201] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-01-03 17:45:32,201] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with default settings.
[2023-01-03 17:45:33,282] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished optimization
[2023-01-03 17:45:33,282] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting post-processing
[2023-01-03 17:45:33,347] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished post-processing
[2023-01-03 17:45:33,435] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 100%
[2023-01-03 17:45:33,435] [INFO] [mdt.lib.processing.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2023-01-03 17:45:33,597] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitted BallStick_r1 model with runtime 0:00:00:01 (d:h:m:s).
[2023-01-03 17:45:33,609] [INFO] [mdt.lib.processing.model_fitting] [get_model_fit] - Finished intermediate optimization for generating initialization point.
[2023-01-03 17:45:33,644] [INFO] [mdt] [fit_model] - Preparing NODDI_ExVivo with the cascaded initializations.
[2023-01-03 17:45:33,651] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-01-03 17:45:33,651] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model NODDI_ExVivo
[2023-01-03 17:45:33,674] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 92 volumes.
[2023-01-03 17:45:33,675] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting NODDI_ExVivo model
[2023-01-03 17:45:33,675] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 7 parameters we will fit are: ['S0.s0', 'w_stat.w', 'w_ic.w', 'NODDI_IC.theta', 'NODDI_IC.phi', 'NODDI_IC.kappa', 'w_ec.w']
[2023-01-03 17:45:33,675] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in output_MDT_cuda1a/NODDI_ExVivo/tmp_results.
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 44158 voxels (44158 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'GPU - NVIDIA GeForce RTX 3070 (NVIDIA CUDA)'.
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-01-03 17:45:33,716] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with default settings.
[2023-01-03 17:46:21,152] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished optimization
[2023-01-03 17:46:21,152] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting post-processing
[2023-01-03 17:46:21,249] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished post-processing
[2023-01-03 17:46:21,374] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 100%
[2023-01-03 17:46:21,374] [INFO] [mdt.lib.processing.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2023-01-03 17:46:21,636] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitted NODDI_ExVivo model with runtime 0:00:00:47 (d:h:m:s).

real    0m51.014s
user    0m50.069s
sys     0m0.649s

I've run this a few times, even using different Singularity images built slightly differently and the results are always consistent. Runs inside Singularity on the GPU - check.
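When scripting runs across several containers, the `mdt-list-devices` output shown above could be parsed to pick the `--cl-device-ind` value rather than hard-coding it. The sketch below assumes the two-line "Device N:" / "TYPE - name (platform)" layout seen in this thread; `parse_devices` and `split_platform` are hypothetical helpers, not part of MDT.

```python
def split_platform(entry):
    """Split 'name (platform)' at the parenthesis matching the final ')'."""
    if not entry.endswith(")"):
        return entry.strip(), ""
    depth = 0
    for pos in range(len(entry) - 1, -1, -1):
        if entry[pos] == ")":
            depth += 1
        elif entry[pos] == "(":
            depth -= 1
            if depth == 0:
                return entry[:pos].strip(), entry[pos + 1:-1]
    return entry.strip(), ""


def parse_devices(text):
    """Turn the device listing into a list of dicts, one per device."""
    devices, index = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Device ") and line.endswith(":"):
            index = int(line[len("Device "):-1])
        elif index is not None and " - " in line:
            kind, rest = line.split(" - ", 1)
            name, platform = split_platform(rest)
            devices.append(
                {"index": index, "type": kind, "name": name, "platform": platform}
            )
            index = None
    return devices


listing = """Device 0:
GPU - NVIDIA GeForce RTX 3070 (NVIDIA CUDA)
Device 1:
CPU - pthread-AMD Ryzen 7 5800X 8-Core Processor (Portable Computing Language)
"""
devices = parse_devices(listing)
gpu_indices = [d["index"] for d in devices if d["type"] == "GPU"]
print(gpu_indices)
```

Matching the closing parenthesis from the end keeps platform names such as "Intel(R) OpenCL" intact.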

Now the Intel build:

(base) stark@titan:/mnt/hippocampus/starkdata1/Limoli/exvivo_flash/derivatives/mrtrix/mdt/sub-MCV747$ singularity shell -B /mnt/hippocampus/starkdata1 /mnt/extradata/singularity/MDT_intel1a.sif 
Singularity> mdt-list-devices 
Device 0:
CPU - AMD Ryzen 7 5800X 8-Core Processor              (Intel(R) OpenCL)
Singularity> time mdt-model-fit NODDI_ExVivo sub-MCV747_dwi.nii sub-MCV747_dwi.prtcl sub-MCV747_dwi_mask.nii --cl-device-ind 0 -o output_MDT_intel_titan
[2023-01-03 22:13:58,968] [INFO] [mdt.lib.processing.model_fitting] [get_model_fit] - Starting intermediate optimization for generating initialization point.
[2023-01-03 22:13:59,028] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-01-03 22:13:59,028] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model BallStick_r1
[2023-01-03 22:13:59,071] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 92 volumes.
[2023-01-03 22:13:59,071] [INFO] [mdt.utils] [estimate_noise_std] - Trying to estimate a noise std.
[2023-01-03 22:13:59,075] [INFO] [mdt.utils] [estimate_noise_std] - Estimated global noise std 271.5927429199219.
[2023-01-03 22:13:59,075] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting BallStick_r1 model
[2023-01-03 22:13:59,075] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 4 parameters we will fit are: ['S0.s0', 'w_stick0.w', 'Stick0.theta', 'Stick0.phi']
[2023-01-03 22:13:59,075] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in output_MDT_intel_titan/BallStick_r1/tmp_results.
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 44158 voxels (44158 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'CPU - AMD Ryzen 7 5800X 8-Core Processor              (Intel(R) OpenCL)'.
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-01-03 22:13:59,130] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with default settings.
/usr/lib/python3/dist-packages/pyopencl/__init__.py:63: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
[2023-01-03 22:14:05,226] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished optimization
[2023-01-03 22:14:05,226] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting post-processing
[2023-01-03 22:14:05,450] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished post-processing
[2023-01-03 22:14:05,556] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 100%
[2023-01-03 22:14:05,556] [INFO] [mdt.lib.processing.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2023-01-03 22:14:05,720] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitted BallStick_r1 model with runtime 0:00:00:06 (d:h:m:s).
[2023-01-03 22:14:05,728] [INFO] [mdt.lib.processing.model_fitting] [get_model_fit] - Finished intermediate optimization for generating initialization point.
[2023-01-03 22:14:05,767] [INFO] [mdt] [fit_model] - Preparing NODDI_ExVivo with the cascaded initializations.
[2023-01-03 22:14:05,774] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-01-03 22:14:05,774] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model NODDI_ExVivo
[2023-01-03 22:14:05,808] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 92 volumes.
[2023-01-03 22:14:05,809] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting NODDI_ExVivo model
[2023-01-03 22:14:05,809] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 7 parameters we will fit are: ['S0.s0', 'w_stat.w', 'w_ic.w', 'NODDI_IC.theta', 'NODDI_IC.phi', 'NODDI_IC.kappa', 'w_ec.w']
[2023-01-03 22:14:05,809] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in output_MDT_intel_titan/NODDI_ExVivo/tmp_results.
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 44158 voxels (44158 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'CPU - AMD Ryzen 7 5800X 8-Core Processor              (Intel(R) OpenCL)'.
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-01-03 22:14:05,862] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with default settings.
[2023-01-03 22:15:32,990] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished optimization
[2023-01-03 22:15:32,991] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting post-processing
[2023-01-03 22:15:33,432] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished post-processing
[2023-01-03 22:15:33,554] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 100%
[2023-01-03 22:15:33,554] [INFO] [mdt.lib.processing.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2023-01-03 22:15:33,789] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitted NODDI_ExVivo model with runtime 0:00:01:27 (d:h:m:s).

real    1m35.980s
user    24m6.364s
sys     0m0.943s
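The CompilerWarning in the log above says the OpenCL compiler produced output that was suppressed, and names the environment variable that reveals it. A possible next step, sketched here with hypothetical helper names, is to rerun the same fit with `PYOPENCL_COMPILER_OUTPUT=1` set. The command arguments are copied from the run above; the actual `subprocess.run` call is left commented out because it needs MDT to be installed.

```python
import os


def build_fit_command(model, dwi, protocol, mask, device_index, output_dir):
    """Assemble the mdt-model-fit command line used in this thread."""
    return [
        "mdt-model-fit", model, dwi, protocol, mask,
        "--cl-device-ind", str(device_index),
        "-o", output_dir,
    ]


def env_with_compiler_output(base_env=None):
    """Copy the environment and ask PyOpenCL to print compiler output."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYOPENCL_COMPILER_OUTPUT"] = "1"
    return env


cmd = build_fit_command(
    "NODDI_ExVivo",
    "sub-MCV747_dwi.nii",
    "sub-MCV747_dwi.prtcl",
    "sub-MCV747_dwi_mask.nii",
    0,
    "output_MDT_intel_titan",
)
env = env_with_compiler_output()
print(" ".join(cmd))

# Inside the container, the run itself would be:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

Whatever the Intel compiler reports may or may not be related to the wrong values, but it is the one visible difference between the Intel and NVIDIA logs.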

We run and we get output, but it's not quite right and differs from run to run as if some matrix actually isn't getting initialized. Yes, that's an AMD processor, so here's a 144-core Intel one:

singularity shell -B /mnt/hippocampus/starkdata1 /mnt/hippocampus/starkdata1/Mouse_DWI_tutorial/code/MDT_intel1a.sif
Singularity> mdt-list-devices 
Device 0:
CPU - Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz (Intel(R) OpenCL)
Singularity> time mdt-model-fit NODDI_ExVivo sub-MCV747_dwi.nii sub-MCV747_dwi.prtcl sub-MCV747_dwi_mask.nii --cl-device-ind 0 -o output_MDT_intel_wario
[2023-01-03 22:30:54,282] [INFO] [mdt.lib.processing.model_fitting] [get_model_fit] - Starting intermediate optimization for generating initialization point.
[2023-01-03 22:30:54,409] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-01-03 22:30:54,409] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model BallStick_r1
[2023-01-03 22:30:54,499] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 92 volumes.
[2023-01-03 22:30:54,500] [INFO] [mdt.utils] [estimate_noise_std] - Trying to estimate a noise std.
[2023-01-03 22:30:54,505] [INFO] [mdt.utils] [estimate_noise_std] - Estimated global noise std 271.5927429199219.
[2023-01-03 22:30:54,505] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting BallStick_r1 model
[2023-01-03 22:30:54,505] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 4 parameters we will fit are: ['S0.s0', 'w_stick0.w', 'Stick0.theta', 'Stick0.phi']
[2023-01-03 22:30:54,505] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in output_MDT_intel_wario/BallStick_r1/tmp_results.
[2023-01-03 22:30:54,604] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 44158 voxels (44158 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-01-03 22:30:54,604] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-01-03 22:30:54,604] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-01-03 22:30:54,605] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-01-03 22:30:54,605] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'CPU - Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz (Intel(R) OpenCL)'.
[2023-01-03 22:30:54,605] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-01-03 22:30:54,605] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with default settings.
/usr/lib/python3/dist-packages/pyopencl/__init__.py:63: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
[2023-01-03 22:30:57,504] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished optimization
[2023-01-03 22:30:57,506] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting post-processing
[2023-01-03 22:30:58,073] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished post-processing
[2023-01-03 22:30:58,220] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 100%
[2023-01-03 22:30:58,220] [INFO] [mdt.lib.processing.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2023-01-03 22:30:58,494] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitted BallStick_r1 model with runtime 0:00:00:03 (d:h:m:s).
[2023-01-03 22:30:58,514] [INFO] [mdt.lib.processing.model_fitting] [get_model_fit] - Finished intermediate optimization for generating initialization point.
[2023-01-03 22:30:58,580] [INFO] [mdt] [fit_model] - Preparing NODDI_ExVivo with the cascaded initializations.
[2023-01-03 22:30:58,591] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-01-03 22:30:58,591] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model NODDI_ExVivo
[2023-01-03 22:30:58,656] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 92 volumes.
[2023-01-03 22:30:58,657] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting NODDI_ExVivo model
[2023-01-03 22:30:58,657] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 7 parameters we will fit are: ['S0.s0', 'w_stat.w', 'w_ic.w', 'NODDI_IC.theta', 'NODDI_IC.phi', 'NODDI_IC.kappa', 'w_ec.w']
[2023-01-03 22:30:58,657] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in output_MDT_intel_wario/NODDI_ExVivo/tmp_results.
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 44158 voxels (44158 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'CPU - Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz (Intel(R) OpenCL)'.
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-01-03 22:30:58,753] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with default settings.
[2023-01-03 22:31:19,729] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished optimization
[2023-01-03 22:31:19,730] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting post-processing
[2023-01-03 22:31:20,698] [INFO] [mdt.lib.processing.model_fitting] [_process] - Finished post-processing
[2023-01-03 22:31:20,891] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 100%
[2023-01-03 22:31:20,892] [INFO] [mdt.lib.processing.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2023-01-03 22:31:21,353] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitted NODDI_ExVivo model with runtime 0:00:00:22 (d:h:m:s).

real    0m29.667s
user    38m5.241s
sys     0m3.145s
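The `time` figures from the three runs give a rough measure of how well the CPU runs are threaded: dividing user time by real time approximates the average number of busy cores. The sketch below does that arithmetic on the values reported above; the helper names are hypothetical, and the ratio is only an approximation since it ignores system time and start-up overhead.

```python
import re


def to_seconds(stamp):
    """Convert a bash `time` value such as '38m5.241s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_to_wall_ratio(real, user):
    """User CPU time divided by wall-clock time."""
    return to_seconds(user) / to_seconds(real)


# (real, user) pairs copied from the three logs in this comment.
runs = {
    "NVIDIA image": ("0m51.014s", "0m50.069s"),
    "Intel image on titan": ("1m35.980s", "24m6.364s"),
    "Intel image on wario": ("0m29.667s", "38m5.241s"),
}
ratios = {
    label: cpu_to_wall_ratio(real, user)
    for label, (real, user) in runs.items()
}
for label, ratio in ratios.items():
    print(f"{label}: user/real = {ratio:.1f}")
```

The ratios come out at roughly 1 for the GPU run, about 15 on titan and about 77 on wario, so the Intel OpenCL runs are clearly using many threads even though their results are wrong.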

The attached screenshot shows 2 runs in the Singularity NVIDIA image on the top (identical) and 3 runs in the Intel image (2 on the "titan" AMD machine and 1 on the "wario" Intel machine). The bands in one are at exactly 0.5.

[Image: Screenshot 2023-01-03 225049]
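Run-to-run differences like these can also be confirmed without opening a viewer, by hashing the output files of two runs. The sketch below is an assumption-laden illustration: `content_digest` and `differing_files` are hypothetical helpers, outputs are assumed to be `.nii` or `.nii.gz` files, and gzipped files are decompressed before hashing because the gzip header can differ between otherwise identical files. The demo writes small dummy files instead of real MDT output.

```python
import gzip
import hashlib
import tempfile
from pathlib import Path


def content_digest(path):
    """SHA-256 of a file's contents, decompressing .gz files first."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def differing_files(dir_a, dir_b, pattern="*.nii*"):
    """Names present in both directories whose contents differ."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    names_a = {p.name for p in dir_a.glob(pattern)}
    names_b = {p.name for p in dir_b.glob(pattern)}
    return sorted(
        name
        for name in names_a & names_b
        if content_digest(dir_a / name) != content_digest(dir_b / name)
    )


# Demo with dummy bytes standing in for two output directories.
with tempfile.TemporaryDirectory() as run1, tempfile.TemporaryDirectory() as run2:
    for folder, odi_bytes in ((run1, b"\x00\x01"), (run2, b"\x00\x02")):
        Path(folder, "ODI.nii").write_bytes(odi_bytes)
        with gzip.open(Path(folder, "NDI.nii.gz"), "wb") as handle:
            handle.write(b"same")
    changed = differing_files(run1, run2)
print(changed)
```

Two identical GPU runs should yield an empty list, while the Intel runs described here would be expected to list the maps that change between runs.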

robbert-harms commented 1 year ago

Hi celstark,

I am sorry to hear that MDT has given you so many problems. The problem is that OpenCL is so poorly supported across vendors.

At this point I don't know how to help you. Family and work are taking 120% of my time already.

Are you interested in developing MDT further?

Best,

Robbert

celstark commented 1 year ago

I understand the constraints and am under similar ones here. On our end, MDT works wonderfully so long as we run it in a more typical setup. So, for this step in the processing, we will just run things manually. I poked around for a bit to see if I could spot something to no avail, but I'll keep this in the back of my mind.

Thanks for all you've done with the package!

Craig

fkm24 commented 12 months ago

Hi Craig,

Many thanks for sharing the Singularity recipe. I have implemented your recipe on our HPC and came across the same issue, with ODI values fixed at 0.5.

It sounds like you have a working solution for running MDT on your workstation, though. Would you be happy to share more about your setup so we can continue to implement MDT? That would be much appreciated!

Best Wishes, Eli

celstark commented 12 months ago

On the workstation, it's running via CUDA. Here's the definition file I used:

Bootstrap: docker
From: nvidia/opencl

%setup
        mkdir -p $SINGULARITY_ROOTFS/src
        cp containers/silent.cfg $SINGULARITY_ROOTFS/src

%post
        # install dependencies
        apt-get update && apt-get install -y lsb-core wget locales

        # install mdt
        export DEBIAN_FRONTEND=noninteractive
        apt-get update && apt-get install -y software-properties-common && add-apt-repository ppa:robbert-harms/cbclab
        apt-get update && apt-get install -y python3-mdt python3-pip
        pip3 install tatsu==4.2.6

As you can see, not much to it -- really just your Dockerfile.nvidia converted over. I never did get a working setup to run on the CPU in a container.

fkm24 commented 12 months ago

Thanks very much for sharing this, Craig. I have implemented this on our HPC and the script was able to detect the GPUs.

Device 1:
GPU - NVIDIA A100-SXM4-80GB (NVIDIA CUDA)
Device 2:
GPU - NVIDIA A100-SXM4-80GB (NVIDIA CUDA)
Device 3:
GPU - NVIDIA A100-SXM4-80GB (NVIDIA CUDA)

However, the script still gets stuck at the "We will use the optimizer Powell with default settings" step (https://github.com/robbert-harms/MDT/issues/58).

I think it might be related to my OpenCL too. When I ran clinfo, it says:

 NOTE:   your OpenCL library only supports OpenCL 2.2,
                but some installed platforms support OpenCL 3.0.
                Programs using 3.0 features may crash
                or behave unexepectedly

I will ask the system administrators to update these drivers and see if that helps.
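For checking several nodes or containers, the version note in the `clinfo` output could be detected automatically. This is only a sketch that assumes the note is worded as in the output quoted above; `find_version_mismatch` is a hypothetical helper, and the sample text is copied from this comment rather than generated.

```python
import re

# Matches the two version numbers in the clinfo note quoted above.
NOTE_PATTERN = re.compile(
    r"only supports OpenCL (\d+(?:\.\d+)*),\s*"
    r"but some installed platforms support OpenCL (\d+(?:\.\d+)*)"
)


def find_version_mismatch(clinfo_text):
    """Return (library_version, platform_version), or None if no note."""
    match = NOTE_PATTERN.search(clinfo_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = """ NOTE:   your OpenCL library only supports OpenCL 2.2,
                but some installed platforms support OpenCL 3.0.
"""
mismatch = find_version_mismatch(sample)
print(mismatch)
```

In practice the text would come from running `clinfo` inside the container, for example through `subprocess.run(["clinfo"], capture_output=True, text=True)`.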

Best Wishes, Eli