robbert-harms / MDT

Microstructure Diffusion Toolbox
GNU Lesser General Public License v3.0
50 stars 18 forks

pyopencl._cl.RuntimeError #12

Closed ElijahMak closed 5 years ago

ElijahMak commented 5 years ago

Hello MDT Team,

Thanks very much for putting this toolbox up. I have read your paper with great interest and now I am keen to integrate MDT into our pipeline to process NODDI data. However, I am running into an issue after installation.

The early checks seemed fine.

mdt-list-devices
Device 0:
CPU - Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Apple)
Device 1:
GPU - AMD Radeon R9 M380 Compute Engine (Apple)

and MDT launches the GUI as expected. I selected the CPU, as the GPU option resulted in system crashes (macOS).

objc[7979]: Class FIFinderSyncExtensionHost is implemented in both /System/Library/PrivateFrameworks/FinderKit.framework/Versions/A/FinderKit (0x7fff89d421d0) and /System/Library/PrivateFrameworks/FileProvider.framework/OverrideBundles/FinderSyncCollaborationFileProviderOverride.bundle/Contents/MacOS/FinderSyncCollaborationFileProviderOverride (0x130881dc8). One of the two will be used. Which one is undefined.
2019-03-19 16:07:03.469 Python[7979:525286] [QL] Can't get plugin bundle info at file:///Applications/GarageBand.app/Contents/Library/QuickLook/GarageBandQLGenerator.qlgenerator/
2019-03-19 16:07:03.469 Python[7979:525286] [QL] Can't get plugin bundle info at file:///Applications/GarageBand.app/Contents/Library/QuickLook/LogicXQLGenerator.qlgenerator/
[2019-03-19 16:07:32,726] [INFO] [mdt.lib.model_fitting] [get_model_fit] - Starting intermediate optimization for generating initialization point.
[2019-03-19 16:07:32,850] [INFO] [mdt.lib.model_fitting] [get_model_fit] - Starting intermediate optimization for generating initialization point.
[2019-03-19 16:07:32,945] [INFO] [mdt.lib.model_fitting] [_apply_user_provided_initialization_data] - Preparing model BallStick_r1 with the user provided initialization data.
[2019-03-19 16:07:32,970] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Not recalculating BallStick_r1 model
[2019-03-19 16:07:32,990] [INFO] [mdt.lib.model_fitting] [get_model_fit] - Finished intermediate optimization for generating initialization point.
[2019-03-19 16:07:33,355] [INFO] [mdt.lib.model_fitting] [_apply_user_provided_initialization_data] - Preparing model NODDI with the user provided initialization data.
[2019-03-19 16:07:33,397] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Using MDT version 0.20.3
[2019-03-19 16:07:33,397] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Preparing for model NODDI
[2019-03-19 16:07:33,397] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Current cascade: ['NODDI']
[2019-03-19 16:07:33,731] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 64 volumes.
[2019-03-19 16:07:33,731] [INFO] [mdt.utils] [estimate_noise_std] - Trying to estimate a noise std.
[2019-03-19 16:07:33,733] [WARNING] [mdt.utils] [_compute_noise_std] - Failed to obtain a noise std for this subject. We will continue with an std of 1.
[2019-03-19 16:07:33,734] [INFO] [mdt.lib.model_fitting] [_model_fit_logging] - Fitting NODDI model
[2019-03-19 16:07:33,734] [INFO] [mdt.lib.model_fitting] [_model_fit_logging] - The 6 parameters we will fit are: ['S0.s0', 'w_ic.w', 'NODDI_IC.theta', 'NODDI_IC.phi', 'NODDI_IC.kappa', 'w_ec.w']
[2019-03-19 16:07:33,734] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Saving temporary results in /Users/elijahmak_imac/Library/Mobile Documents/com~apple~CloudDocs/noddi_test/18411_dti/output/b0_brain_mask/NODDI/tmp_results.
[2019-03-19 16:07:33,796] [INFO] [mdt.lib.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 100000 voxels (165818 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2019-03-19 16:07:33,826] [INFO] [mdt.lib.processing_strategies] [_process] - Starting optimization
[2019-03-19 16:07:33,826] [INFO] [mdt.lib.processing_strategies] [_process] - Using MOT version 0.9.1
[2019-03-19 16:07:33,826] [INFO] [mdt.lib.processing_strategies] [_process] - We will use a single precision float type for the calculations.
[2019-03-19 16:07:33,826] [INFO] [mdt.lib.processing_strategies] [_process] - Using device 'CPU - Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Apple)'.
[2019-03-19 16:07:33,826] [INFO] [mdt.lib.processing_strategies] [_process] - Using compile flags: ['-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros']
[2019-03-19 16:07:33,827] [INFO] [mdt.lib.processing_strategies] [_process] - We will use the optimizer Powell with default settings.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/gui/utils.py", line 84, in _decorator
    response = dec_func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/gui/model_fit/tabs/fit_model_tab.py", line 614, in run
    mdt.fit_model(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/__init__.py", line 197, in fit_model
    inits = get_optimization_inits(model_name, input_data, output_folder, cl_device_ind=cl_device_ind)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/__init__.py", line 89, in get_optimization_inits
    return get_optimization_inits(model_name, input_data, output_folder, cl_device_ind=cl_device_ind)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 162, in get_optimization_inits
    return get_init_data(model_name)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 102, in get_init_data
    noddi_results = get_model_fit('NODDI')
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 70, in get_model_fit
    initialization_data={'inits': get_init_data(model_name)}).run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 336, in run
    _, maps = self._run(self._model, self._recalculate, self._only_recalculate_last)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 382, in _run
    apply_user_provided_initialization=not _in_recursion)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 393, in _run_composite_model
    optimizer_options=self._optimizer_options)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/model_fitting.py", line 465, in fit_composite_model
    return processing_strategy.process(worker)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/processing_strategies.py", line 75, in process
    self._process_chunk(processor, chunks)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/processing_strategies.py", line 120, in _process_chunk
    process()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/processing_strategies.py", line 117, in process
    processor.process(chunk, next_indices=next_chunk)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/processing_strategies.py", line 293, in process
    self._process(roi_indices, next_indices=next_indices)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/lib/processing_strategies.py", line 467, in _process
    x0 = codec.encode(self._model.get_initial_parameters(), self._model.get_kernel_data())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/model_building/utils.py", line 220, in encode
    parameters, kernel_data, cl_runtime_info=cl_runtime_info)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mdt/model_building/utils.py", line 257, in _transform_parameters
    cl_named_func.evaluate(kernel_data, parameters.shape[0], cl_runtime_info=cl_runtime_info)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mot/lib/cl_function.py", line 248, in evaluate
    use_local_reduction=use_local_reduction, cl_runtime_info=cl_runtime_info)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mot/lib/cl_function.py", line 608, in apply_cl_function
    cl_function, kernel_data, cl_runtime_info.double_precision, use_local_reduction)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mot/lib/cl_function.py", line 653, in __init__
    self._kernel = self._build_kernel(self._get_kernel_source(), compile_flags)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/mot/lib/cl_function.py", line 712, in _build_kernel
    return cl.Program(self._cl_context, kernel_source).build(' '.join(compile_flags))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pyopencl/__init__.py", line 510, in build
    options_bytes=options_bytes, source=self._source)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pyopencl/__init__.py", line 554, in _build_and_catch_errors
    raise err
pyopencl._cl.RuntimeError: clBuildProgram failed: BUILD_PROGRAM_FAILURE - clBuildProgram failed: BUILD_PROGRAM_FAILURE - clBuildProgram failed: BUILD_PROGRAM_FAILURE

Build on <pyopencl.Device 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz' on 'Apple' at 0xffffffff>:

(options: -cl-denorms-are-zero -cl-mad-enable -cl-no-signed-zeros -I /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pyopencl/cl)
(source saved as /var/folders/t9/6js6fq5x7dnc63vlsx0y_rv00000gn/T/tmpho7wok4g.cl)

Sorry, I am no expert in Python so I cannot make sense of any of that, but I suspect it may be something to do with PyOpenCL? I would really appreciate your help or any suggestions that I might be able to execute/test.

Thanks!

robbert-harms commented 5 years ago

Hi Elijah,

Thank you for your compliments and for considering the use of MDT.

We have had issues with Apple machines before, although some users claimed to get MDT to work by updating to the very latest OSX version. Even then, only the CPU worked.

The problem unfortunately lies within the OpenCL drivers of OSX. Last year Apple announced that it will no longer support OpenCL, with no replacement offered (see the last item at https://developer.apple.com/macos/whats-new/ ).

There is one open source project that offers OpenCL on all platforms, including OSX (http://portablecl.org/), but this software is still in beta and fails to compile most of the code from MDT. I am monitoring their progress though, hoping that future versions will work better with MDT.

All in all, I doubt MDT will work on OSX machines in the foreseeable future. Most probably we will have to drop OSX from the list of supported operating systems.

Is there any chance you have a Windows or Linux machine handy?

Apologies for the inconvenience,

Robbert

ElijahMak commented 5 years ago

Hi Robbert,

Thanks for writing back! Yes indeed, I have just got access to a GPU cluster, but I am still running into some problems which may not be related to the toolbox itself. Apologies for the lengthy post ahead; below I detail all the steps I have taken.

1) Installation procedure on my account. I connected to the cluster via SSH with the -X option.

module load Anaconda5/py3.6-5.2.0
python -m venv ~/mdt
source ~/mdt/bin/activate
pip install mdt

2) Checking the installation with mdt-list-devices:

Device 0: GPU - GRID K1 (NVIDIA CUDA)
Device 1: GPU - GRID K1 (NVIDIA CUDA)
Device 2: GPU - GRID K1 (NVIDIA CUDA)
Device 3: GPU - GRID K1 (NVIDIA CUDA)
Device 4: GPU - GRID K1 (NVIDIA CUDA)
Device 5: GPU - GRID K1 (NVIDIA CUDA)
Device 6: GPU - GRID K1 (NVIDIA CUDA)
Device 7: GPU - GRID K1 (NVIDIA CUDA)

mdt-gui &

[1] 125777
(mdt) -bash-4.2$ QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-fkm24'
/home/fkm24/mdt/bin/python3: symbol lookup error: /home/fkm24/mdt/lib/python3.6/site-packages/PyQt5/Qt/plugins/platforms/../../lib/libQt5XcbQpa.so.5: undefined symbol: FT_Get_Font_Format

However, mdt-batch-fit seems to run, as I was able to see this output.

3) Command mdt-batch-fit

usage: mdt-batch-fit [-h] [-o OUTPUT_FOLDER] [-b {DirPerSubject,HCP_MGH,HCP_WUMINN}]
                     [--cl-device-ind [{0,1,2,3,4,5,6,7} [{0,1,2,3,4,5,6,7} ...]]]
                     [--recalculate] [--no-recalculate] [--use-gradient-deviations]
                     [--no-gradient-deviations] [--double] [--float]
                     [--subjects-index [SUBJECTS_INDEX [SUBJECTS_INDEX ...]]]
                     [--subjects-id [SUBJECTS_ID [SUBJECTS_ID ...]]] [--dry-run]
                     [--tmp-results-dir TMP_RESULTS_DIR]
                     data_folder [models_to_fit [models_to_fit ...]]
mdt-batch-fit: error: the following arguments are required: data_folder, models_to_fit

However, when I ran mdt-batch-fit on the machine using X2GO, I received a different error message (screenshot attached).

I have also contacted our system administrator for help, but if you spot anything obvious, I would really appreciate it :) Thanks!

robbert-harms commented 5 years ago

Hi Elijah,

Interesting error. If mdt-list-devices works, then so should mdt-batch-fit.

The problem seems to be that PyOpenCL fails to identify the CUDA devices.

Could you try running the following code in a Python shell:

import pyopencl as cl
cl.get_platforms()

This replicates the failing call from your error message and will tell us more about where to look for the solution.
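If get_platforms() succeeds, a slightly longer sketch along these lines would also list the devices per platform, similar in spirit to what mdt-list-devices reports. The formatting helper is pure Python; the try/except is only there so the snippet degrades gracefully when no OpenCL runtime is found.

```python
# Hedged sketch: enumerate OpenCL platforms and their devices via
# pyopencl (get_platforms / get_devices are standard pyopencl calls).
def format_devices(platforms):
    """platforms: iterable of (platform_name, [device_name, ...]) pairs."""
    lines = []
    for pname, device_names in platforms:
        lines.append("Platform: %s" % pname)
        for dname in device_names:
            lines.append("  Device: %s" % dname)
    return "\n".join(lines)

try:
    import pyopencl as cl
    info = [(p.name, [d.name for d in p.get_devices()])
            for p in cl.get_platforms()]
    print(format_devices(info))
except Exception as exc:  # pyopencl missing or no usable OpenCL runtime
    print("OpenCL unavailable: %s" % exc)
```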

About your other problem when running "mdt-gui &": this seems to be a Qt library problem. The issue seems similar to the issues described here: https://forum.level1techs.com/t/debian-sid-and-kde/118700, and here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=873792.

Could you try running "ldd -r /home/fkm24/mdt/lib/python3.6/site-packages/PyQt5/Qt/plugins/platforms/../../lib/libQt5XcbQpa.so.5 | grep libfreetype". This should give us a clue to which libfreetype QT is linking.
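As a quick cross-check from Python, something like the sketch below should tell us whether a given freetype library exports FT_Get_Font_Format (that symbol only exists in reasonably recent FreeType releases, which is what the "undefined symbol" error hints at). The fallback path is just an example taken from your output.

```python
# Hedged sketch: check whether a freetype shared library exports a
# given symbol, using only the standard library (ctypes).
import ctypes
import ctypes.util

def has_symbol(libpath, symbol):
    try:
        lib = ctypes.CDLL(libpath)
        getattr(lib, symbol)  # raises AttributeError if not exported
        return True
    except (OSError, AttributeError):
        return False

libpath = ctypes.util.find_library("freetype") or "/lib64/libfreetype.so.6"
print(libpath, has_symbol(libpath, "FT_Get_Font_Format"))
```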

Sorry to see that it is so difficult to get MDT up and running. I hope you will bear with me on this.

Best,

Robbert

ElijahMak commented 5 years ago

Hi Robbert,

Thanks. I am wondering if I may have done something wrong during the setup of MDT? It was performed in a virtual environment (not a module on our cluster yet).

module load miniconda3/4.5.1
python -m venv ~/mdt
source ~/mdt/bin/activate
pip install mdt

(screenshot of the installation output)

mdt-list-devices

(screenshot of the device list)

python
Python 3.6.2 (default, Jul 31 2018, 15:41:04) [GCC 5.4.0] on linux

import pyopencl as cl
cl.get_platforms()

(screenshot of the resulting error)

Next, I ran "ldd -r /home/fkm24/mdt/lib/python3.6/site-packages/PyQt5/Qt/plugins/platforms/../../lib/libQt5XcbQpa.so.5 | grep libfreetype"

and this is the output:

libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00002ab71f405000)
undefined symbol: FT_Get_Font_Format (/home/fkm24/mdt/lib/python3.6/site-packages/PyQt5/Qt/plugins/platforms/../../lib/libQt5XcbQpa.so.5)

robbert-harms commented 5 years ago

Hi Elijah,

Let's work on the first problem first. I am sort of glad to see that "cl.get_platforms()" failed, because that is sometimes easy to fix.

For some reason, cl.get_platforms() cannot find any CUDA device. This is often the case with older NVIDIA drivers. Could you try running the bash command nvidia-smi? It provides diagnostic information about the driver and the available devices. I am particularly interested in the first few lines, where it states something similar to:

Fri Dec 25 16:49:12 2015
+------------------------------------------------------+
| NVIDIA-SMI 352.63 Driver Version: 352.63 |
|-------------------------------+----------------------+
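If it is easier, the version can also be pulled out of that header programmatically. A small sketch; the regex assumes the "Driver Version: x.y" field shown in the sample above.

```python
# Hedged sketch: extract the driver version from nvidia-smi's header.
import re

def nvidia_driver_version(smi_output):
    match = re.search(r"Driver Version:\s*([\d.]+)", smi_output)
    return match.group(1) if match else None

sample = "| NVIDIA-SMI 352.63     Driver Version: 352.63         |"
print(nvidia_driver_version(sample))  # 352.63
```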

Can you forward me the driver version?

Best,

Robbert

ElijahMak commented 5 years ago

Yes I think we might be getting closer!

nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

robbert-harms commented 5 years ago

Ah, that might be it.

Do you have admin rights to install the latest drivers? Most probably this is something that needs to be done by a sysadmin.

ElijahMak commented 5 years ago

Hi Robbert,

Thanks. I have now managed to get it to work a bit further.

It had no problem with the BallStick_r1 model, but it crashed at the NODDI cascade. I appreciate all your patience with this.

...
 File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 242, in __call__
    model_fit.run()
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 336, in run
    _, maps = self._run(self._model, self._recalculate, self._only_recalculate_last)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 382, in _run
    apply_user_provided_initialization=not _in_recursion)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 393, in _run_composite_model
    optimizer_options=self._optimizer_options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 465, in fit_composite_model
    return processing_strategy.process(worker)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 75, in process
    self._process_chunk(processor, chunks)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 120, in _process_chunk
    process()
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 117, in process
    processor.process(chunk, next_indices=next_chunk)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 293, in process
    self._process(roi_indices, next_indices=next_indices)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 483, in _process
    options=self._optimizer_options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/optimize/__init__.py", line 109, in minimize
    constraints_func=constraints_func, data=data, options=options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/optimize/__init__.py", line 293, in _minimize_powell
    cl_runtime_info=cl_runtime_info)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 248, in evaluate
    use_local_reduction=use_local_reduction, cl_runtime_info=cl_runtime_info)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 627, in apply_cl_function
    total_offset = enqueue_batch(batch_end - batch_start, total_offset)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 621, in enqueue_batch
    worker.cl_queue.finish()
pyopencl._cl.LogicError: clFinish failed: INVALID_COMMAND_QUEUE
robbert-harms commented 5 years ago

Hi Elijah,

No problem, these kinds of questions allow me to make MDT even better, so they are much appreciated.

I have seen this kind of error message before; it is sometimes related to memory consumption. Perhaps running the computations with a smaller number of voxels per batch will work.

Can you create a file named mdt.conf in the folder ~/.mdt/<version>/ (next to mdt.default.conf)? In that file, put the following contents:

processing_strategies:
    optimization:
        max_nmr_voxels: 1000
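For intuition, this setting simply caps how many voxels go into a single batch; a minimal sketch of the chunking (the real logic lives in mdt.lib.processing_strategies, so treat this as illustration only):

```python
# Hedged sketch of how a voxel total is split into batches of at most
# max_nmr_voxels, mirroring the "processing next N voxels" log lines.
def voxel_batches(total_voxels, max_nmr_voxels):
    """Yield (start, end) index ranges covering all voxels."""
    for start in range(0, total_voxels, max_nmr_voxels):
        yield start, min(start + max_nmr_voxels, total_voxels)

batches = list(voxel_batches(217537, 1000))
print(len(batches), batches[-1])  # 218 (217000, 217537)
```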

Afterwards run your NODDI fit again. Let me know how you fare.

Best,

Robbert

ElijahMak commented 5 years ago

Hi Robbert,

Thanks. The BallStick_r1 fit completed in 14:20 minutes and the NODDI fit was able to start running too. But alas, it crashed with the same error message at around 4%. If I reran the fitting, NODDI continued for another 2 minutes before crashing again. It does sound like a memory problem indeed. Do you have an estimate of the memory I should be requesting on my cluster?

2019-03-25 14:24:41,376] [INFO] [mdt.lib.processing_strategies] [_process_chunk] - Computations are at 4.14%, processing next 1000 voxels (217537 voxels in total, 9000 processed). Time spent: 0:00:00:20, time left: 0:01:12:27 (d:h:m:s).
[2019-03-25 14:25:00,664] [INFO] [mdt.lib.processing_strategies] [_process_chunk] - Computations are at 4.60%, processing next 1000 voxels (217537 voxels in total, 10000 processed). Time spent: 0:00:00:40, time left: 0:01:09:24 (d:h:m:s).
Traceback (most recent call last):
  File "/home/fkm24/mdt/bin/mdt-batch-fit", line 11, in <module>
    sys.exit(BatchFit.console_script())
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/shell_utils.py", line 47, in console_script
    cls().start(sys.argv[1:])
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/shell_utils.py", line 66, in start
    self.run(args, {})
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/cli_scripts/mdt_batch_fit.py", line 134, in run
    use_gradient_deviations=args.use_gradient_deviations)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/__init__.py", line 402, in batch_fit
    return batch_apply(data_folder, batch_fit_func, batch_profile=batch_profile, subjects_selection=subjects_selection)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/batch_utils.py", line 503, in batch_apply
    results[subject.subject_id] = f(subject)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/batch_utils.py", line 501, in f
    return func(subject)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 242, in __call__
    model_fit.run()
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 336, in run
    _, maps = self._run(self._model, self._recalculate, self._only_recalculate_last)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 382, in _run
    apply_user_provided_initialization=not _in_recursion)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 393, in _run_composite_model
    optimizer_options=self._optimizer_options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 465, in fit_composite_model
    return processing_strategy.process(worker)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 75, in process
    self._process_chunk(processor, chunks)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 124, in _process_chunk
    process()
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 117, in process
    processor.process(chunk, next_indices=next_chunk)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 293, in process
    self._process(roi_indices, next_indices=next_indices)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 483, in _process
    options=self._optimizer_options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/optimize/__init__.py", line 109, in minimize
    constraints_func=constraints_func, data=data, options=options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/optimize/__init__.py", line 293, in _minimize_powell
    cl_runtime_info=cl_runtime_info)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 248, in evaluate
    use_local_reduction=use_local_reduction, cl_runtime_info=cl_runtime_info)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 627, in apply_cl_function
    total_offset = enqueue_batch(batch_end - batch_start, total_offset)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 621, in enqueue_batch
    worker.cl_queue.finish()
pyopencl._cl.LogicError: clFinish failed: INVALID_COMMAND_QUEUE
ElijahMak commented 5 years ago

Hmm, I have just requested more memory on the cluster (10 GB), but I still ran into the same issue.

(mdt) fkm24@wbic-gpu-n1:~/scratch/MDT_TEST/mdt_example_data$ mdt-batch-fit . NODDI --batch_profile DirPerSubject
[2019-03-25 14:33:29,485] [INFO] [mdt] [batch_fit] - Using MDT version 0.20.3
[2019-03-25 14:33:29,485] [INFO] [mdt] [batch_fit] - Using batch profile: DirPerSubject
[2019-03-25 14:33:29,493] [INFO] [mdt] [batch_fit] - Fitting models: ['NODDI']
[2019-03-25 14:33:29,493] [INFO] [mdt] [batch_fit] - Subjects found: 1
[2019-03-25 14:33:29,493] [INFO] [mdt] [batch_fit] - Subjects to process: 1
[2019-03-25 14:33:29,496] [INFO] [mdt.lib.model_fitting] [__call__] - Going to process subject NODDI, (1 of 1, we are at 0.00%)
[2019-03-25 14:33:29,607] [INFO] [mdt.lib.model_fitting] [__call__] - Loading the data (DWI, mask and protocol) of subject NODDI
[2019-03-25 14:33:29,742] [INFO] [mdt.lib.model_fitting] [__call__] - Going to fit model NODDI on subject NODDI
[2019-03-25 14:33:29,852] [INFO] [mdt.lib.model_fitting] [get_model_fit] - Starting intermediate optimization for generating initialization point.
[2019-03-25 14:33:29,938] [INFO] [mdt.lib.model_fitting] [_apply_user_provided_initialization_data] - Preparing model BallStick_r1 with the user provided initialization data.
[2019-03-25 14:33:29,995] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Not recalculating BallStick_r1 model
[2019-03-25 14:33:30,036] [INFO] [mdt.lib.model_fitting] [get_model_fit] - Finished intermediate optimization for generating initialization point.
[2019-03-25 14:33:30,443] [INFO] [mdt.lib.model_fitting] [_apply_user_provided_initialization_data] - Preparing model NODDI with the user provided initialization data.
[2019-03-25 14:33:30,486] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Using MDT version 0.20.3
[2019-03-25 14:33:30,487] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Preparing for model NODDI
[2019-03-25 14:33:30,487] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Current cascade: ['NODDI']
[2019-03-25 14:33:30,905] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 64 volumes.
[2019-03-25 14:33:30,906] [INFO] [mdt.utils] [estimate_noise_std] - Trying to estimate a noise std.
[2019-03-25 14:33:30,906] [WARNING] [mdt.utils] [_compute_noise_std] - Failed to obtain a noise std for this subject. We will continue with an std of 1.
[2019-03-25 14:33:30,909] [INFO] [mdt.lib.model_fitting] [_model_fit_logging] - Fitting NODDI model
[2019-03-25 14:33:30,909] [INFO] [mdt.lib.model_fitting] [_model_fit_logging] - The 6 parameters we will fit are: ['S0.s0', 'w_ic.w', 'NODDI_IC.theta', 'NODDI_IC.phi', 'NODDI_IC.kappa', 'w_ec.w']
[2019-03-25 14:33:30,909] [INFO] [mdt.lib.model_fitting] [fit_composite_model] - Saving temporary results in /lustre/scratch/wbic-beta/fkm24/MDT_TEST/mdt_example_data_output/NODDI/NODDI/tmp_results.
[2019-03-25 14:33:30,953] [INFO] [mdt.lib.processing_strategies] [_process_chunk] - Computations are at 4.60%, processing next 1000 voxels (217537 voxels in total, 10000 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2019-03-25 14:33:30,979] [INFO] [mdt.lib.processing_strategies] [_process] - Starting optimization
[2019-03-25 14:33:30,979] [INFO] [mdt.lib.processing_strategies] [_process] - Using MOT version 0.9.1
[2019-03-25 14:33:30,979] [INFO] [mdt.lib.processing_strategies] [_process] - We will use a single precision float type for the calculations.
[2019-03-25 14:33:30,979] [INFO] [mdt.lib.processing_strategies] [_process] - Using device 'GPU - GRID K1 (NVIDIA CUDA)'.
[2019-03-25 14:33:30,979] [INFO] [mdt.lib.processing_strategies] [_process] - Using compile flags: ['-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros']
[2019-03-25 14:33:30,979] [INFO] [mdt.lib.processing_strategies] [_process] - We will use the optimizer Powell with default settings.
Traceback (most recent call last):
  File "/home/fkm24/mdt/bin/mdt-batch-fit", line 11, in <module>
    sys.exit(BatchFit.console_script())
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/shell_utils.py", line 47, in console_script
    cls().start(sys.argv[1:])
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/shell_utils.py", line 66, in start
    self.run(args, {})
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/cli_scripts/mdt_batch_fit.py", line 134, in run
    use_gradient_deviations=args.use_gradient_deviations)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/__init__.py", line 402, in batch_fit
    return batch_apply(data_folder, batch_fit_func, batch_profile=batch_profile, subjects_selection=subjects_selection)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/batch_utils.py", line 503, in batch_apply
    results[subject.subject_id] = f(subject)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/batch_utils.py", line 501, in f
    return func(subject)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 242, in __call__
    model_fit.run()
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 336, in run
    _, maps = self._run(self._model, self._recalculate, self._only_recalculate_last)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 382, in _run
    apply_user_provided_initialization=not _in_recursion)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 393, in _run_composite_model
    optimizer_options=self._optimizer_options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/model_fitting.py", line 465, in fit_composite_model
    return processing_strategy.process(worker)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 75, in process
    self._process_chunk(processor, chunks)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 120, in _process_chunk
    process()
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 117, in process
    processor.process(chunk, next_indices=next_chunk)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 293, in process
    self._process(roi_indices, next_indices=next_indices)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mdt/lib/processing_strategies.py", line 483, in _process
    options=self._optimizer_options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/optimize/__init__.py", line 109, in minimize
    constraints_func=constraints_func, data=data, options=options)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/optimize/__init__.py", line 293, in _minimize_powell
    cl_runtime_info=cl_runtime_info)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 248, in evaluate
    use_local_reduction=use_local_reduction, cl_runtime_info=cl_runtime_info)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 627, in apply_cl_function
    total_offset = enqueue_batch(batch_end - batch_start, total_offset)
  File "/home/fkm24/mdt/lib/python3.6/site-packages/mot/lib/cl_function.py", line 621, in enqueue_batch
    worker.cl_queue.finish()
pyopencl._cl.LogicError: clFinish failed: INVALID_COMMAND_QUEUE
robbert-harms commented 5 years ago

Hi Elijah,

I see from your previous message that you are applying MDT on graphics cards. You are then limited by the available GPU memory, not so much by the available system memory.
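As a rough, back-of-the-envelope way to think about the per-batch GPU footprint (this ignores the optimizer's scratch space and kernel overhead entirely, so treat it as a lower bound only, not as MDT's actual allocation logic):

```python
# Hedged sketch: estimate the two main per-batch buffers, the
# parameter buffer and the observed-data buffer, at single precision.
def batch_bytes(nmr_voxels, nmr_params, nmr_volumes, bytes_per_float=4):
    parameter_buffer = nmr_voxels * nmr_params * bytes_per_float
    data_buffer = nmr_voxels * nmr_volumes * bytes_per_float
    return parameter_buffer + data_buffer

# e.g. a 100000-voxel batch, 6 NODDI parameters, 64 volumes:
print(batch_bytes(100000, 6, 64) / 1024 ** 2)  # roughly 26.7 MiB
```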

Since your GPUs are quite old, I would suggest running the computations on the CPUs instead of the GPUs. Your cluster is most likely equipped with Intel processors, and these can also be used for MDT computations once the Intel OpenCL drivers are installed (https://software.intel.com/en-us/articles/opencl-drivers). If you, or your sysadmin, can install these drivers, you can use the CPUs for MDT as well.

MDT uses quite recent technology for the parameter estimation, which can cause problems on older machines. I hope the CPU option helps you; it does seem like a last resort.

Best,

Robbert

ElijahMak commented 5 years ago

Hi Robbert,

Great news. Tweaking the max number of voxels to 500 finally got MDT to work successfully. The total time for NODDI modelling is approx 1.5 hours, which is a fantastic boost in speed.

[2019-03-26 15:55:46,261] [INFO] [mdt.lib.processing_strategies] [_process_chunk] - Computations are at 99.98%, processing next 37 voxels (217537 voxels in total, 217500 processed). Time spent: 0:01:21:59, time left: 0:00:00:00 (d:h:m:s).
[2019-03-26 15:55:51,397] [INFO] [mdt.lib.processing_strategies] [_process_chunk] - Computations are at 100%
[2019-03-26 15:55:51,411] [INFO] [mdt.lib.processing_strategies] [process] - Computed all voxels, now creating nifti's
[2019-03-26 15:55:59,819] [INFO] [mdt.lib.model_fitting] [_model_fit_logging] - Fitted NODDI model with runtime 0:01:22:12 (d:h:m:s).

I have also attached a screenshot showing the subject's ODI, which seems very reasonable and in high agreement with the corresponding ODI map obtained from the NODDI Matlab Toolbox.


One more question though, is there a section in the NODDI code where we could modify the intrinsic diffusivity parameters for GM and WM?

Thanks so much for all your help and patience!

Best Wishes, Elijah

robbert-harms commented 5 years ago

Hi Elijah,

Great news! I am happy that you got it to work. A runtime of 1.5 hours is not bad; I have seen faster, but this is very hardware dependent.

There are two methods you can use to adapt the NODDI model: one is redefining the NODDI model, the other is setting the fixed values just prior to model fitting. The advantage of redefining the NODDI model is that it is easier and allows using the adapted model in the MDT GUI and on the command line; the disadvantage is that you can only set the diffusivities to scalar values. The advantage of using the Python API is flexibility, as it allows you to specify an intrinsic diffusivity map, i.e. fix the diffusivities voxel-wise.

Here are some examples of both. First, an example of adapting the NODDI model. Create a new file in the folder ~/.mdt/<version>/components/user/ named my_noddi.py. You can put it in a subfolder, but that is not strictly necessary. Put the following contents in the file:

import mdt
from mdt.component_templates.base import merge_dict

class NODDI_GM(mdt.get_template('composite_models', 'NODDI')):
    fixes = merge_dict({
        'NODDI_IC.d': 1.7e-9,
        'NODDI_EC.d': 1.7e-9,
        'Ball.d': 3.0e-9,
    })

This template only overrides the three intrinsic diffusivities; for the rest it inherits the existing MDT NODDI model.

The other option is by using the Python API. For this I would suggest following the user manual at https://mdt-toolbox.readthedocs.io/en/latest_release/mle_fitting.html#full-example . Then, similar to the point at https://mdt-toolbox.readthedocs.io/en/latest_release/mle_fitting.html#fixing-parameters you can fix diffusivity parameters. As a template, the following should work (fill in your own data):

import mdt

input_data = mdt.load_input_data(
    'b1k_b2k_example_slices_24_38',
    'b1k_b2k.prtcl',
    'b1k_b2k_example_slices_24_38_mask')

mdt.fit_model(
    'NODDI', input_data, 'output',
    initialization_data={
        'fixes': {
            'NODDI_IC.d': 1.7e-9,
            'NODDI_EC.d': 1.7e-9,
            'Ball.d': 3.0e-9
        }
    },
)

With the latter method you can also set the fixed values to voxel-wise brain maps. This could be useful to directly fit white and gray matter voxels in one go. Providing a map will increase GPU memory usage, though.

I hope this gets you started, let me know how it goes.

Best,

Robbert

Edited: replaced My_NODDI with NODDI_GM

ElijahMak commented 5 years ago

Hi Robbert,

Thanks for the help again!

I hope I understood your instructions correctly; it seems as though we can only change the overall diffusivities globally? For my project, I am most interested in modifying the diffusivities specifically for the gray matter while keeping them the same for white matter.

Is there a way to achieve this using the first method?

Best Wishes, Elijah


robbert-harms commented 5 years ago

Hi Elijah,

it seems as though we can only change the overall diffusivities globally?

Not necessarily. If you add a new model then yes, you can only set a global diffusivity, i.e. a single diffusivity for all the voxels. On the other hand, if you use the Python API you can fix the diffusivity to a different value per voxel.

Suppose you are fitting a brain dataset with (x, y, z) dimensions of (100, 100, 50). What you could now do is make a new map with the same dimensions and set the diffusivities to 3.0e-9 for gray matter and 1.7e-9 for white matter. In pseudocode:

diffusivities = numpy.zeros((100, 100, 50))
diffusivities[white_matter_voxels] = 1.7e-9  # boolean mask of WM voxels
diffusivities[gray_matter_voxels] = 3.0e-9   # boolean mask of GM voxels

This map can now be used during the model fitting as such:

mdt.fit_model(
    ...
    initialization_data={
        'fixes': {
            'NODDI_IC.d': diffusivities,
            ...
        }
    }
)

As such, MDT will use the provided map to look up the NODDI IC diffusivity per voxel.

The reason this is not available using the first method (i.e. in the model definition) is that it makes the model definition dependent on the shape of the input data.

Does this answer your question or am I on a completely wrong track?

Best wishes,

Robbert

robbert-harms commented 5 years ago

Closing due to inactivity. If the problem persists, please reopen.

ElijahMak commented 5 years ago

Hi Robbert,

Apologies for the delay in getting back to you on this. I am still unsure how to designate the set of GM voxels.

However, I thought of a rather "crude" workaround. May I simply change the ICs using the first step here and refit NODDI in a separate output folder, e.g. "mdt_greymatter" such that the outputs here will be used strictly for cortical NODDI analyses?

The value of 1.1e-9 is taken from previous publications.

import mdt
from mdt.component_templates.base import merge_dict

class My_NODDI(mdt.get_template('composite_models', 'NODDI')):
    fixes = merge_dict({
        'NODDI_IC.d': 1.1e-9,
        'NODDI_EC.d': 1.7e-9,
        'Ball.d': 3.0e-9,
    })

Finally, how can I check whether mdt-batch-fit . NODDI is definitely taking the updated values into the calculations?

Thanks very much. I have already gotten some exciting data from the WM analyses so I am hopeful about this.

robbert-harms commented 5 years ago

Hi Elijah,

This would definitely work. In this way you could analyze the brain data twice, once with the cortical diffusion settings and once with the WM diffusion settings. Afterwards you could overlay/combine the two. It is a bit more time consuming, but it would work.
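The overlay/combine step mentioned above can be sketched with NumPy. This is a minimal, hypothetical example: the arrays here are synthetic stand-ins, and in practice you would load the parameter maps from the two output folders plus a gray matter segmentation from disk (e.g. with nibabel):

```python
import numpy as np

# Placeholder arrays; in practice load the ODI maps from the NODDI and
# NODDI_GM output folders (e.g. with nibabel) plus a GM segmentation.
wm_fit = np.full((4, 4, 2), 0.5)           # ODI from the WM-settings fit
gm_fit = np.full((4, 4, 2), 0.8)           # ODI from the GM-settings fit
gm_mask = np.zeros((4, 4, 2), dtype=bool)
gm_mask[:2] = True                         # hypothetical GM segmentation

# GM-fitted values inside the mask, WM-fitted values elsewhere
combined = np.where(gm_mask, gm_fit, wm_fit)
```

The combined map can then be saved back to a nifti with the original affine for further analysis.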

I would recommend changing the model name such that it starts with "NODDI". This is important for the internal initialization methods. I realize I gave you that example in the first place, apologies for the confusion. This would work better:

import mdt
from mdt.component_templates.base import merge_dict

class NODDI_GM(mdt.get_template('composite_models', 'NODDI')):
    fixes = merge_dict({
        'NODDI_IC.d': 1.1e-9,
        'NODDI_EC.d': 1.7e-9,
        'Ball.d': 3.0e-9,
    })

This snippet defines the model "NODDI_GM". On the command line you can then use this model as

mdt-batch-fit . NODDI_GM

That is, the model is named NODDI_GM, which for MDT is a new model. You could even run both models at the same time with the command:

mdt-batch-fit . NODDI NODDI_GM

This would run both NODDI and your NODDI_GM model. The output will also be saved in two folders, one folder for the NODDI model and one for the NODDI_GM model.
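As for checking that the fixed values are actually used: each parameter gets its own map in the output folder, and a parameter that was fixed should come out as a map that is constant at the fixed value inside the mask. A minimal sketch of such a sanity check (the arrays here are synthetic stand-ins; in practice load the output map and brain mask from disk, e.g. with nibabel):

```python
import numpy as np

def check_fixed_value(param_map, mask, expected):
    """True if the parameter map is constant at `expected` inside the
    mask, indicating the fix was applied during fitting."""
    return bool(np.allclose(param_map[mask], expected, rtol=1e-6, atol=0))

# Synthetic stand-ins for e.g. the NODDI_IC.d output map and brain mask
d_map = np.full((4, 4, 2), 1.1e-9)
mask = np.ones((4, 4, 2), dtype=bool)
print(check_fixed_value(d_map, mask, 1.1e-9))  # prints True
```

If the check fails, the model most likely did not pick up the fixed values (for instance because the template file was not found in the components folder).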

Let me know if this works for you.

ElijahMak commented 5 years ago

Hi Robbert,

Fantastic. Yes, it works brilliantly. Thanks very much for the help.

As an aside, I have also compared MDT's ODI maps against those derived from the NODDI Matlab Toolbox and they look very similar, except for the CSF / ventricular regions, where the ODI values are noticeably lower (as one might expect) in the MDT maps.

Best Wishes, Elijah


robbert-harms commented 5 years ago

Hi Elijah,

Good to hear!

About your other question. This is a known phenomenon with the NODDI model. In regions of high CSF, the intra-cellular (IC) and extra-cellular (EC) compartments contribute very little to the expected signal. As such, the dispersion of the IC and EC compartments becomes an indeterminable parameter; there is just not enough signal left to properly quantify the dispersion.

The remaining differences in values between the Matlab NODDI and MDT NODDI can then be attributed to small differences in model implementation, optimization routines, etc.

I hope this helps, let me know if it is unclear.