robbert-harms / MDT

Microstructure Diffusion Toolbox
GNU Lesser General Public License v3.0

Running test file #54

Closed stillill closed 1 year ago

stillill commented 1 year ago

Hello,

I'm trying to run the test_example_data.py file in the tests folder and was wondering how long this test should take. I've run it a few times, never to completion: each time, MDT sits at the point shown below for a few hours, and then I kill the process because I can't tell whether it is hung or just slow.

Singularity> python3 test_example_data.py
[2023-10-19 13:12:55,642] [INFO] [mdt] [fit_model] - Preparing BallStick_r1 with the cascaded initializations.
[2023-10-19 13:12:55,644] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Using MDT version 1.2.6
[2023-10-19 13:12:55,644] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Preparing for model BallStick_r1
[2023-10-19 13:12:55,660] [INFO] [mdt.models.composite] [_prepare_input_data] - No volume options to apply, using all 103 volumes.
[2023-10-19 13:12:55,660] [INFO] [mdt.utils] [estimate_noise_std] - Trying to estimate a noise std.
[2023-10-19 13:12:55,662] [INFO] [mdt.utils] [estimate_noise_std] - Estimated global noise std 19.613178253173828.
[2023-10-19 13:12:55,662] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - Fitting BallStick_r1 model
[2023-10-19 13:12:55,662] [INFO] [mdt.lib.processing.model_fitting] [_model_fit_logging] - The 4 parameters we will fit are: ['S0.s0', 'w_stick0.w', 'Stick0.theta', 'Stick0.phi']
[2023-10-19 13:12:55,662] [INFO] [mdt.lib.processing.model_fitting] [fit_composite_model] - Saving temporary results in /tmp/tmpyxp3i3bfmdt_example_data_test/mdt_example_data/b1k_b2k/output/b1k_b2k_example_slices_24_38_mask/BallStick_r1/tmp_results.
/usr/lib/python3/dist-packages/mot/lib/utils.py:148: DeprecationWarning: pyopencl.array.vec is deprecated. Please use pyopencl.cltypes for OpenCL vector and scalar types
  return getattr(cl_array.vec, vector_type)
[2023-10-19 13:12:55,785] [INFO] [mdt.lib.processing.processing_strategies] [_process_chunk] - Computations are at 0.00%, processing next 8865 voxels (8865 voxels in total, 0 processed). Time spent: 0:00:00:00, time left: ? (d:h:m:s).
[2023-10-19 13:12:55,785] [INFO] [mdt.lib.processing.model_fitting] [_process] - Starting optimization
[2023-10-19 13:12:55,785] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using MOT version 0.11.3
[2023-10-19 13:12:55,785] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use a single precision float type for the calculations.
[2023-10-19 13:12:55,786] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using device 'GPU - NVIDIA A100-PCIE-40GB (NVIDIA CUDA)'.
[2023-10-19 13:12:55,786] [INFO] [mdt.lib.processing.model_fitting] [_process] - Using compile flags: ('-cl-denorms-are-zero', '-cl-mad-enable', '-cl-no-signed-zeros')
[2023-10-19 13:12:55,786] [INFO] [mdt.lib.processing.model_fitting] [_process] - We will use the optimizer Powell with optimizer settings {'patience': 2}
/usr/lib/python3/dist-packages/pytools/py_codegen.py:146: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp

Thanks!

robbert-harms commented 1 year ago

Hi Stillill,

The step where it says "We will use the optimizer..." is where MDT compiles the kernels that will be executed on the GPU, and this is the step that sometimes hangs. Unfortunately, there is not much I can do about it: some drivers compile the code fine, while other drivers (or driver versions) hang at kernel compilation. You could try a different driver version and see if that helps.
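When comparing driver versions, it can help to put a hard time limit on each attempt rather than waiting for hours. The following is only a minimal sketch using the Python standard library; the helper name `run_with_timeout` and the timeout values are illustrative assumptions, not part of MDT.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run a command and give up after timeout_s seconds.

    Hypothetical helper (not part of MDT). Returns (finished, returncode);
    finished is False and returncode is None when the time limit was hit.
    """
    try:
        # subprocess.run kills the child process if the timeout expires.
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, None
    return True, completed.returncode
```

For example, `run_with_timeout([sys.executable, "test_example_data.py"], timeout_s=1800)` would stop the test after 30 minutes. The 30-minute limit is an arbitrary choice, since the thread does not state how long a normal run takes.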

As a long-term solution, I would like to partly rewrite the kernel compilation to use SPIR-V as an intermediate compilation step. Unfortunately, my time is limited.

Best,

Robbert

stillill commented 1 year ago

Hi Robbert,

Ok, thanks for the guidance. I'll try installing different OpenCL drivers into the container and see if I have any luck. The CPU version of MDT works fine for us, but it would be nice to get it working on the GPU as well. Thanks again!
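When swapping OpenCL drivers inside a container, it may be useful to record which platforms, devices, and driver versions are actually visible before each attempt. The following is a rough sketch that assumes pyopencl is installed (the log shows MDT's dependency MOT using it); the function name `list_opencl_devices` is a hypothetical helper and not part of MDT or MOT.

```python
def list_opencl_devices():
    """Return one dict per visible OpenCL device, or [] if none can be listed.

    Hypothetical helper: it only reads standard pyopencl platform and
    device attributes and does not compile or run any kernel.
    """
    try:
        import pyopencl as cl
    except ImportError:
        # pyopencl is not installed in this environment.
        return []

    devices = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                devices.append({
                    "platform": platform.name,
                    "platform_version": platform.version,
                    "device": device.name,
                    "device_version": device.version,
                    "driver_version": device.driver_version,
                })
    except cl.Error:
        # No usable OpenCL platform or driver was found.
        return []
    return devices


for entry in list_opencl_devices():
    print(entry)
```

Saving this output alongside each test run makes it easier to tell which driver version was in use when kernel compilation hung or succeeded.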