pnlbwh / ukftractography

Problem with NODDI option on MACOS #127

Open magictodd opened 4 years ago

magictodd commented 4 years ago

We were able to get the Slicer version of UKF Tractography with the --NODDI option to work well on Linux, but with the same data the --NODDI option crashes in the MACOS version. The 2-tensor model works fine on MACOS, but the NODDI option does not. Can you please look into it? We used three data files: one for the DWI input, one for the label or seed input, and one for the mask input. I can send you the three sets of nhdr files if that will help you debug.

tashrifbillah commented 4 years ago

Hi @magictodd, are you using UKFTractography as part of Slicer or as a command-line tool?

magictodd commented 4 years ago

It crashes both ways on MACOS but NOT on Linux. Here is my command line:

/home/toddr/Dropbox/gulf_apr2020/Slicer-4.10.2-linux-amd64/Slicer --launch UKFTractography --dwiFile ./dtiprep/test_dwi.nhdr --labels 1 --seedsFile ./test11.nhdr --maskFile ./testmask.nhdr --noddi --recordKappa --recordVic --recordViso --Qkappa 0.01 --Qvic 0.004 --tracts ./UKF_cereb2.vtk

tashrifbillah commented 4 years ago

Can you provide the traceback on crash?

magictodd commented 4 years ago

Here is the output during the MACOS execution of the command line version

Using the 2T simple model. Setting the default parameters accordingly:

"*": set by user

"-": default setting

- stoppingFA: 0.15

* seedingThreshold: 0.18

- Qm: 0.001

* Qkappa: 0.01

- Rs: 0.02

* stepLength: 0.3

* recordLength: 0.9

* Qvic = Qviso: 0.004

* stoppingThreshold: 0.1

- seedsPerVoxel: 1

Found 8 cores on your system.

Running tractography with 8 thread(s).

_nrrdEncodingRaw_read: WARNING: finished reading raw data, but file not at EOF

Number of non-zero gradients: 32

Number of zero gradients: 1

Permuting the axis order to: 3 0 1 2

Resizing the data to: 32 192 192 48

Computing the baseline image

Dividing the signal by baseline image

Converting the world coordinate system to RAS

Data normalization finished!

Using NODDI 2-Fiber model.

Branching disabled

Using constrained filter

A 9.87189e+146  8.07828e+147  8.41855e+147 -1.08662e+147 -8.15854e+145 -3.15072e+147 -2.34428e+148   4.0059e+148 -5.78505e+146 -3.27748e+145   3.0741e+147

 8.07828e+147  2.69459e+140  -1.3473e+140 -2.94721e+139 -5.26287e+137  1.68412e+139 -9.43107e+140 -2.69459e+140 -3.15772e+139 -3.94716e+137             0

 8.41855e+147  -1.3473e+140   7.0872e+295 -9.14779e+294 -6.86831e+293 -2.65245e+295 -1.97354e+296  3.37238e+296 -4.87017e+294 -2.75916e+293  2.58795e+295

-1.08662e+147 -2.94721e+139 -9.14779e+294  1.18075e+294  8.86526e+292  3.42365e+294  2.54735e+295  -4.3529e+295  6.28617e+293  3.56139e+292 -3.34039e+294

-8.15854e+145 -5.26287e+137 -6.86831e+293  8.86526e+292  6.65618e+291  2.57053e+293  1.91259e+294 -3.26823e+294  4.71975e+292  2.67395e+291 -2.50802e+293

-3.15072e+147  1.68412e+139 -2.65245e+295  3.42365e+294  2.57053e+293  9.92705e+294  7.38617e+295 -1.26215e+296  1.82271e+294  1.03264e+293 -9.68564e+294

-2.34428e+148 -9.43107e+140 -1.97354e+296  2.54735e+295  1.91259e+294  7.38617e+295  5.49564e+296 -9.39093e+296  1.35618e+295  7.68332e+293 -7.20655e+295

  4.0059e+148 -2.69459e+140  3.37238e+296  -4.3529e+295 -3.26823e+294 -1.26215e+296 -9.39093e+296  1.60472e+297 -2.31743e+295 -1.31292e+294  1.23145e+296

-5.78505e+146 -3.15772e+139 -4.87017e+294  6.28617e+293  4.71975e+292  1.82271e+294  1.35618e+295 -2.31743e+295  3.34668e+293  1.89604e+292 -1.77838e+294

-3.27748e+145 -3.94716e+137 -2.75916e+293  3.56139e+292  2.67395e+291  1.03264e+293  7.68332e+293 -1.31292e+294  1.89604e+292  1.07419e+291 -1.00753e+293

libc++abi.dylib: terminating with uncaught exception of type std::logic_error: Error in cholesky decomposition, sum: -5.44562e+280

  3.0741e+147             0  2.58795e+295 -3.34039e+294 -2.50802e+293 -9.68564e+294 -7.20655e+295  1.23145e+296 -1.77838e+294 -1.00753e+293  9.45009e+294

slicer_vtk_heather_withmake_nhdr_no_m.txt: line 45: 96476 Abort trap: 6           /Applications/Slicer.app/Contents/Extensions-28257/UKFTractography/lib/Slicer-4.10/cli-modules/UKFTractography --dwiFile ./dtiprep/test_dwi.nhdr --labels 1 --seedsFile ./test11.nhdr --maskFile ./dtiprep/test_dwi.nhdr --noddi --recordKappa --recordVic --recordViso --Qkappa 0.01 --Qvic 0.004 --tracts ./UKF_cereb2.vtk

tashrifbillah commented 4 years ago

I have enclosed @magictodd's traceback in a markdown code block (```vim ```) so that it is more legible.

tashrifbillah commented 4 years ago

Hi @rmukh,

Regarding the following line in the above traceback:

libc++abi.dylib: terminating with uncaught exception of type std::logic_error: Error in cholesky decomposition

Was there a recent change in the Cholesky decomposition?

rmukh commented 4 years ago

Hi @tashrifbillah,

I don't think so. It is not related to the Eigen library, since the error is raised here: https://github.com/pnlbwh/ukftractography/blob/2c144a32716ba1620ede0a4276f73ac931a00c42/ukf/QuadProg%2B%2B_Eigen.cc#L273, which is a custom function.

One possible solution to try is to replace the following line, https://github.com/pnlbwh/ukftractography/blob/60b29e8a702feaf0c1d9e0fb7cd6a1e62506c7c1/ukf/QuadProg%2B%2B_Eigen.cc#L500, with Eigen's built-in Cholesky decomposition, something like:

Eigen::LLT<ukfStateSquareMatrix, Eigen::Lower> chol(G.cols());
chol.compute(G);

Moreover, if you have an opportunity to test my improved version of QuadProg++_Eigen.cc file, I can submit a pull request. Unfortunately, I am currently short on time to run all the experiments myself, so I would need your help with that, if you agree. Please let me know.
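To illustrate the kind of check that produces the message in the traceback, here is a minimal textbook Cholesky factorization using only the standard library. It is a sketch, not the code in QuadProg++_Eigen.cc: the names `Matrix` and `cholesky_lower` are hypothetical, and it only assumes (from the wording of the error) that the routine throws `std::logic_error` when a diagonal pivot is not positive.

```cpp
#include <cmath>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical dense matrix type used only for this illustration.
using Matrix = std::vector<std::vector<double>>;

// Textbook Cholesky: returns the lower-triangular L with A = L * L^T.
// Throws std::logic_error when a diagonal pivot is not strictly positive,
// which happens when A is not (numerically) symmetric positive definite.
Matrix cholesky_lower(const Matrix& A) {
  const std::size_t n = A.size();
  Matrix L(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j <= i; ++j) {
      double sum = A[i][j];
      for (std::size_t k = 0; k < j; ++k) {
        sum -= L[i][k] * L[j][k];
      }
      if (i == j) {
        if (!(sum > 0.0)) {  // written this way so that NaN is rejected too
          std::ostringstream os;
          os << "Error in cholesky decomposition, sum: " << sum;
          throw std::logic_error(os.str());
        }
        L[i][i] = std::sqrt(sum);
      } else {
        L[i][j] = sum / L[j][j];
      }
    }
  }
  return L;
}
```

For the positive definite matrix {{4, 2}, {2, 3}} this returns the rows (2, 0) and (1, sqrt(2)). For the indefinite matrix {{1, 2}, {2, 1}} the second pivot is 1 - 4 = -3, so the function throws. If nothing catches the exception, the process is terminated, which is consistent with the `Abort trap: 6` line in the log.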

tashrifbillah commented 4 years ago

Moreover, if you have an opportunity to test my improved version of QuadProg++_Eigen.cc file

Would you like me to do that on a Mac? I can do that. By the way, do you think your improved version may fix this issue?

rmukh commented 4 years ago

On Mac, yes, for sure. I have only had a chance to test it on Ubuntu, so it would be a good idea to check it on other platforms, including Windows, as well.

The improved version might fix the issue, but I am not 100% sure. To me, the error looks like values exploding during a matrix multiplication or division somewhere, which causes this stability issue. Since Eigen's methods are designed internally to account for such instabilities, the new version I propose may help.
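One way to test the "exploding values" hypothesis is to scan the matrix before it is factorized. The helper below is a hypothetical diagnostic, not part of UKFTractography: `Matrix`, `MatrixHealth` and `inspect` are illustrative names. It reports whether every entry is finite, the largest absolute entry, and whether squaring that entry would overflow a `double`.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical dense matrix type used only for this illustration.
using Matrix = std::vector<std::vector<double>>;

// Summary of the numerical condition of a matrix's entries.
struct MatrixHealth {
  bool all_finite;        // false if any entry is NaN or +/-inf
  double max_abs;         // largest absolute value among the finite entries
  bool square_overflows;  // true if max_abs * max_abs exceeds the double range
};

MatrixHealth inspect(const Matrix& A) {
  MatrixHealth h{true, 0.0, false};
  for (const auto& row : A) {
    for (double v : row) {
      if (!std::isfinite(v)) {
        h.all_finite = false;
        continue;
      }
      h.max_abs = std::max(h.max_abs, std::fabs(v));
    }
  }
  // Entries above sqrt(DBL_MAX), roughly 1.3e154, overflow when squared.
  h.square_overflows =
      h.max_abs > std::sqrt(std::numeric_limits<double>::max());
  return h;
}
```

Applied to entries of the size printed in the traceback (for example 7.0872e+295), `inspect` reports finite values but `square_overflows == true`, so the entries are already far outside a sensible range before the decomposition starts.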

rmukh commented 4 years ago

Hi @tashrifbillah,

I have submitted the improved QP as a PR: https://github.com/pnlbwh/ukftractography/pull/128

magictodd commented 4 years ago

Okay, great, thank you!! Can you please tell me how to install the new software into my Slicer?

magictodd commented 4 years ago

Sorry, if you put it up to the nightly 4.11 then it still crashes. We tested it.

rmukh commented 4 years ago

Thank you for your tests. Could you please share the error, or is it the same?

magictodd commented 4 years ago

Here is the error message

libc++abi.dylib: terminating with uncaught exception of type std::logic_error: Error in cholesky decomposition, sum: -1.48077e+125

tashrifbillah commented 4 years ago

@magictodd, just so you know, replies in the GitHub web interface are easier to read than replies through email.

rmukh commented 4 years ago

The build you made probably does not use the file from my pull request, since the new version no longer contains the line with the "Error in cholesky decomposition" message. Please try building with the file from the PR: https://github.com/pnlbwh/ukftractography/pull/128