It takes too long to fit model.

komabso commented 2 years ago

Hello.

I'm trying to use COMMIT2 to filter false positives in tck files based on MRtrix3.

The problem is that it took too long to fit model. (About 6h 21m each COMMIT 1 & 2)

The cause of the problem I expect is

Are 5 million fibers in tck too many?
Thread I used was only one.

The code I used is below.

import os
import commit
from commit import trk2dictionary

os.system( 'tck2connectome -force -nthreads 0 -assignment_radial_search 2 \
        -out_assignments test2_fibers_assignment.txt \
        dhollander_tractogram_SD_STREAM_100408.tck \
        Schaefer400_7Net_2mm_100408.nii.gz test2_commit2_connectome.csv' )

if not os.path.isdir( 'bundles' ) :
    os.mkdir( 'bundles' )

os.system( 'connectome2tck -force -nthreads 0 -exclusive -files per_edge \
    -keep_self dhollander_tractogram_SD_STREAM_100408.tck test2_fibers_assignment.txt \
        bundles/bundle_' )

C = np.loadtxt( 'test2_commit2_connectome.csv', delimiter=',' ) # NB: change 'delimiter' to suits your needs
CMD = 'tckedit -force -nthreads 0'
for i in range(C.shape[0]):
    for j in range(i,C.shape[0]):
        if C[i,j] > 0 :
            CMD += ' bundles/bundle_%d-%d.tck' %(i+1,j+1)
os.system( CMD + ' dhollander_tractogram_SD_STREAM_100408.tck' )

trk2dictionary.run(
    filename_tractogram = 'dhollander_tractogram_SD_STREAM_100408.tck',
    filename_mask       = 'wm_axis3_bin_125.nii.gz',
    fiber_shift         = 0.5
)

import amico
amico.util.fsl2scheme( 'bvals', 'bvecs', 'DWI.scheme' )

mit = commit.Evaluation('/local_raid1/03_user/sunghyoung/01_project/04_COMMIT', 'test2') # commit.Evaluation(STUDY_PATH, SUBJECT_PATH )

mit.load_data( 'data.nii.gz', 'DWI.scheme' )

mit.set_model( 'StickZeppelinBall' )
d_par       = 1.7E-3             # Parallel diffusivity [mm^2/s]
d_perps_zep = []                 # Perpendicular diffusivity(s) [mm^2/s]
d_isos      = [ 1.7E-3, 3.0E-3 ] # Isotropic diffusivity(s) [mm^2/s]
mit.model.set( d_par, d_perps_zep, d_isos )

mit.generate_kernels( regenerate=True )
mit.load_kernels()

mit.load_dictionary( 'COMMIT' ) #  should match a mask with your DWI data as the same geometry

mit.set_threads(1) # mit.set_threads()
mit.build_operator()

mit.fit( tol_fun=1e-3, max_iter=1000, verbose=False )
mit.save_results( path_suffix="_COMMIT1" )

x_nnls, _, _ = mit.get_coeffs()

C = np.loadtxt( 'test2_commit2_connectome.csv', delimiter=',' )
C = np.triu( C ) # be sure to get only the upper-triangular part of the matrix
group_size = C[C>0].astype(np.int32)

tmp = np.insert( np.cumsum(group_size), 0 , 0)
group_idx = np.array( [np.arange(tmp[i],tmp[i+1]) for i in range(len(tmp)-1)] )

group_w = np.empty_like( group_size, dtype=np.float64 )
for k in range(group_size.size) :
    group_w[k] = np.sqrt(group_size[k]) / ( np.linalg.norm(x_nnls[group_idx[k]]) + 1e-12 )

reg_lambda = 5e-4 # change to suit your needs

prior_on_bundles = commit.solvers.init_regularisation(
    mit,
    regnorms    = [commit.solvers.group_sparsity, commit.solvers.non_negative, commit.solvers.non_negative],
    structureIC = group_idx,
    weightsIC   = group_w,
    lambdas     = [reg_lambda, 0.0, 0.0]
)

mit.fit( tol_fun=1e-3, max_iter=1000, regularisation=prior_on_bundles, verbose=False )
mit.save_results( path_suffix="_COMMIT2" )

In conclusion, I have two questions.

Are there ways to reduce fitting time?
Because I don't want to compare COMMIT1 and COMMIT2, how can I only produce results for COMMIT2? Please comment my code.

I'm sorry if they are stupid questions. However, if you help me, it will be very helpful!

Best, Joseph.

MarioOcampo commented 2 years ago

Dear @komabso

I hope you are doing well, thank you for your questions. The fitting time of COMMIT depends on the number of voxels, diffusion directions, and streamlines. The software allows parallel executions, which is recommended to reduce the fitting time.

For the configuration you are using, 1 thread and 5M streamlines, 6h is a possible time. Do you have restriction about the number of threads you can use? In experiments with HCP data with 5M streamlines, the fitting time we obtained for 1 thread was 09h 31m, and using 6 threads was 02h 18m.

Answering your questions.

To reduce the fitting time, if possible, use more than one thread.
In the example in the wiki, we are not comparing COMMIT1 and COMMIT2, we use COMMIT1 to estimate the group_w (weightsIC) used by COMMIT2.

Let us know if you can use more threads, the improvement you obtain, and if you have more questions.

Best regards Mario

komabso commented 2 years ago

Thank you for your kind answer, @MarioOcampo. 👍

I don't restrict the number of our lab server's threads I can use. However, when I ran your commit2 example with mit.set_threads() , the terminal said it couldn't assign threads (I'm sorry I can't exactly remember the warning comment) and then it shut down. Even four threads had the same warning, so I used only one thread. I guess it's because other researchers in our lab use many threads. On your advice, I need to increase the number of threads when other researchers work less. 🤣

I'm sorry to take your valuable time, but I have 3 more questions! 😄

Is it right to use streamline_weights.txt which is one of the results of COMMIT2 like the code below? tck2connectome dhollander_tractogram_SD_STREAM_100408.tck Schaefer400_7Net_2mm_100408.nii.gz connectome_Schaefer400_COMMIT2_100408.txt -tck_weights_in streamline_weights.txt
In the example in the wiki, we are not comparing COMMIT1 and COMMIT2, we use COMMIT1 to estimate the group_w (weightsIC) used by COMMIT2.

-> Does it mean COMMIT1 is a necessary process to make results of COMMIT2?

Compared to the input data, fit_RMSE.nii is little bit strange. As far as I can see, compared to the frontal lobe and the right side of the figure, the occipital lobe and the left side of the figure are shaded . Is it correct result?

Untitled Untitled (1) Untitled (2)

Thanks to your great work, I can study what I want. I appreciate your kind answer again.

Best, Joseph

MarioOcampo commented 2 years ago

Hello @komabso

Do you know if the server is using a job scheduling system?, maybe you have to specify the number of cores to use to execute the script.

About your questions

streamline_weights.txt contains the contribution for each streamline estimated by COMMIT, that is a correct input for tck_weights_in. The right use depends on the question you are addressing.
COMMIT2 requires weightsIC, we use COMMIT1 to estimate group_w, but you can use information from different sources, e.g., from atlases or population studies, COMMIT2 allows you to promote and penalize bundles based on prior knowledge.
That looks like an artifact, but I need more information to understand, like dictionary_tdi.nii.gz, fit_NRMSE.nii.gz, and estimated compartments.

Best regards Mario

komabso commented 2 years ago

Thanks, @MarioOcampo

I understand your answers about 1. and 2.

There are more information. The name of the images is in the upper left corner. 화면 캡처 2022-05-27 231926 화면 캡처 2022-05-27 232008 화면 캡처 2022-05-27 232024 화면 캡처 2022-05-27 232049 화면 캡처 2022-05-27 232101

Thank you for your answer!

Best, Joseph.

MarioOcampo commented 2 years ago

Hello @komabso I think the artifact is there in fit_NRMSE but on the contrary fit_RMSE, i.e., higher value in fit_NRMSE where low value in fit_RMSE. I do not know why now, did you look the rest of DWI, maybe there is something wrong in data.nii. Let me know if you see something wrong in it or don't.

Best regards Mario

komabso commented 2 years ago

Thank you for your answer, and I'm sorry to make you spend so much time, @MarioOcampo

You said "maybe there is something wrong in data.nii." However, the data was taken directly from the HCP database. 😢

There may be something wrong in the data as you saied, so I want to check it. Could you tell me how to check it?

First, I'll post pictures and information

제목 없음

************************************************
Image name:          "data.nii.gz"
************************************************
  Dimensions:        145 x 174 x 145 x 288
  Voxel size:        1.25 x 1.25 x 1.25 x 1
  Data strides:      [ -1 2 3 4 ]
  Format:            NIfTI-1.1 (GZip compressed)
  Data type:         32 bit float (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:                    1           0           0         -90
                               -0           1           0        -126
                               -0           0           1         -72
  comments:          FSL5.0

Or could there be a problem in the process of making the white matter mask? I leave the command that I used below.

mrconvert -coord 3 2 5tt.mif 5tt_wm.nii.gz
mrmath 5tt_wm.nii.gz mean -axis 3 wm_axis3.nii.gz
flirt -in wm_axis3.nii.gz -ref wm_axis3.nii.gz -applyisoxfm 1.25 -nosearch -out wm_axis3_125.nii.gz

Best, Joseph

daducci commented 2 years ago

Dear Joseph,

indeed the stripes-artifact that you observe is a bit weird. Do you observe it as well after the first run of COMMIT (I mean, prior of re-running COMMIT2)?

Also, I'd be very much interested in understanding a bit more about the error that you observe concerning the number of threads; could you reproduce the error?

komabso commented 2 years ago

Thanks, @daducci !

If what you're saying is the artifact in the results of COMMIT1, yes. I observed it, too. Here are the pictures.

Or, What you're saying is the results of tutorial by yours data and wiki? If so, no. There are no artifacts from what I see.

And here is the error concerning the number of threads.

Best, Joseph.

MarioOcampo commented 2 years ago

Dear Joseph,

Thank you for sharing the error. This is not related to the server, as I though. There is a problem with the ISO compartment. You can see that in the output file compartment_IC.nii.gz, it is not compatible with the anatomy, but I do not know why this is happening. Can I ask you to share the mrinfo for the mask wm_axis3_125.nii.gz?

Another comment, when load the data, use the parameter b0_thr=10 (mit.load_data( 'data.nii.gz', 'DWI.scheme',b0_thr=10 ))

Best regards Mario

komabso commented 2 years ago

Thanks, @MarioOcampo !

Fortunately (?), if other researcher in our lab use fewer threads, fitting time decreased by 3 hours using only one thread.

Here is the result of mrinfo for the mask wm_axis3_125.nii.gz

$ mrinfo wm_axis3_bin_125.nii.gz 
************************************************
Image name:          "wm_axis3_bin_125.nii.gz"
************************************************
  Dimensions:        119 x 148 x 129
  Voxel size:        1.25 x 1.25 x 1.25
  Data strides:      [ 1 2 3 ]
  Format:            NIfTI-1.1 (GZip compressed)
  Data type:         32 bit float (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:                    1           0           0       -71.5
                                0           1           0      -108.5
                                0           0           1         -72
  comments:          6.0.3:b862cdd5

And, when I use b0_thr=10 on your advice, it looks better than the former.

Best, Joseph.

MarioOcampo commented 2 years ago

Hello Joseph, the mask is not in the same space as the dwi data.

Could you try to use a mask registered to diffusion space? This misalignment could cause problems assigning the threads for the ISO compartment, so maybe after that, you can use more threads.

Let us know.

Best regards Mario

komabso commented 2 years ago

Thanks, @MarioOcampo

You're right. After registering the mask's space to diffusion space, I use threads as much as possible! The fitting time took less than an hour each.

However, there still seems to be the same problem with the output nii files.

When I create a structural connectivity matrix with the output assignment, is the result reliable?

Best, Joseph.

MarioOcampo commented 2 years ago

Hello Joseph,

good to know that the problem with the threads is solved.

About the other issue, if you still have maps that look like they contain artifacts there is another problem. By changing the mask you should get ISO maps in which it is possible to recognize the WM, which was not the case before. Could you check this?

Best Mario

komabso commented 2 years ago

Thanks, @MarioOcampo !

I was wondering if other subjects would get the same result, so I made a framework and ran COMMIT2 again. But there's a different result.

Are the pictures below the proper results? From what I see at least there are no the previous inconsistencies and artifacts.

Best, Joseph.

MarioOcampo commented 2 years ago

Hello Joseph, this images look fine to me, as you said there are not artifacts.

I'm wondering what happened with the previous subject, did you try to redo all the process from the creation of the dictionary with the mask in diffusion space? Could you try to remove all the files generated with COMMIT and redo the process?

Best Mario

komabso commented 2 years ago

Hello @MarioOcampo .

Yes, I did. When I redo the process, the results are same (without artifacts).

I think I mistook something when I made a white matter mask, however, I don't know what I did wrong. 😞

Anyway thanks to your team (especially @MarioOcampo), I resolved all problems that I went through. I really appreciated you.

Best regards, Joseph.

daducci commented 2 years ago

Hi Joseph,

Would it be possible to send us (via dropbox or other cloud services) the folder with the data/files that create problems with the fit? This way we could debug efficiently. Perhaps there is an uncaught exception or similar, and it'd very helpful to spot it. Thanks, Ale

komabso commented 2 years ago

Hello, @daducci .

Here is the data file with artifacts. If you have something wrong in the data, please let me know.

https://drive.google.com/drive/folders/1cdxxTiiwAcaa7BjJT1aTrZWxtsFBCjPX?usp=sharing

Best, Joseph.

daducci commented 1 year ago

Hi @komabso and @MarioOcampo ,

Any update on this issue? Can we close it?

nightwnvol commented 7 months ago

@daducci, maybe it would be useful to convert this issue into a discussion.

daducci / COMMIT

It takes too long to fit model. #112