nilearn / nilearn

Machine learning for NeuroImaging in Python
http://nilearn.github.io

Parcellations Nilearn #2310

Closed om35 closed 4 years ago

om35 commented 4 years ago

Hi! I have a dataset of 260 NIfTI images of shape (182, 218, 182) and I want to do a randomized parcellation. My code:

import numpy as np
from sklearn.utils import resample
from nilearn.regions import Parcellations

x = ['az.nii.gz', 'aze.nii.gz', 'er.nii.gz', 'aq.nii.gz', ...]  # 260 image paths
for i in range(100):
    resampled = resample(x)  # reshuffle the image list
    ward = Parcellations(method='ward', n_parcels=1000, random_state=2,
                         smoothing_fwhm=4.0, mask_strategy='epi', n_iter=10,
                         n_jobs=1, verbose=1)  # other arguments left at their defaults
    b = np.array(ward.fit_transform(resampled))

First of all, is my code correct for producing a randomized parcellation? Secondly, when I resample (to change the order) and then apply Ward, I get a different distribution every time. Why? Does changing the order change the Ward output?

For example, for i = 0 the Ward algorithm returns: az.nii.gz: 0.03 0.56 0.78; aze.nii.gz: 0.78 0.12 0.098

and for i = 1: aq.nii.gz: 0.009 0.983 1.24; aze.nii.gz: 0.68 0.19 0.17

So we observe that for aze.nii.gz, after resampling and applying Ward, the distribution of the average signal per parcel changes. Why does the resample step cause this change?
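
One likely explanation, sketched below for clarity: sklearn.utils.resample samples with replacement by default, so each call returns a bootstrap sample in which some images appear several times and others not at all. Ward then fits genuinely different data, not merely a reordered list. A minimal sketch (the filenames are placeholders):

from sklearn.utils import resample

x = ['az.nii.gz', 'aze.nii.gz', 'er.nii.gz', 'aq.nii.gz']

# Default behaviour: sampling WITH replacement (a bootstrap sample),
# so some filenames may appear twice and others may be missing.
print(resample(x, random_state=0))

# For a pure permutation (same files, different order), disable replacement:
print(resample(x, replace=False, random_state=0))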

Thank you very much!

bthirion commented 4 years ago

No, the loop you propose won't create randomized parcellations. For that, you need:

b = np.array(ward.fit_transform(resampled[:10]))

so that the algorithm uses only the first 10 (randomly chosen) images. It could be a number other than 10, maybe 50.

Ward does not depend on the order. What matters is that the selected input is different each time. HTH
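
In other words, each randomized parcellation should be fit on a different random subset of the images. A minimal sketch of one such iteration, assuming x is the full list of NIfTI paths from the original post:

import numpy as np
from sklearn.utils import resample
from nilearn.regions import Parcellations

subset = resample(x, replace=False)[:10]  # a random subset of 10 images
ward = Parcellations(method='ward', n_parcels=1000, smoothing_fwhm=4.0,
                     mask_strategy='epi', verbose=1)
b = np.array(ward.fit_transform(subset))  # average signal per parcel, per image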

om35 commented 4 years ago

Thank you very much! But to get 1000 randomized parcellations, must we apply this 1000 times? So we need a loop? I think that

     "b=np.array((ward.fit_transform(resampled[:76])))" : 

will create just one randomized parcellation, but I need 1000 of them. How can I do this, please?

Thank you very much.


bthirion commented 4 years ago

To get 1000 such parcellations, simply use a loop as in your example:

b = []
for i in range(1000):
    resampled = resample(x)  # reshuffle the image list
    ward = Parcellations(method='ward', n_parcels=1000, random_state=2,
                         smoothing_fwhm=4.0, mask_strategy='epi', n_iter=10,
                         n_jobs=1, verbose=1)
    b.append(np.array(ward.fit_transform(resampled[:10])))  # fit on a random subset

b will be a list of average values per parcel, one entry per parcellation.

Actually, I just realized that Parcellations has a random_state parameter, so you may simply use:

b = []
for i in range(1000):
    ward = Parcellations(method='ward', n_parcels=1000, random_state=i,
                         smoothing_fwhm=4.0, mask_strategy='epi', n_iter=10,
                         n_jobs=1, verbose=1)
    b.append(np.array(ward.fit_transform(x)))  # the seed varies across iterations
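
If the parcellations themselves (rather than the averaged signals) are needed, each fitted model exposes its parcellation as a labels image via the labels_img_ attribute, which can be written to disk. A sketch under the same assumptions as above:

from nilearn.regions import Parcellations

for i in range(1000):
    ward = Parcellations(method='ward', n_parcels=1000, random_state=i,
                         smoothing_fwhm=4.0, mask_strategy='epi')
    ward.fit(x)
    # 3D image assigning each voxel to one of the 1000 parcels
    ward.labels_img_.to_filename('parcellation_%04d.nii.gz' % i)
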
om35 commented 4 years ago

Thank you very much! x = ['aa.nii', 'bb.nii', 'gg.nii', ...]  # we have 262 subjects

So, when I run the first method (with resample) for 100 randomized parcellations (each with 1000 parcels) and 262 subjects, I run out of memory. When I reduce the number of subjects to 50, it takes 6 hours! But I need to run this on all 262 subjects. How can I reduce the computation time? Thank you very much, @bthirion.

om35 commented 4 years ago

[Parcellations] computing rena
Traceback (most recent call last):
  File "", line 121, in <module>
    R_P = np.array(Randomized_Parcellation(100, 1000))
  File "", line 110, in Randomized_Parcellation
    rena_fit = rena.fit_transform(list_resample[:n_subject])
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\parcellations.py", line 466, in fit_transform
    return self.fit(imgs, confounds=confounds).transform(imgs, confounds)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\decomposition\base.py", line 412, in fit
    self._raw_fit(data)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\parcellations.py", line 349, in _raw_fit
    rena, method)
  File "C:\Users\moham\Anaconda3\lib\site-packages\joblib\memory.py", line 355, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\parcellations.py", line 47, in _estimator_fit
    rena.fit(data)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\rena_clustering.py", line 516, in fit
    verbose=self.verbose)
  File "C:\Users\moham\Anaconda3\lib\site-packages\joblib\memory.py", line 355, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\rena_clustering.py", line 379, in recursive_neighbor_agglomeration
    connectivity = weighted_connectivity_graph(X, mask_img)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\rena_clustering.py", line 163, in weighted_connectivity_graph
    edges, weight = _make_edges_and_weights(X, mask_img)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\rena_clustering.py", line 128, in _make_edges_and_weights
    weights_unmasked = _compute_weights(X, mask_img)
  File "C:\Users\moham\Anaconda3\lib\site-packages\nilearn\regions\rena_clustering.py", line 53, in _compute_weights
    weights_deep = np.sum(np.diff(data, axis=2) ** 2, axis=-1).ravel()
  File "C:\Users\moham\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 1273, in diff
    a = op(a[slice1], a[slice2])
MemoryError

bthirion commented 4 years ago

"How can I reduce the computation time?"

Use a parallel machine with many cores, as the main loop is embarrassingly parallel (use joblib.Parallel, https://joblib.readthedocs.io/en/latest/parallel.html).
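
For instance, a sketch of the parallelized loop, under the same assumptions as above (fit_one is a hypothetical helper introduced here, and x is the list of image paths):

import numpy as np
from joblib import Parallel, delayed
from nilearn.regions import Parcellations

def fit_one(i):
    # Fit one randomized parcellation, seeded by the iteration index.
    ward = Parcellations(method='ward', n_parcels=1000, random_state=i,
                         smoothing_fwhm=4.0, mask_strategy='epi')
    return np.array(ward.fit_transform(x))

# Each iteration is independent, so they can run across all available cores.
b = Parallel(n_jobs=-1)(delayed(fit_one)(i) for i in range(100))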

bthirion commented 4 years ago

Also, to avoid memory errors, do not store everything in memory; save the results to disk (an elegant way to achieve that is to use joblib.Memory caching, https://joblib.readthedocs.io/en/latest/memory.html).
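
A sketch of that caching approach, reusing the hypothetical fit_one helper from above; joblib.Memory stores each result on disk and replays it from there, so a crashed run can resume without recomputing:

from joblib import Memory

memory = Memory('./nilearn_cache', verbose=0)  # on-disk cache directory
fit_one_cached = memory.cache(fit_one)

for i in range(100):
    fit_one_cached(i)  # result is persisted to disk, not accumulated in RAM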

thomasbazeille commented 4 years ago

This can be closed; the right place for this type of question would rather be Neurostars, with the nilearn tag.