rpomponio / neuroHarmonize

Harmonization tools for multi-site neuroimaging analysis. Implemented as a Python package. Harmonization of MRI, sMRI, dMRI, fMRI variables with support for NIFTI images. Complements the work in NeuroImage by Pomponio et al. (2019).
https://pypi.org/project/neuroHarmonize/
MIT License

Singular matrix harmonizationLearn #39

Closed hajerkr closed 8 months ago

hajerkr commented 10 months ago

Hello! I am getting the error below and have not been able to figure out why. Has anyone run into the same thing? I doubt it has to do with my NIFTIs, because they seem to be OK. The problem seems to be in the design matrix, so could it be the covariates? Thanks for any insight!

Traceback (most recent call last):
  File "/rds/general/user/hk3618/home/neuroHaromonize.py", line 89, in <module>
    my_model, nifti_array_adj = nh.harmonizationLearn(nifti_array, train_data)
  File "/rds/general/user/hk3618/home/anaconda3/lib/python3.8/site-packages/neuroHarmonize/harmonizationLearn.py", line 118, in harmonizationLearn
    s_data, stand_mean, var_pooled, B_hat, grand_mean = standardizeAcrossFeatures(
  File "/rds/general/user/hk3618/home/anaconda3/lib/python3.8/site-packages/neuroHarmonize/harmonizationLearn.py", line 181, in standardizeAcrossFeatures
    B_hat = np.dot(np.dot(np.linalg.inv(np.dot(design.T, design)), design.T), X.T)
  File "<__array_function__ internals>", line 180, in inv
  File "/rds/general/user/hk3618/home/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 545, in inv
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
  File "/rds/general/user/hk3618/home/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 88, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix

hajerkr commented 10 months ago

@yuhancui @rpomponio

rpomponio commented 10 months ago

My first suggestion would be to check for any missing data. Your NIFTI images and covariates must all be complete; in other words, no np.nan values are allowed.

Secondly, I would check the covariates for linear dependencies. The singularity might arise from a covariate matrix that is rank deficient.
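
Both checks suggested above can be run on the covariates before calling harmonizationLearn. The following is a minimal sketch using only NumPy. The helper name diagnose_design, the column names and the toy values are hypothetical, and the matrix built here only illustrates the idea; it is not necessarily the exact design matrix that neuroHarmonize constructs internally.

```python
import numpy as np


def diagnose_design(design, names):
    """Report missing values and rank deficiency in a numeric design matrix.

    `design` is (n_subjects, n_columns); `names` labels the columns.
    Hypothetical helper, not part of neuroHarmonize.
    """
    design = np.asarray(design, dtype=float)
    nan_mask = np.isnan(design)
    # Columns that contain at least one missing value.
    nan_columns = [n for n, bad in zip(names, nan_mask.any(axis=0)) if bad]
    # Rank is computed on complete rows only, since NaNs make it meaningless.
    complete = design[~nan_mask.any(axis=1)]
    rank = int(np.linalg.matrix_rank(complete))
    return {
        "nan_columns": nan_columns,
        "n_complete_rows": complete.shape[0],
        "rank": rank,
        "full_rank": rank == design.shape[1],
    }


# Toy data: six subjects from two sites.
site_a = np.array([1, 1, 0, 0, 1, 0], dtype=float)
site_b = 1.0 - site_a
age = np.array([61.0, 70.0, 55.0, 66.0, 72.0, 58.0])
names = ["site_a", "site_b", "age", "intercept"]

# Case 1: site_a + site_b equals the intercept column, so the
# columns are linearly dependent and the matrix is rank deficient.
dependent = np.column_stack([site_a, site_b, age, np.ones(6)])
report_dependent = diagnose_design(dependent, names)

# Case 2: one subject has no recorded age.
age_missing = age.copy()
age_missing[2] = np.nan
with_nan = np.column_stack([site_a, site_b, age_missing, np.ones(6)])
report_nan = diagnose_design(with_nan, names)

print(report_dependent)
print(report_nan)
```

If nan_columns is non-empty, the affected subjects need to be dropped or have the value filled in before harmonization. If full_rank is False on complete data, one of the covariates can be written as a combination of the others and should be removed or recoded.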

hajerkr commented 9 months ago

That sorted it out, thank you very much @rpomponio. It was the missing Age information for some subjects. One other question I am not entirely sure about: would it make sense to train my model on a subset of the NIFTI data (many sites, many scanners) and then apply it to all the data afterwards? I would still need the subset I used for training to be harmonized for later analyses. Thanks for any helpful insight.

rpomponio commented 8 months ago

You may train on a subset for various purposes. For example, if you have healthy controls mixed with heterogeneous diagnoses, you may train the harmonization on controls and apply the model to all samples. Closing the issue.