junjypark / CLEAN

R package for CLEAN, CLEAN-R, and CLEAN-V
MIT License
4 stars 3 forks source link

Q about distance matrix and surfaces #1

Open cooperr11 opened 1 year ago

cooperr11 commented 1 year ago

Hi there, I'm very new to neuroimaging analysis - I apologize for the silly questions. I've conducted a spin test (i.e. Alexander-Bloch and colleagues) on surface-based neuroimaging data. I'm looking to localize the regions where there is the highest correspondence across the brain maps, which is where CLEAN-R looks super useful. I'm not sure how to: a) read in my data (these are in surface-based format, .mgh - does it need to be read in as xifti format using ciftitools?) b) organize the data matrix (what should this look like?) b) calculate geodesic distances (between which points?).

Thank you kindly for your help. Best, Bec

junjypark commented 1 year ago

Hi Bec,

Thanks for your interest in CLEAN-R. Let me clarify your questions.

  1. CLEAN and CLEAN-R do not require any particular format (e.g. xifti) of the data as long as you have vertex-level neuroimaging data obtained for every subject. If you use FreeSurfer, you may extract each subject's imaging modality and make it a matrix format (e.g. X: a V times N matrix for the first modality, Y: a V times N matrix for the second modality, where V is the number of vertices and N is the number of subjects). The row and column indices should match between X and Y.
    • In R, you can load the freesurferformats package to read mgh files using read.fs.mgh() function.
    • An easy way to obtain X and Y, which we used in our study, is to name the file in order, such as
Xnames=c("X_subj1.mgh", "X_subj2.mgh", ..., "X_subj20.mgh")
Ynames=c("Y_subj1.mgh", "Y_subj2.mgh", ..., "Y_subj20.mgh")

Then, you may use the following to obtain the data in the format for CLEAN-R.

library(freesurferformats)
Xmat=do.call("cbind", lapply(Xnames, read.fs.mgh))
Ymat=do.call("cbind", lapply(Ynames, read.fs.mgh))
  1. Computing Geodesic distances is an important step that I believe is not supported well by R software.

    • We currently use PyGeodesic in Python to compute geodesic matrix.
    • The input should be available within FreeSurfer based on the surface you are using. For example, if you use pial surface, the file should be available as pial.lh and pial.rh.
    • After you obtain Geodesic distances, we recommend removing non-cortical vertices to make sure these vertices are removed from both the distance matrix and the data matrices (X and Y). The indices can be found in Freesurfer as cortex.lh and cortex.rh.
    • FYI: you can read these surface files in R by using read.fs.surface() in R (freesurferformats)
  2. Lastly, if you have subject-level covariates including age or sex, these can be adjusted in CLEAN and CLEAN-R.

  3. We comment that the spin test and CLEAN-R are slightly different from their hypotheses. Therefore, we recommend running CLEAN-R to test and localize coupling regions at the same time.

Thanks again for your interest and please let us know anytime if you have any follow-up issues!

Jun

cooperr11 commented 1 year ago

Hi Jun, Thank you so much for your reply! This is extremely helpful. I'm familiar with the freesurferformats package so that's really easy. Thank you also for adding the code. I truly appreciate your helpful and comprehensive answer!

Best, Bec

junjypark commented 1 year ago

Hi Bec,

Glad to hear that! Please let me know if any issues come up with the implementation!

Jun

cooperr11 commented 1 year ago

Hi @junjypark,

I apologise, I have more questions! I notice that the Pygeodesic code shows you how to calculate distances between source and target vertices in a single mesh - but I understand we want to create a pairwise distance matrix. How would I do this using the Pygeodesic package?

With thanks, Bec

junjypark commented 1 year ago

Hello, In Python you may manually generate a V times V matrix and fill in each row. I am also sharing Python codes that might help.

num_vts = len(vts)
dismat = np.zeros((num_vts, num_vts))
target_indices = np.array(range(num_vts))
for i in range(num_vts):
    dists, best_source = geoalg.geodesicDistances(np.array([i]), None)
    dismat[i, :] = dists
cooperr11 commented 1 year ago

Hi @junjypark,

Thanks so much! This helps a lot. Sorry to continually ask more questions! If I may ask one last question: How did you extract the faces and vertices from your surface files? Stackoverflow is not particularly helpful here.

Thank you kindly for your time.

Warm regards, Bec

junjypark commented 1 year ago

Hi Bec,

No worries, this information is directly available from your FS surface (e.g. lh.pial). For example,

library(freesurferformats)
surf = read.fs.surface("Downloads/lh.pial")
surf$vertices
surf$faces

Please note that, when you use the code above, indices for faces in R begin with 1, ..., V (not from 0, if it matters).

Choice of surfaces can be critical, as you already noticed, because a misspecified surface would give you a distorted geodesic distance. I used pial surface because I thought it would be most representative. Please see the paper below that points out a related issue.

Spencer, D., Yue, Y. R., Bolin, D., Ryan, S., & Mejia, A. F. (2022). Spatial Bayesian GLM on the cortical surface produces reliable task activations in individuals and groups. NeuroImage, 249, 118908.