Clusterwise inference leveraging spatial autocorrelation in neuroimaging
R package to apply CLEAN, CLEAN-R, and CLEAN-V to neuroimaging data.
CLEAN/CLEAN-R/CLEANV supports fast and powerful group-level clusterwise inference for neuroimaging data registered in the cortical surface. The current implementation is computationally efficient. Using a laptop without parallel computing, it takes only a few minutes (mostly less than 10 minutes) to analyze 50 subjects' imaging data across 10,000 vertices. Note that the package also supports parallel computing.
Please visit R, Python, or FreeSurfer for processing cortical surface data. This package is written in R and works well with the ciftiTools or freesurferformats R package.
As of May 20, 2023, The 0.4.0 version of the R package has major changes. Importantly, a series of work that needed to be handled by researchers are simplified, and it now supports Clean
, CleanR
and CleanV
functions that conduct analysis in one step. Documentations are also available once you load the package.
To install the latest development builds directly from GitHub, please run the followings:
if (!require("devtools"))
install.packages("devtools")
devtools::install_github("junjypark/CLEAN")
After installation, the package can be loaded directly in R.
library(CLEAN)
For version 0.4.1 of the package, we support documentation for the key R functions for implementing CLEAN/CLEAN-R/CLEAN-V.
help("Clean")
help("CleanR")
help("CleanV")
How do I extract surface data from HCP?
Please refer ciftiTools. Once you install Connectome Workbench in your computer and obtain data files in cifti format and surface information in the surf.gii
format, then
library(ciftiTools)
ciftiTools.setOption("wb_path", "/Applications/workbench")
xii = read_cifti(fname, surfL, surfR, resamp_res = 10242) #resamp_res: how many vertices to resample
Then you can access the cortical data by the following.
xii$data$cortex_left
xii$data$cortex_right
The corresponding mesh information can be assessed by the following.
xii$surf$cortex_left
xii$surf$cortex_right
Which surface should we use for registration?
We use a spherical surface as a default and use an inflated or midthickness surface for visualization (reference). However, please note that there is no definitive answer for this, and there are recent articles that supported the midthickness surface for registration. Whenever possible, we recommend conducting an exploratory analysis to make sure the parametric kernel agrees with empirical data.
How do I obtain a pairwise distance matrix?
We recommend using geodesic distance for mesh surfaces. You may use Python or C++ to obtain a pairwise geodesic distance matrix.
It requires the extraction of vertices
and faces
matrices. The vertices
matrix contains the 3D coordinate for each vertex, and each row of the faces
matrix contains the indices of three vertices that construct a triangle in the mesh surface.
If you use ciftiTools, it can be accessed directly from the surface GIFTI. If you use freesurferformats, the surface file (e.g. lh.pial
or lh.inflated
from FreeSurfer) can be loaded directly using the read.fs.surface()
function. Once you load vertices
and `faces', the following manipulation is necessary due to the difference between R and Python.
surf$faces = surf$faces-1
Once you loaded vertices
and faces
in Python, the following would provide a pairwise distance matrix.
num_vts = len(vertices)
distmat = np.zeros((num_vts, num_vts))
target_indices = np.array(range(num_vts))
for i in range(num_vts):
dists, best_source = geoalg.geodesicDistances(np.array([i]), None)
distmat[i, :] = dists
The last step is to subset the distance matrices with your interest, for example, by excluding non-cortex vertices (e.g. a medial wall), which should be straightforward.
Is it possible to fit CLEAN/CLEAN-R/CLEAN-V separately for two hemispheres and combine results afterwards?
Yes, it is necessary to set a brain-wise threshold that controls FWER at the nominal level. Please make sure you specify the same seed (seed
) and the same number of resamples (nperm
) for the Clean()
function. Then you may use the combine()
function to get a new threshold.
Clean.fit.lh = Clean(dataLH, distmatLH, nperm = 5000, seed = 1)
Clean.fit.rh = Clean(dataRH, distmatRH, nperm = 5000, seed = 1)
Clean.fit.combine = combine(lst = list(Clean.fit.lh, Clean.fit.rh), alpha = 0.05, collapse = TRUE)
What is the recommended value for max.radius
?
The max.radius
determines the degree of the spatial domain you're borrowing information from. Similar to the optimal smoothing levels in imaging preprocessing, the optimal max.radius
must be determined a priori based on the purpose of research. For example, higher sensitivity obtained from a large value of max.radius
comes with the cost of decreased specificity. Therefore, if increasing specificity is an important issue, then using the default value (20mm) is not recommended and one should consider reducing it. We empirically found values between 10 and 20 useful for interpretation. Please report these values in your article.
How do I visualize the CLEAN/CLEAN-R/CLEAN-V outputs?
For these outputs, Tstat
provides cluster-enhanced test statistics that spatial autocorrelation of imaging data are modeled. Tstat_thresholded
provides thresholded test statistics where non-significant vertices take the value 0. These vectors can be used for visualizations. For example, the freesurferformats package supports writing and saving FreeSurfer imaging data link. The ciftiTools package supports writing and saving in cifti format link.