Sparsity and HPAI - Githubissues

atsuch commented 8 years ago

We were originally looking at sparsity, defined as the number of voxels above an arbitrary threshold, to compare whole-brain (WB) and single-hemisphere (hemi) components. If there is no asymmetric activity, contrast in WB and hemi components should be similar. In other words, if we compare the R hemisphere of WB components against R-only components, the number of voxels above a given threshold should be similar between the two. Increased contrast (i.e. a higher voxel count) in hemi components suggests 'masking' of lateralized activity in the WB analysis.

There were two issues with this. 1) We have to normalize the component maps to make their scales comparable. This was suggested by Tomoki and we implemented it. Good. But 2) (also suggested by Tomoki) voxel count isn't a conventional way to measure sparsity, and it's affected by the choice of threshold. He suggested using the L1 norm instead.
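To make the two candidate measures concrete, here is a minimal sketch. The function names are hypothetical, and the unit-L2 normalization is an assumption, since the thread does not say which normalization was implemented. The voxel count here uses absolute values; counting positive and negative voxels separately works the same way.

```python
import numpy as np

def normalize(values):
    """Scale a map to unit L2 norm (assumed; the actual normalization is not stated)."""
    norm = np.linalg.norm(values)
    return values / norm if norm > 0 else values

def voxel_count_sparsity(values, threshold):
    """Number of voxels whose absolute value exceeds the threshold."""
    return int(np.sum(np.abs(values) > threshold))

def l1_sparsity(values):
    """L1 norm: sum of absolute voxel values, no threshold needed."""
    return float(np.sum(np.abs(values)))

# Toy "component map" standing in for a flattened image.
component = normalize(np.array([0.0, 1.0, -2.0, 0.5, 3.0, -1.0]))

count_low = voxel_count_sparsity(component, threshold=0.2)
count_high = voxel_count_sparsity(component, threshold=0.5)
l1 = l1_sparsity(component)
print(count_low, count_high, l1)
```

The voxel count changes with the threshold while the L1 value does not, which is the trade-off raised in point 2.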

So I was implementing this. But after actually looking at the distribution of values in the normalized ICA component images, I started to think that L1 sparsity and voxel-count sparsity give different information, and that it matters which one I use, since I use sparsity to look at two other aspects of our data.

One was the difference in sparsity between the positive and negative sides: although signs in ICA components are meaningless, some components have activation mostly on one side, while others have activation on both sides more equally. In terms of the brain, this can signify an anti-correlated network, where activation in one region is coupled with deactivation in another. I think this is an interesting aspect of our analysis using NeuroVault, where we have whole-brain maps as our input, in contrast with similar activation meta-analyses based on peak activation coordinates, as in BrainMap.

The second was what I called HPI, but maybe we should name it the hemispheric participation asymmetry index (HPAI), since it's really an asymmetry index. It's the difference in activation patterns between the two hemispheres (for the WB analysis), and I was calculating it as (R-L)/(R+L), where R and L are the voxel counts in each hemisphere. Because I was interested in the HPAI of anti-correlated networks for those components with mixed positive and negative activations, I was calculating this separately for pos, neg, and abs sparsity.
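A minimal sketch of the (R-L)/(R+L) index with voxel counts, computed separately for pos, neg, and abs. The function name `hpai` and the assumption that the right and left hemisphere values are already split into two arrays are illustrative, not taken from the repository.

```python
import numpy as np

def hpai(right, left, threshold, sign="abs"):
    """(R - L) / (R + L), where R and L are supra-threshold voxel counts.

    sign: "pos" counts values > threshold, "neg" counts values < -threshold,
          "abs" counts |values| > threshold.
    Returns NaN when no voxel passes the threshold in either hemisphere.
    """
    if sign == "pos":
        r, l = np.sum(right > threshold), np.sum(left > threshold)
    elif sign == "neg":
        r, l = np.sum(right < -threshold), np.sum(left < -threshold)
    elif sign == "abs":
        r, l = np.sum(np.abs(right) > threshold), np.sum(np.abs(left) > threshold)
    else:
        raise ValueError("sign must be 'pos', 'neg', or 'abs'")
    total = int(r) + int(l)
    return float("nan") if total == 0 else (int(r) - int(l)) / total

# Toy hemisphere values (hypothetical data).
right = np.array([2.0, 3.0, -2.0, 0.1])
left = np.array([2.0, -3.0, -2.0, 0.1])

for sign in ("pos", "neg", "abs"):
    print(sign, hpai(right, left, threshold=1.0, sign=sign))
```

In this toy case the abs index is zero even though the pos and neg indices point in opposite directions, which is why the three are kept separate for components with mixed signs.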

To show you how the two sparsity methods give a different picture, especially for detecting components with anti-correlated networks, here is the value distribution of one typical WB component image (n_components=20).

[Figure: sparsity_hist, histogram of voxel values for one WB component image]

You can see that, as in a typical stat map used as an input image, it has a long tail on the positive side. If I use either a 90th or 95th percentile threshold based on the absolute values of all 20 ICA map images (this histogram shows just one of the 20), the voxel count is clearly larger on the positive side than on the negative side. But the L1 sparsity for positive and negative values is almost identical, and this was true for all the other component images. The voxel count, on the other hand, showed varying degrees of asymmetry (i.e. we would see more warm-color voxels than cool-color voxels when plotting this component).
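One possible explanation for the near-identical positive and negative L1 values, offered as an assumption rather than something confirmed in this thread: if each map is mean-centered, the sum of its positive values must equal the sum of the magnitudes of its negative values, whatever the shape of the distribution. The sketch below uses synthetic skewed data, not a real component, to show the effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic long-tailed values standing in for one component map.
values = rng.exponential(scale=1.0, size=10_000)
values -= values.mean()  # assumption: the map is mean-centered

# L1 mass on each side of zero.
pos_l1 = values[values > 0].sum()
neg_l1 = -values[values < 0].sum()

# Voxel counts on each side, thresholded at the 95th percentile of |values|.
threshold = np.percentile(np.abs(values), 95)
pos_count = int(np.sum(values > threshold))
neg_count = int(np.sum(values < -threshold))

print(pos_l1, neg_l1)        # equal up to floating-point error
print(pos_count, neg_count)  # strongly asymmetric
```

If the real maps are centered in this way, per-sign L1 cannot separate one-sided components from mixed ones, whereas the thresholded count can.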

I think that for the purpose of detecting components with anti-correlated networks, the voxel count is better, since L1 adds up the values of the large number of voxels around zero, the stuff we don't really care about. I'm inclined to use the voxel count for HPAI as well.

As I'm writing this, I realize that this is identical to the problems associated with Laterality Index (LI) calculation in task-based fMRI studies. Traditionally, people used the asymmetry in voxel counts between the two hemispheres at a given statistical threshold. But others pointed out that it was affected by the choice of threshold, and suggested more threshold-free methods... I have to find the paper describing the threshold-free calculation of LI, but for the time being I'm thinking of sticking to our original method of counting voxels with an arbitrary threshold... To compare sparsity/HPAI across different values of n_components, we should pick the same threshold for all of them. Looking at this distribution, 0.000005 seems OK, but maybe I can use the 90th or 95th percentile of all the absolute values across all the component images (components = 5, 10, 15, 20... ).
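A minimal sketch of the shared-threshold idea: pool the absolute values from every decomposition and take one percentile, so the same cutoff is applied regardless of n_components. The function name and the toy arrays are hypothetical; real inputs would be the flattened component images from each run.

```python
import numpy as np

def shared_threshold(component_sets, percentile=95):
    """Percentile of |value| pooled over all components of all decompositions.

    component_sets: iterable of arrays, one per n_components setting,
                    each shaped (n_components, n_voxels).
    """
    pooled = np.concatenate([np.abs(np.asarray(c)).ravel() for c in component_sets])
    return float(np.percentile(pooled, percentile))

# Toy stand-ins for two decompositions with different n_components.
sets = [
    np.array([[1.0, -2.0], [3.0, -4.0]]),
    np.array([[5.0, -6.0, 7.0, -8.0, 9.0, -10.0]]),
]

thr90 = shared_threshold(sets, percentile=90)
thr50 = shared_threshold(sets, percentile=50)
print(thr90, thr50)
```

Pooling before taking the percentile means decompositions with more components contribute more values to the cutoff; whether that weighting is desirable is a design choice left open here.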

bcipolli commented 8 years ago

Agree! Not sure that the 90th/95th percentile is the right threshold (given the need for multiple-comparisons correction). Perhaps KL divergence (i.e. a comparison between distributions) or some similar measure.
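For reference, a rough sketch of what a histogram-based KL divergence between two sets of voxel values could look like. Which two distributions to compare (e.g. R vs L hemisphere, or WB vs hemi components) is not specified in this thread, so the inputs, the bin count, and the smoothing constant below are all assumptions.

```python
import numpy as np

def histogram_kl(a, b, bins=50, eps=1e-12):
    """Approximate KL(P_a || P_b) from histograms over a shared value range."""
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    # Convert counts to probabilities; eps avoids log(0) and division by zero.
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
symmetric = rng.normal(size=5_000)            # hypothetical symmetric values
skewed = rng.exponential(size=5_000) - 1.0    # hypothetical long-tailed values

kl_same = histogram_kl(symmetric, symmetric)
kl_diff = histogram_kl(symmetric, skewed)
print(kl_same, kl_diff)
```

Note that KL divergence is not symmetric in its arguments, and the value depends on the binning, so it trades the threshold choice for a bin-count choice.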

bcipolli commented 8 years ago

Fixed by @atsuch

guruucsd / lateralized-components

Sparsity and HPAI #44