WIP: New r lmatch - Githubissues

atsuch commented 8 years ago

It does more or less what I intended to do, but there is still some issues to be fixed. I will have to clean up the output as well..., but please take a look.

atsuch commented 8 years ago

@bcipolli, I made more modifications to control sign flipping behaviour when comparing components. The codes may not be as pretty as it could be but I'm satisfied with what it's doing. Now I will work on ps.py so that it'll show me something meaningful...

A few things to note:

I used the new feature in the nilearn, math_img to do sign flipping, but it can probably be used elsewhere to simplify codes. Eg. join_bilateral_rois could be replaced, but I left it as is for now.
The current version of main.py never calls plot_component_comparisons with keys=('R','L') so I left it, but as it is I don't think it properly handles the title by combining terms for R and L like it does in the mix_and_match_bilateral_components.

bcipolli commented 8 years ago

Thanks @atsuch ! Hope to get to this tomorrow morning... things are a lil' crazy these days :(

bcipolli commented 8 years ago

OK, I was able to run python main.py this morning. Some comments:

Matches look great! Nice :)
Interestingly, wb gets separate right and left finger components. The same R/L finger components match best to both of those components. Kind of like there's a mirroring thing going on in the wb components.

One potential bug:

I got unmatched_RL_0.png, which is actually labeled wb[17] and RL[17] and is the same image as wb_RL_17.png.

I also did get a component that's almost blank. It's because the scalebar is 10-100x higher than the rest of the components.

atsuch commented 8 years ago

There may be other ways of plotting things..but just experimenting ;P

atsuch commented 8 years ago

I was exploring what may be interesting to plot in ps.py, but maybe I should stop here and reflect on what they all mean and what the next steps should be.

@bcipolli, when you have time please try it out to make sure you get the same results.

1) With my matching method, it seems that certain # components are better than others to match up unilateral to wb or concat to wb. With lowest # components, all the components get matched up but the overall similarity scores are not particularly good. With increasing # of components, more unilateral components get left out, AND the overall scores worsen. For the most part, it looks like R-components have better match with wb than L-components (in particular for 20-40 components).

l1norm_simscores

atsuch commented 8 years ago

I expected the worsening of matching process with increasing # components, partly b/c more wb components split up to become more lateralized at higher dimensions. I tried to show how the balance of R and L hemi contribution for each component changes with this plot. wb_hpi It's not necessarily the most informative way of visualizing it... but you can kind of see that while many wb components are still bilateral at relatively high dimensions, more and more components become R or L- lateralized. Trying to match R- and L- components through these lateralized wb components maybe doesn't make sense.

atsuch commented 8 years ago

2) This is not directly related to the matching process, but just out of curiosity I plotted the sparsity of components to see how it changes with ICA dimensionality. Perhaps naively I had assumed that it would generally decrease with increasing dimensionality, as regions contributing to a specific component get split up and thus the total # of voxels contributing to the component for any given threshold becomes less and less for higher # of decomposition. But maybe the algorithm adjusts the sparsity so that it stays more or less constant across different # of decomposition? It would explain why we get brighter signals for unilateral components... Despite having only half # of voxels compared to wb, similar amount of voxels contributes to each component, so more voxels in unilateral components get higher values...does it make sense?

Maybe counting the # of clusters contributing to a component is more meaningful? sparsity

atsuch commented 8 years ago

But with all these sideline analyses...I almost lost track of what I had intended to do with my matching method.

I don't know if you have other ideas on what we should be showing as the result for this project...but an idea I had was to somehow compare terms associated with R- and its L- counterpart component (whether they were matched through spatial similarity, as in your original method, or through common reference wb component, as in my method). If they are symmetrical in their functional properties, neurosynth terms associated with underlying images should be pretty similar to each other and to those associated with corresponding wb component. If decomposing in each hemispheres independently unmasks any functional laterality, there should be corresponding differences in term matrices.

I don't know what the most informative way of comparing terms would be...? Simply subtracting term vectors to find the terms that are most different between matched R and L components? What do you think, @bcipolli ??

bcipolli commented 8 years ago

@atsuch Thanks for all this work! A few suggestions:

For the sparsity graphs, what do you think about applying a threshold to abs(voxel_vals), and testing some intermediate values? It's interesting to see the trend of the line flattening, for me.
For the Asymmetry index plot, it would be cool to see if the variance value changes with # of components. I can see that the spread changes (visually), but with more samples, not sure that the variance computation would be different.
For the intended analysis, I think the neurosynth term analysis could be nice for interpreting differences, but it wouldn't be my comparison metric. Those terms are very noisy, and using them as the metric introduces great uncertainty that's hard to quantify. It seems much safer to quantify differences directly in component voxels, then try to interpret the differences in terms of neurosynth terms.

I will pull this and try it out shortly.

atsuch commented 8 years ago

@bcipolli, Thanks for your quick input!

For the sparsity graphs, what do you think about applying a threshold to abs(voxel_vals), and testing some intermediate values? It's interesting to see the trend of the line flattening, for me.

Sure, so you want to combine positive and negative values? Which intermediate vals are you thinking of? It should be easy to implement.

For the Asymmetry index plot, it would be cool to see if the variance value changes with # of components. I can see that the spread changes (visually), but with more samples, not sure that the variance computation would be different.

Yeah, this graph is the least I'm happy with... Maybe I can just overlay SD on top and that will show what you want to see. This too, I can quickly add in.

For the intended analysis, I think the neurosynth term analysis could be nice for interpreting differences, but it wouldn't be my comparison metric. Those terms are very noisy, and using them as the metric introduces great uncertainty that's hard to quantify. It seems much safer to quantify differences directly in component voxels, then try to interpret the differences in terms of neurosynth terms.

You're right. That seems more straightforward. So to implement it, I guess I can try to add a new plotting function that takes a pair of images, get the difference map, and then fetch neurosynth terms of the difference map and return both the image and terms?

Should I keep working on the same branch to implement them?

bcipolli commented 8 years ago

We should merge this stuff. I'll review the code today. The top two suggestions would be great. I don't care about the intermediate values, just to show the trend from sloped to flat.

So to implement it, I guess I can try to add a new plotting function that takes a pair of images, get the difference map, and then fetch neurosynth terms of the difference map and return both the image and terms?

To compare, I would do correlation, dot product, or other similarity metric. In fact, we could remap individual maps into the component space, and do similarity that way. Actually, I like that.

I would do the terms just as you're doing them now, to compare. I'm not sure if doing neurosynth on subtraction is meaningful; you might try hitting someone from there up about it!

atsuch commented 8 years ago

@bcipolli, I'm not sure if this is what you had in mind, but I made modifications for both sparsity and hpi graphs.

1) Sparsity

I added abs_005, which is abs(voxel_vals)>0.005, so in effect it's the sum of pos_005 and neg_005. I kept pos_005/neg_005 b/c a large neg_005 for a component suggests the presence of anti-correlated network and it might be interesting to look at them.
I got rid of the higher threshold value of 0.001 for clarity.
I also added individual points for abs_005 to see the actual distribution for each component.
For the spread, I was using sem but stdev is more appropriate here, so the shaded regions show mean+/-std.

sparsity

atsuch commented 8 years ago

The lowest points in R and wb components are suspects...they might be the weird component I see, but I have to check. I might have to adjust the thresholding in the plot of individual components so that I can see which components have very different sparsity from other components.

It's interesting to see that unilateral components seem to have more voxels in the negative sign than wb components. I'll check if this is some artefact of our code... I know we apply sign flipping for wb components so that the side containing max value gets the pos sign. I don't remember if we do the same for unilateral components.

atsuch commented 8 years ago

2) HPI

I added the variance (stdev) as the shaded region.
I also separated pos and neg so that plots of variance is easier to see.

wb_hpi

So it looks like stdev stays more or less constant across different # components. Still, the absolute num of more lateralized components increase with increasing dimensionality. I'm not sure how much that's contributing to the decrease in the overall similarity score when doing R/L match through wb components.

If you just look at the size of each dot, which reflect the sparsity of the component, it's interesting to see that at least some components have significant contribution from the negative signs for all the dimensionality, which I think suggests that they contain anti-correlated networks (i.e. when positive areas are activated, negative areas are always deactivated or vice versa).

atsuch commented 8 years ago

Hey @bcipolli,

I'm still thinking about the intended analysis...and still waiting to see your codes to really understand what you mean to do.

But I still think it's important to incorporate the term difference of the matched R, L, and wb components for each component comparison, in the final result. You mentioned a statistical map of R/L differences but to me spatial difference is hard to interpret without the terms associated with that difference. An example of what I'm thinking is Fig3A of this paper; http://www.pnas.org/content/113/7/1907.full

I know the terms are messy, but how about filtering only the psychological terms (eg. by keeping only those in the cognitive atlas...although I just found a google discussion here https://groups.google.com/forum/#!topic/neurosynthlist/5JZePlIn5RY and Vanessa or Tal doesn't seem to approve of what I was thinking of doing...). Or even without filtering, just showing 5-6 terms shared between all the R,L, wb components, and 5-6 most different terms for R vs L, R vs wb, etc and showing it as a radar chart like in the paper cited above?

atsuch commented 8 years ago

Hi @bcipolli ,

I didn't finish incorporating my new term comparison plot...since I started to modify main.py to incorporate both your original matching method through direct r to l comparisons and my method of combining r and l through wb match...

But here is my test plot for the term comparison, just so you can see... it takes top n and bottom m terms (which you can specify) for each component image (wb, r and l) and plots standardized term vals. If there is no overlap of the top n and bottom m terms for each image, there will be (n+m)x3 terms in the plot. In this example there are substantial overlap. I specified n=10, m=3 so max number of terms would be 39, but only 20 terms are shown in the plot.

wb 0 _r 1 _l 4 _term_comparisons

atsuch commented 8 years ago

Corresponding image is here (R and L components are concatenated).

wb_rl_0

atsuch commented 8 years ago

It's qualitative comparison, of course... but I think it's interesting. In this example, terms of R component seem to align better with WB, and they both differ from L eg. in having high scores for "task" and "parietal". This should be reflected in dissimilarity scores, but we can see the similarity/dissimilarity of each components in terms of terms associated with each component.

atsuch commented 8 years ago

Finally...! Sorry for taking so long, but I finished reorganizing main.py. I will need to update ps.py as well, but basically this should allow all the different comparison methods (your original one, comparing R and L directly and combining versus my new matching through wb, and also forcing versus not forcing one-to-one matching).

I mainly worked with "wb", and I still need to check "rl" and "lr" works as intended. I think compare_components might need some modification in order for "lr" match method to work properly...

bcipolli commented 8 years ago

Thanks @atsuch ! Sorry to keep you up so late, thanks for leading this effort!

bcipolli commented 8 years ago

Was able to run this yesterday, and reviewed the code this morning. At this point, the changes are so substantial and I'm so far away from the previous code that it's hard to assess the changes.

My suggestion is to merge things as-is and start writing up the capabilities (current plots and methods), then going from there!

What do you think, @atsuch ?

atsuch commented 8 years ago

Sounds great!

I'm fixing up ps.py so that it plots mean similarity scores as the function of n components for different matching methods.

atsuch commented 8 years ago

@bcipolli, I pushed it so that you can see it, but I couldn't get it to work... My Python crashes before it completes every matching methods.

But anyhow I'll try to say a few things based on the results we already had from my matching method and send it to you.

bcipolli commented 8 years ago

I will try this out now... hoping we can merge and iterate from a simpler branch soon! :)

bcipolli commented 8 years ago

OK, this ran for me (> 5 hours). There should be some simple ways to cache results (e.g. dump relevant csv files, or pickle the results) so that we can tweak the plots without having to run this for so long.

Let's merge here, and I'll try to work on that soon!

guruucsd / lateralized-components

WIP: New r lmatch #29