cvnlab / GLMsingle

A toolbox for accurate single-trial estimates in fMRI time-series data

Discussion #97


smazurchuk commented 12 months ago

Hi All! Thanks for this wonderful toolbox and documentation!

I've known about GLMdenoise for a while and have periodically thought about using it on our data, and seeing this toolbox as a drop-in replacement, I decided to try it out!

I thought I'd share the result for one participant. Feel free to close this if it is out of scope; this isn't so much an issue as a suggestion for a future idea.

We currently use AFNI for post-processing, and specifically use 3dREMLfit for the regression. Our experimental paradigm has 300 words, with each word presented 6 times to each participant (i.e. 1800 trials total), and we then do searchlight RSA on the surface. I used GLMsingle's trial-averaged outputs and was curious whether the RSA results would improve. Interestingly, while the pattern shifted a bit, the results were quite a bit worse. I am not sure why, and would be happy to share code/data to replicate the figures!

Outputs

REMLfit results: [searchlight RSA map, figure not reproduced]

GLMsingle results: [searchlight RSA map, figure not reproduced]

Thoughts

As you can see, the color bars are quite different (the unit is Pearson correlation between a semantic model RDM and the neural RDM). In the past, we have generally had pretty good results with REMLfit, and I was wondering whether you or other groups have found large gains using autoregressive models? Is it possible to have penalized autoregressive models?

Thanks again for all the work on this toolbox!

kendrickkay commented 12 months ago

Hi Stephen, Interesting observations. If I understand correctly, you are testing two different methods to obtain the 300 (trial-averaged) beta maps: one based on GLMsingle and the other based on 3dREMLfit. In both cases, you get the betas, construct 300x300 RDM matrices (at a given point in the brain), and then correlate those with some model RDM.
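For concreteness, here is a minimal sketch of that comparison step, assuming `betas` is a (300 conditions x vertices) array of trial-averaged betas within one searchlight and `model_rdm` is the 300x300 semantic model RDM. The helper name and the choice of correlation distance / Pearson comparison are illustrative assumptions, not something prescribed by GLMsingle or the original analysis.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import pearsonr

def rdm_correlation(betas, model_rdm):
    """Correlate a neural RDM (built from betas) with a model RDM.

    betas     : (n_conditions, n_vertices) trial-averaged betas in one searchlight
    model_rdm : (n_conditions, n_conditions) semantic model RDM
    """
    # neural RDM: 1 - Pearson correlation between condition patterns
    neural_rdm = squareform(pdist(betas, metric='correlation'))
    # compare only the upper triangles (each condition pair counted once)
    iu = np.triu_indices_from(neural_rdm, k=1)
    r, _ = pearsonr(neural_rdm[iu], model_rdm[iu])
    return r
```

The key point is only that the 3dREMLfit betas and the GLMsingle betas feed into an identical downstream computation, so any difference in the maps traces back to the betas themselves.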

That the patterns differ substantially across the methods is interesting and unexpected. The magnitude difference (-.15 to .15 vs. -.03 to .03) is also very surprising.

I am not entirely sure what is causing the difference (as I am not very familiar with 3dREMLfit). In theory, many possibilities could be responsible (e.g. differences in the HRF, differences in extra "nuisance regressors", differences in how the noise is modeled, etc.). However, one thing that is apparent is that the maps from the AFNI method are smoother. This leads me to suspect that some amount of spatial smoothing is being enforced. Could that be the case? If you were to spatially smooth the time-series data before analysis with GLMsingle, you might see the magnitude differences generally lessen (or disappear)?
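As a rough sketch of what that pre-smoothing could look like for volumetric data (surface data would instead need geodesic smoothing, e.g. AFNI's SurfSmooth or Workbench's metric smoothing), using nilearn; the file names and FWHM value are purely illustrative:

```python
from nilearn.image import smooth_img

# hypothetical run files; smooth each run's time series before passing it to GLMsingle
run_files = ['run-01_bold.nii.gz', 'run-02_bold.nii.gz']
smoothed_runs = [smooth_img(f, fwhm=4.0) for f in run_files]  # 4 mm FWHM, illustrative

# smoothed_runs[i].get_fdata() would then be reshaped into the usual GLMsingle input format
```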

Kendrick

smazurchuk commented 12 months ago

Hi Kendrick! Thanks for the quick and thoughtful reply! :)

If I understand correctly ...

Yep! You got it; I'm just replicating the analysis with a new method. I find the results surprising as well. They are similar enough that I don't think I made a glaring mistake when applying the new method, but different enough to give me pause.

I don't know what is causing the difference either. With REMLfit I use the 'SPMG1' HRF and include 12 motion regressors plus CSF and WM regressors (14 nuisance regressors total), as well as censoring volumes with FD > 0.9. The results look smoother for AFNI, but I can't imagine a smoothing difference being introduced, since in both cases the regressions are run directly on the surface time series using the same files. For RSA, I normally don't do any smoothing, although there is a caveat that fMRIPrep does some implicit smoothing when projecting the volume data to the surface.
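For reference, a minimal sketch of the FD > 0.9 censoring step, assuming fMRIPrep-style confounds with a framewise_displacement column; the file name is hypothetical and the threshold follows the description above:

```python
import pandas as pd

# hypothetical fMRIPrep confounds file for one run
confounds = pd.read_csv('sub-01_task-words_run-01_desc-confounds_timeseries.tsv', sep='\t')

fd = confounds['framewise_displacement'].fillna(0)
keep = (fd <= 0.9).to_numpy()   # True = keep this volume, False = censor (FD > 0.9)

print(f'Censoring {int((~keep).sum())} of {len(keep)} volumes')
```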

One factor that crossed my mind is that I also include an RT regressor when I regress all 6 presentations of each word at once. For REMLfit, I have also generated single-trial betas (which don't have an RT regressor). I did RSA with the average of those single-trial betas; this reduced the magnitude of the RSA correlation, but the pattern is almost identical (see the figure below).

[Searchlight RSA map from averaged single-trial REMLfit betas, figure not reproduced]
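For reference, a sketch of that trial-averaging step, assuming `single_trial_betas` has shape (1800 trials x vertices) and `trial_word_idx` gives the word index (0-299) of each trial; the names are illustrative:

```python
import numpy as np

def average_betas_per_word(single_trial_betas, trial_word_idx, n_words=300):
    """Average the 6 single-trial betas for each word into one beta map per word.

    single_trial_betas : (n_trials, n_vertices), e.g. 1800 x n_vertices
    trial_word_idx     : (n_trials,) integer word label (0..n_words-1) per trial
    """
    n_vertices = single_trial_betas.shape[1]
    avg = np.zeros((n_words, n_vertices))
    for w in range(n_words):
        avg[w] = single_trial_betas[trial_word_idx == w].mean(axis=0)
    return avg
```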

One other potential factor that crosses my mind is that, because the experimental paradigm is a rapid event-related design, the trials are spaced an average of 4 seconds apart (with a variable ISI of 1 to 3 seconds). As a result, I used 4 as the stimdur value. However, the SPMG1 HRF has a peak around 6 seconds. The GLMsingle documentation says: "For example, 3.5 means that you expect the neural activity from a given trial to last for 3.5 s." I am not sure how stimdur is being used, but I would expect the neural activity for a given trial to last around 6 s, even though the trials are spaced about 4 seconds apart.
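For context, this is roughly where stimdur enters in the Python version of GLMsingle; a minimal sketch that assumes `design` (list of time x 300 design matrices, one per run) and `data` (list of vertices x time arrays, one per run) are already prepared, and the TR value is a placeholder. The exact option dictionary and fit() arguments should be checked against the GLMsingle documentation.

```python
from glmsingle.glmsingle import GLM_single

stimdur = 4   # assumed duration (s) of trial-evoked neural activity
tr = 2.0      # placeholder; use the actual repetition time of the acquisition

# design and data are assumed to be prepared lists (one entry per run)
glmsingle_obj = GLM_single({})  # default options
results = glmsingle_obj.fit(design, data, stimdur, tr,
                            outputdir='GLMsingle_outputs')
```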

Thanks again for the help/reply!

kendrickkay commented 12 months ago

Hmm... interesting. A few more thoughts:

As far as I can tell, I am a little baffled as to why the major differences exist.

That doesn't mean it has to stay a mystery. If you want to dig deeper, it should always be possible: one can dig on the GLMsingle side, on the AFNI side, or both.

If you are using the MATLAB version of GLMsingle, there are a lot of diagnostic figure outputs that can help shed light on what may be going on. It could certainly be the case that something is "wrong" on GLMsingle's side.

As for reverse engineering what is going on on the AFNI side, I am not sure how easy/hard that is.

In theory, you could systematically remove/vary each of the options/knobs on both of the methods to try to get a sense of what could possibly be responsible. But it depends on whether you have the time/motivation to do that.

I would be happy to take a deeper look and discuss, at least on the GLMsingle side of things, if you want to Zoom (https://kendrickkay.youcanbook.me).

Kendrick

kendrickkay commented 12 months ago

Oh, also, an addendum. I do think that, all else being equal and if the analysis is as you described, a higher correlation between the "pure model RDM" and the "brain-derived RDM" is a sign of a better analysis. One relatively easy thing to do (if you are curious about exploring/troubleshooting) is to split your dataset into two parts, extract results from each part, and then do a simple reliability analysis.
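A minimal sketch of that split-half check, assuming two beta arrays from independent halves of the data (e.g. presentations 1-3 vs. 4-6 of each word) and the same correlation-distance RDMs as above; the names and splitting scheme are illustrative:

```python
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr

def split_half_rdm_reliability(betas_half1, betas_half2):
    """Correlate RDMs computed from two independent halves of the data.

    betas_half1, betas_half2 : (n_conditions, n_vertices) trial-averaged betas
                               from each half (e.g. presentations 1-3 vs. 4-6)
    """
    rdm1 = pdist(betas_half1, metric='correlation')  # condensed upper triangle
    rdm2 = pdist(betas_half2, metric='correlation')
    r, _ = pearsonr(rdm1, rdm2)
    return r
```

A higher split-half RDM correlation for one pipeline would indicate more reliable condition patterns, independent of any particular model RDM.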

(Also, I assume that the figures you were showing reflect a single subject; group-average results are a whole other ballgame with their own nuances...)