HuthLab / encoding-model-scaling-laws

Repository for the 2023 NeurIPS paper "Scaling laws for language encoding models in fMRI"

A few more questions about reproducing the results in the paper #3

Closed: dyhan316 closed this issue 4 months ago

dyhan316 commented 4 months ago

@RAntonello

Sorry to bother you again. I have a few more questions about reproduction.

  1. It appears that, in order to compute cc_norm and cc_max for a given story, the response data for all of that story's trials is needed (in other words, the average across trials is not sufficient). However, in the Box link only the "wheretheressmoke" test story includes response data for all of its trials. (A sketch of this computation follows the list.)
  2. Are the voxelwise correlation coefficients plotted below (attached image) showing cc_norm for the "wheretheressmoke" test story only?
  3. When calculating the encoding performance (r^2) (Fig. 1c,f), did you (1) average the per-story r^2 over the three test stories, or (2) concatenate the predicted and trial-averaged responses of the three stories and then compute r^2 over the concatenation?
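
For reference, here is a minimal sketch of how cc_max and cc_norm can be computed from repeated presentations of a test story, assuming the Schoppe et al. (2016) normalized-correlation formulation; the function names and array shapes are hypothetical and may not match the repository's exact implementation:

```python
import numpy as np

def cc_max_from_repeats(resp):
    """Noise ceiling per voxel from repeated presentations.

    resp: array of shape (n_repeats, n_TRs, n_voxels), one row per trial.
    """
    n = resp.shape[0]
    mean_resp = resp.mean(axis=0)                     # trial-averaged response
    var_mean = mean_resp.var(axis=0, ddof=1)          # variance of the average
    mean_var = resp.var(axis=1, ddof=1).mean(axis=0)  # mean single-trial variance
    # Signal power (Schoppe et al., 2016): the part of the averaged
    # response's variance not attributable to trial noise.
    signal_power = (n * var_mean - mean_var) / (n - 1)
    return np.sqrt(np.clip(signal_power, 0, None) / var_mean)

def cc_norm(pred, resp):
    """Raw prediction-vs-average correlation divided by cc_max, per voxel."""
    mean_resp = resp.mean(axis=0)
    cc_abs = np.array([np.corrcoef(pred[:, v], mean_resp[:, v])[0, 1]
                       for v in range(pred.shape[1])])
    return cc_abs / cc_max_from_repeats(resp)
```

This is why the per-trial data matter: cc_max depends on the single-trial variances, not just the trial-averaged response.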

Thank you in advance for your response!

Originally posted by @dyhan316 in https://github.com/HuthLab/encoding-model-scaling-laws/issues/1#issuecomment-2165935543

RAntonello commented 4 months ago

Hi,

Yes, as we say in the paper, we use "the test story with 10 repeats" for the noise ceiling analysis. This is because the noise ceiling estimate is more accurate with a larger number of repeats, and we have only 5 repeats of the other test stories. cc_norm is part of the noise ceiling analysis, so it also uses wheretheressmoke only. For encoding performance, I concatenated the predicted responses of the three test stories, excluding the first 40 TRs of each to account for the onset effect we mentioned, and then computed r*|r|. I set use_corr=True when computing the weight vector for these. Hope this helps.
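
For concreteness, a minimal sketch of the concatenate-then-correlate procedure described above, assuming the predictions and trial-averaged responses are stored per story as (n_TRs, n_voxels) arrays; the 40-TR trim follows the comment, while the function name and data layout are hypothetical:

```python
import numpy as np

TRIM = 40  # drop the first 40 TRs of each test story (onset effect)

def encoding_performance(pred_stories, resp_stories, trim=TRIM):
    """Voxelwise signed squared correlation r * |r| over the concatenated,
    trimmed predicted and observed responses of the test stories."""
    pred = np.concatenate([p[trim:] for p in pred_stories], axis=0)
    resp = np.concatenate([y[trim:] for y in resp_stories], axis=0)
    pred_z = (pred - pred.mean(axis=0)) / pred.std(axis=0)
    resp_z = (resp - resp.mean(axis=0)) / resp.std(axis=0)
    r = (pred_z * resp_z).mean(axis=0)  # Pearson r per voxel
    return r * np.abs(r)                # signed r^2
```

The r * |r| form keeps the sign of the correlation while putting the values on an r^2 scale.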

dyhan316 commented 4 months ago

Thank you so much!! :):)