dyhan316 closed this issue 5 months ago
Hi @dyhan316, we test using the average over all test sessions to increase SNR in our test data. We use three test stories, wheretheressmoke, onapproachtopluto and fromboyhoodtofatherhood, averaging all test repeats that are available for each of these (10, 5, and 5). We mention this in section 2.2 of the paper, where we say "These test responses were averaged across repetitions." We also trim an additional 40 TRs of test data from the beginning of each story to account for the onset artifact, which we mention in the appendix of the paper. Please let me know if you continue to have trouble reproducing anything and I will be more than happy to provide additional guidance.
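The averaging and trimming described above can be sketched as follows (the array shapes and variable names here are illustrative assumptions, not the repo's actual code):

```python
import numpy as np

# Hypothetical repeats for one test story: (n_repeats, n_TRs, n_voxels).
# wheretheressmoke has 10 repeats; the other two test stories have 5.
rng = np.random.default_rng(0)
repeats = rng.standard_normal((10, 300, 1000))

# Average across repetitions to increase SNR, then trim the first
# 40 TRs to remove the onset artifact.
avg_response = repeats.mean(axis=0)[40:]
print(avg_response.shape)  # (260, 1000)
```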
@RAntonello Thank you for the response! I have just a few more questions if you don't mind!
It appears that two out of the three test stories are unavailable (link where I got the response data). (Only the wheretheressmoke response seems to be available.)
Also, I just want to make sure: was the r^2 value obtained by taking the correlation between the prediction and the mean test response, rather than the mean of the per-trial correlations (i.e., correlation with the mean, not mean of correlations across trials)?
Thank you again!!
You should be able to find the pre-averaged test stories in the full_responses folder of the Box. The separate trials are only used for the noise ceiling analysis in Figure 3. You are correct that it was obtained as the correlation between the prediction and mean test response and not the mean correlation across trials.
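The distinction can be illustrated with synthetic data (all names and shapes below are hypothetical, not from the repo): correlating the prediction with the trial-averaged response typically yields higher values than averaging per-trial correlations, since averaging suppresses trial noise.

```python
import numpy as np

def voxelwise_corr(a, b):
    # Pearson correlation per voxel (computed over the columns)
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / (
        np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0))

rng = np.random.default_rng(0)
pred = rng.standard_normal((200, 50))               # model predictions
trials = rng.standard_normal((5, 200, 50)) + pred   # noisy repeated trials

# As described above: correlate prediction with the trial-averaged response
corr_with_mean = voxelwise_corr(pred, trials.mean(axis=0))

# NOT used: average of per-trial correlations (lower due to trial noise)
mean_of_corrs = np.mean([voxelwise_corr(pred, t) for t in trials], axis=0)
```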
@RAntonello Thank you again for your quick response :)
I see! That clears things up a bit. I have one additional question: which stories exactly were used for training and testing?
I may be missing something, but the paper states in section 2.2 that "... each subject listened to roughly 95 different stories, ...".
However, when I load the full_responses file of, say, subject S3, the number of stories is 101.
Could you specify which stories out of the 101 were used for training? (I first thought that the remaining 98 stories (101 - 3) would be used, but it seems that this is not the case?)
For reference, these are the 101 stories loaded from the full_responses folder!
Thank you!
dict_keys(['itsabox', 'odetostepfather', 'inamoment', 'afearstrippedbare', 'findingmyownrescuer', 'hangtime', 'ifthishaircouldtalk', 'goingthelibertyway', 'golfclubbing', 'thetriangleshirtwaistconnection', 'igrewupinthewestborobaptistchurch', 'tetris', 'becomingindian', 'canplanetearthfeedtenbillionpeoplepart1', 'thetiniestbouquet', 'swimmingwithastronauts', 'lifereimagined', 'forgettingfear', 'stumblinginthedark', 'backsideofthestorm', 'food', 'theclosetthatateeverything', 'escapingfromadirediagnosis', 'notontheusualtour', 'exorcism', 'adventuresinsayingyes', 'thefreedomridersandme', 'cocoonoflove', 'waitingtogo', 'thepostmanalwayscalls', 'googlingstrangersandkentuckybluegrass', 'mayorofthefreaks', 'learninghumanityfromdogs', 'shoppinginchina', 'souls', 'cautioneating', 'comingofageondeathrow', 'breakingupintheageofgoogle', 'gpsformylostidentity', 'marryamanwholoveshismother', 'eyespy', 'treasureisland', 'thesurprisingthingilearnedsailingsoloaroundtheworld', 'theadvancedbeginner', 'goldiethegoldfish', 'life', 'thumbsup', 'seedpotatoesofleningrad', 'theshower', 'adollshouse', 'canplanetearthfeedtenbillionpeoplepart2', 'sloth', 'howtodraw', 'quietfire', 'metsmagic', 'penpal', 'thecurse', 'canadageeseandddp', 'thatthingonmyarm', 'buck', 'thesecrettomarriage', 'wildwomenanddancingqueens', 'againstthewind', 'indianapolis', 'alternateithicatom', 'bluehope', 'kiksuya', 'afatherscover', 'haveyoumethimyet', 'firetestforlove', 'catfishingstrangerstofindmyself', 'christmas1940', 'tildeath', 'lifeanddeathontheoregontrail', 'vixenandtheussr', 'undertheinfluence', 'beneaththemushroomcloud', 'jugglingandjesus', 'superheroesjustforeachother', 'sweetaspie', 'naked', 'singlewomanseekingmanwich', 'avatar', 'whenmothersbullyback', 'myfathershands', 'reachingoutbetweenthebars', 'theinterview', 'stagefright', 'legacy', 'canplanetearthfeedtenbillionpeoplepart3', 'listo', 'gangstersandcookies', 'birthofanation', 'mybackseatviewofagreatromance', 'lawsthatchokecreativity', 'threemonths', 
'whyimustspeakoutaboutclimatechange', 'leavingbaghdad', 'wheretheressmoke', 'onapproachtopluto', 'fromboyhoodtofatherhood'])
Yes, we removed a small number of stories for incidental reasons from the training set on a per-subject basis. Here are the lists for each subject. UTS02 and UTS03 have the same lists.
UTS01_train_list = ['itsabox', 'odetostepfather', 'inamoment', 'hangtime', 'ifthishaircouldtalk', 'goingthelibertyway', 'golfclubbing', 'thetriangleshirtwaistconnection', 'igrewupinthewestborobaptistchurch', 'tetris', 'becomingindian', 'canplanetearthfeedtenbillionpeoplepart1', 'thetiniestbouquet', 'swimmingwithastronauts', 'lifereimagined', 'forgettingfear', 'stumblinginthedark', 'backsideofthestorm', 'food', 'theclosetthatateeverything', 'notontheusualtour', 'exorcism', 'adventuresinsayingyes', 'thefreedomridersandme', 'cocoonoflove', 'waitingtogo', 'thepostmanalwayscalls', 'googlingstrangersandkentuckybluegrass', 'mayorofthefreaks', 'learninghumanityfromdogs', 'shoppinginchina', 'souls', 'cautioneating', 'comingofageondeathrow', 'breakingupintheageofgoogle', 'gpsformylostidentity', 'eyespy', 'treasureisland', 'thesurprisingthingilearnedsailingsoloaroundtheworld', 'theadvancedbeginner', 'goldiethegoldfish', 'life', 'thumbsup', 'seedpotatoesofleningrad', 'theshower', 'adollshouse', 'canplanetearthfeedtenbillionpeoplepart2', 'sloth', 'howtodraw', 'quietfire', 'metsmagic', 'penpal', 'thecurse', 'canadageeseandddp', 'thatthingonmyarm', 'buck', 'wildwomenanddancingqueens', 'againstthewind', 'indianapolis', 'alternateithicatom', 'bluehope', 'kiksuya', 'afatherscover', 'haveyoumethimyet', 'firetestforlove', 'catfishingstrangerstofindmyself', 'christmas1940', 'tildeath', 'lifeanddeathontheoregontrail', 'vixenandtheussr', 'undertheinfluence', 'beneaththemushroomcloud', 'jugglingandjesus', 'superheroesjustforeachother', 'sweetaspie', 'naked', 'singlewomanseekingmanwich', 'avatar', 'whenmothersbullyback', 'myfathershands', 'reachingoutbetweenthebars', 'theinterview', 'stagefright', 'legacy', 'canplanetearthfeedtenbillionpeoplepart3', 'listo', 'gangstersandcookies', 'birthofanation', 'mybackseatviewofagreatromance', 'lawsthatchokecreativity', 'threemonths', 'whyimustspeakoutaboutclimatechange', 'leavingbaghdad']
UTS_01_test_list = ['wheretheressmoke', 'onapproachtopluto', 'fromboyhoodtofatherhood']
UTS02_03_train_list = ['itsabox', 'odetostepfather', 'inamoment', 'afearstrippedbare', 'findingmyownrescuer', 'hangtime', 'ifthishaircouldtalk', 'goingthelibertyway', 'golfclubbing', 'thetriangleshirtwaistconnection', 'igrewupinthewestborobaptistchurch', 'tetris', 'becomingindian', 'canplanetearthfeedtenbillionpeoplepart1', 'thetiniestbouquet', 'swimmingwithastronauts', 'lifereimagined', 'forgettingfear', 'stumblinginthedark', 'backsideofthestorm', 'food', 'theclosetthatateeverything', 'escapingfromadirediagnosis', 'notontheusualtour', 'exorcism', 'adventuresinsayingyes', 'thefreedomridersandme', 'cocoonoflove', 'waitingtogo', 'thepostmanalwayscalls', 'googlingstrangersandkentuckybluegrass', 'mayorofthefreaks', 'learninghumanityfromdogs', 'shoppinginchina', 'souls', 'cautioneating', 'comingofageondeathrow', 'breakingupintheageofgoogle', 'gpsformylostidentity', 'marryamanwholoveshismother', 'eyespy', 'treasureisland', 'thesurprisingthingilearnedsailingsoloaroundtheworld', 'theadvancedbeginner', 'goldiethegoldfish', 'life', 'thumbsup', 'seedpotatoesofleningrad', 'theshower', 'adollshouse', 'canplanetearthfeedtenbillionpeoplepart2', 'sloth', 'howtodraw', 'quietfire', 'metsmagic', 'penpal', 'thecurse', 'canadageeseandddp', 'thatthingonmyarm', 'buck', 'thesecrettomarriage', 'wildwomenanddancingqueens', 'againstthewind', 'indianapolis', 'alternateithicatom', 'bluehope', 'kiksuya', 'afatherscover', 'haveyoumethimyet', 'firetestforlove', 'catfishingstrangerstofindmyself', 'christmas1940', 'tildeath', 'lifeanddeathontheoregontrail', 'vixenandtheussr', 'undertheinfluence', 'beneaththemushroomcloud', 'jugglingandjesus', 'superheroesjustforeachother', 'sweetaspie', 'naked', 'singlewomanseekingmanwich', 'avatar', 'whenmothersbullyback', 'myfathershands', 'reachingoutbetweenthebars', 'theinterview', 'stagefright', 'legacy', 'canplanetearthfeedtenbillionpeoplepart3', 'listo', 'gangstersandcookies', 'birthofanation', 'mybackseatviewofagreatromance', 'lawsthatchokecreativity', 
'threemonths', 'whyimustspeakoutaboutclimatechange', 'leavingbaghdad'][:int(sys.argv[3])]
UTS_02_03_test_list = ['wheretheressmoke', 'onapproachtopluto', 'fromboyhoodtofatherhood']
Thank you! Just one more question: in UTS02_03_train_list, there is a list slice [:int(sys.argv[3])] at the end. What was the value of sys.argv[3]?
Ah, sorry, I copied it from the "number of stories" analysis; just use the full list.
Thank you :)
It seems that the TR and TextGrid files are missing for three stories: ['canplanetearthfeedtenbillionpeoplepart3', 'canplanetearthfeedtenbillionpeoplepart2', 'canplanetearthfeedtenbillionpeoplepart1']
The TR and TextGrid files are needed for these stories, since they are used during training (as you stated previously).
(I got the TR and TextGrid files from: https://utexas.app.box.com/v/EncodingModelScalingLaws/folder/230420528915)
Please see the added wordseqs.jbl file, which contains, among other stories, all of the training and test stories preloaded as DataSequences.
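A .jbl file like this is typically read with joblib. The snippet below round-trips a stand-in dict just to show the load pattern; the path and contents are placeholders, and the real file holds DataSequence objects keyed by story name rather than plain lists:

```python
import os
import tempfile

import joblib

# Stand-in for wordseqs.jbl: a dict mapping story names to sequences.
stories = {"wheretheressmoke": ["the", "smoke", "rose"]}
path = os.path.join(tempfile.mkdtemp(), "wordseqs.jbl")
joblib.dump(stories, path)

# Loading works the same way for the real file:
wordseqs = joblib.load(path)
print(sorted(wordseqs.keys()))  # ['wheretheressmoke']
```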
Thank you :)
@RAntonello
Sorry to bother you again. I have a few more questions about reproduction.
Thank you in advance for your response!
I am trying to replicate the Figure 1 encoder performance plot from the paper but am having difficulty.
I followed the tutorial Jupyter notebook (using the 33rd layer of the OPT-30B model) and tried to reproduce the results for subject S3. I was able to reproduce the Figure 2 results (voxel-wise r values). However, I was not able to reproduce the "Encoding Performance (Avg r^2)" values of Figure 1: I got values around 0.02, not 0.03 as shown in Figure 1.
Below is what I got by using the voxel-wise r values (corrs_unnorm) to compute r^2 (|r|*r), averaged over each trial. The values differ from those in Figure 1.
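For concreteness, here is the signed-r^2 computation I used (assuming corrs_unnorm is a NumPy array of voxel-wise correlations; the values below are made up for illustration):

```python
import numpy as np

# Signed r^2: square each voxel-wise correlation while preserving its sign,
# then average over voxels to get the summary "Avg r^2" value.
corrs_unnorm = np.array([0.5, -0.2, 0.1])  # made-up voxel-wise r values
r2 = np.abs(corrs_unnorm) * corrs_unnorm
print(r2)         # [ 0.25 -0.04  0.01]
print(r2.mean())
```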
(Below is Figure 1, for reference)
Could you please explain how I can reproduce the results from the paper? My current assumptions are that
Thank you in advance!