Closed: mayala7 closed this issue 6 months ago
Hi, my guess from this error trace is that the TextGrid files have not been downloaded correctly. Could you check that they have been downloaded? One way to check is to run the command
git annex whereis examplefile
for example
git annex whereis avatar.TextGrid
from within the TextGrids folder. You should see output that includes a line for the computer you are running this on, denoted by [here].
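As a complement to the whereis check above, here is a minimal Python sketch that looks at the files directly. It assumes the usual git-annex behaviour: content that has not been fetched shows up either as a dangling symlink into .git/annex/objects or, for unlocked files, as a small pointer file whose text starts with /annex/objects/. The function names annex_status and report are hypothetical and are not part of the deep-fMRI-dataset repo.

```python
from pathlib import Path

# Unlocked annexed files whose content is absent hold a short pointer like
# "/annex/objects/<key>" instead of the real data.
ANNEX_POINTER_PREFIX = b"/annex/objects/"


def annex_status(path):
    """Classify a path as 'broken-symlink', 'missing', 'pointer', or 'present'."""
    p = Path(path)
    # A symlink whose target does not exist: annexed content was never fetched.
    if p.is_symlink() and not p.exists():
        return "broken-symlink"
    if not p.exists():
        return "missing"
    with open(p, "rb") as fh:
        head = fh.read(len(ANNEX_POINTER_PREFIX))
    if head == ANNEX_POINTER_PREFIX:
        return "pointer"
    return "present"


def report(folder="."):
    """Print the status of every .TextGrid file in the given folder."""
    for f in sorted(Path(folder).glob("*.TextGrid")):
        print(f"{f.name}: {annex_status(f)}")
```

Calling report() from inside the TextGrids folder should list every story as present once the download has really completed. Any other status suggests the content still needs to be fetched with git annex get.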
Thank you so much for your reply!
This is the output that I get:
"(cs1430) marcs@marcos-mbp-6 TextGrids % pwd
/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/data/ds003020/derivative/TextGrids
(cs1430) marcs@marcos-mbp-6 TextGrids % git annex whereis avatar.TextGrid
whereis avatar.TextGrid (3 copies)
  2d584fb4-4599-4c6d-b08e-51eaad43d01a -- [s3-PUBLIC]
  7fb1f0aa-4b48-4da2-98ca-6925f50fd075 -- root@d478b603bb6a:/content/drive/MyDrive/serre_lab/deep-fMRI-dataset/data/ds003020
  c8ac4a72-0900-4d77-83ff-3b4e30ba2967 -- root@openneuro-prod-dataset-worker-0:/datasets/ds003020
  s3-PUBLIC: https://s3.amazonaws.com/openneuro.org/ds003020/derivative/TextGrids/avatar.TextGrid?versionId=Pp0T2GYK3Q4IxtVAEMJ5e17hwIq8P7bA
ok"
I didn't get the "[here]". Furthermore, running "python encoding.py --subject UTS03 --feature eng1000" after this didn't work either.
It looks like you haven't successfully downloaded the data, which you need to do before you run the models. We have a built-in function for this, described in the README:
python load_dataset.py -download_preprocess
You can also do this yourself by running
git annex get derivative
from the main data directory folder. Just FYI, this will download a lot of data, so I would make sure you have at least 100 GB of space available.
I hope that helps!
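Before starting the download, free disk space can be checked with a few lines of Python. This is only a sketch: the 100 GB threshold is the rough figure quoted in this thread, not an exact dataset size, and has_free_space is a hypothetical helper name.

```python
import shutil

# Rough figure quoted in this thread, not a measured dataset size.
REQUIRED_GB = 100


def has_free_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


def free_gb(path="."):
    """Return the free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3
```

Running has_free_space() from the main data directory checks the disk that git annex get would write to.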
Thanks so much for your help! I'm just a bit confused because the download completed without problems, and I just checked the "avatar.TextGrid" file: it exists locally with the following content:
File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0.0124716553288
xmax = 754.248299719
tiers?
But when I run "python encoding/encoding.py --subject UTS03 --feature eng1000" from the correct directory, I still get the "ValueError: File format not recognized" error. I've attached the "avatar.TextGrid" file for reference using this link: https://drive.google.com/file/d/1ItvEsYv5LJxBKAQMVYgZ6VSmsdbWsh5x/view?usp=sharing
Thanks again!
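When a file exists locally but the parser still rejects it, it can help to look at the raw header and encoding of each TextGrid. The sketch below is not the check that textgrid.py performs (that code is not shown in this thread). It only assumes that a Praat TextGrid begins with File type = "ooTextFile, and that some tools save these files as UTF-16 with a byte-order mark. The names read_textgrid_text and check_header are hypothetical.

```python
import codecs

# Matches both the long header ("ooTextFile") and the short one ("ooTextFile short").
EXPECTED_PREFIX = 'File type = "ooTextFile'


def read_textgrid_text(path):
    """Return (text, encoding_label), using a byte-order mark if one is present."""
    with open(path, "rb") as fh:
        raw = fh.read()
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and picks the byte order from it.
        return raw.decode("utf-16"), "utf-16"
    return raw.decode("utf-8", errors="replace"), "utf-8"


def check_header(path):
    """Summarize the encoding and first line of a file that should be a TextGrid."""
    text, encoding = read_textgrid_text(path)
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    return {
        "encoding": encoding,
        "first_line": first_line,
        "looks_like_textgrid": first_line.startswith(EXPECTED_PREFIX),
    }
```

Running check_header over every file in the TextGrids folder would show whether a story other than avatar has an unexpected header or encoding, since encoding.py loads all stories and any one of them could trigger the error.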
Hm, where is the downloaded data in relation to the encoding folder? Can you walk through how you downloaded it? Could you also share your environment information?
I just walked through the steps in the README on a cleanly installed computer and had no issue running the encoding.py file.
Here is where the TextGrid files are:
"(cs1430) marcs@marcos-mbp-6 TextGrids % ls
adollshouse.TextGrid            gpsformylostidentity.TextGrid   singlewomanseekingmanwich.TextGrid
adventuresinsayingyes.TextGrid  hangtime.TextGrid               sloth.TextGrid
afatherscover.TextGrid          haveyoumethimyet.TextGrid       souls.TextGrid
againstthewind.TextGrid         howtodraw.TextGrid              stagefright.TextGrid
alternateithicatom.TextGrid     ifthishaircouldtalk.TextGrid    stumblinginthedark.TextGrid
avatar.TextGrid                 inamoment.TextGrid              superheroesjustforeachother.TextGrid
... cut short for brevity by me
(cs1430) marcs@marcos-mbp-6 TextGrids % pwd
/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/data/ds003020/derivative/TextGrids"
For downloading, I followed the README instructions. For environment information, I attached a txt file to this message: environment_info.txt
All good! For anyone wondering, please make sure you have enough compute when downloading the data. Thank you for your help!
(Note: I assume that my installation of git annex is correct and I have installed all dependencies from requirements.txt)
(cs1430) marcs@marcos-mbp-6 encoding % git annex init "serre_fmri_encoding"
init serre_fmri_encoding (recording state in git...)
Remote origin not usable by git-annex; setting annex-ignore
https://github.com/HuthLab/deep-fMRI-dataset.git/config download failed: Not Found
(Auto enabling special remote s3-PUBLIC...)
ok
(recording state in git...)
(cs1430) marcs@marcos-mbp-6 encoding % python encoding.py --subject UTS03 --feature eng1000
Saving encoding model & results to: /Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/results/eng1000/UTS03
Pattern match failed.
File format not recognized.
Traceback (most recent call last):
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/encoding.py", line 57, in <module>
    downsampled_feat = get_feature_space(feature, allstories)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/feature_spaces.py", line 178, in get_feature_space
    return _FEATURE_CONFIG[feature](allstories)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/feature_spaces.py", line 159, in get_eng1000_vectors
    wordseqs = get_story_wordseqs(allstories)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/feature_spaces.py", line 15, in get_story_wordseqs
    grids = load_textgrids(stories, DATA_DIR)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/ridge_utils/stimulus_utils.py", line 13, in load_textgrids
    grids[story] = TextGrid(open(grid_path).read())
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/ridge_utils/textgrid.py", line 153, in __init__
    self.text_type = self._check_type()
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/ridge_utils/textgrid.py", line 208, in _check_type
    raise ValueError("File format not recognized")
ValueError: File format not recognized