Issue with textgrid.py or git annex

mayala7 commented 9 months ago

(Note: I assume that my installation of git annex is correct and I have installed all dependencies from requirements.txt)

(cs1430) marcs@marcos-mbp-6 encoding % git annex init "serre_fmri_encoding"
init serre_fmri_encoding (recording state in git...)

  Remote origin not usable by git-annex; setting annex-ignore

  https://github.com/HuthLab/deep-fMRI-dataset.git/config download failed: Not Found
(Auto enabling special remote s3-PUBLIC...)
ok
(recording state in git...)

(cs1430) marcs@marcos-mbp-6 encoding % python encoding.py --subject UTS03 --feature eng1000
Saving encoding model & results to: /Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/results/eng1000/UTS03
Pattern match failed. File format not recognized.
Traceback (most recent call last):
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/encoding.py", line 57, in <module>
    downsampled_feat = get_feature_space(feature, allstories)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/feature_spaces.py", line 178, in get_feature_space
    return _FEATURE_CONFIG[feature](*args)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/feature_spaces.py", line 159, in get_eng1000_vectors
    wordseqs = get_story_wordseqs(allstories)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/feature_spaces.py", line 15, in get_story_wordseqs
    grids = load_textgrids(stories, DATA_DIR)
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/ridge_utils/stimulus_utils.py", line 13, in load_textgrids
    grids[story] = TextGrid(open(grid_path).read())
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/ridge_utils/textgrid.py", line 153, in __init__
    self.text_type = self._check_type()
  File "/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/ridge_utils/textgrid.py", line 208, in _check_type
    raise ValueError("File format not recognized")
ValueError: File format not recognized

alebel14 commented 9 months ago

Hi, my guess from this error trace is that the TextGrid files have not been downloaded correctly. Could you check that they have been downloaded? One way to check is to run the command git annex whereis examplefile (for example, git annex whereis avatar.TextGrid) from within the TextGrids folder. The output should include a line, marked [here], for the computer you are running this on.
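As a complement to git annex whereis, here is a minimal Python sketch for spotting files whose content has not been fetched. It relies on two general git-annex behaviours: a locked file is a symlink that dangles until its content is present, and an unlocked file without content holds a short pointer beginning with /annex/objects/. The function name annex_content_present is hypothetical and is not part of this repository.

```python
import os

# Unlocked annexed files without content contain a pointer that starts like this.
ANNEX_POINTER_PREFIX = b"/annex/objects/"


def annex_content_present(path):
    """Return True if `path` appears to hold real content (hypothetical helper)."""
    # Locked annexed files are symlinks into .git/annex/objects;
    # the link is dangling when the content has not been downloaded.
    if os.path.islink(path) and not os.path.exists(path):
        return False
    if not os.path.isfile(path):
        return False
    # Read only as many bytes as needed to recognise a pointer file.
    with open(path, "rb") as fh:
        head = fh.read(len(ANNEX_POINTER_PREFIX))
    return head != ANNEX_POINTER_PREFIX
```

Calling annex_content_present("avatar.TextGrid") from within the TextGrids folder would return False for a placeholder and True for a file with real content.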

mayala7 commented 9 months ago

Thank you so much for your reply!

This is the output that I get:

(cs1430) marcs@marcos-mbp-6 TextGrids % pwd
/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/data/ds003020/derivative/TextGrids
(cs1430) marcs@marcos-mbp-6 TextGrids % git annex whereis avatar.TextGrid
whereis avatar.TextGrid (3 copies)
  2d584fb4-4599-4c6d-b08e-51eaad43d01a -- [s3-PUBLIC]
  7fb1f0aa-4b48-4da2-98ca-6925f50fd075 -- root@d478b603bb6a:/content/drive/MyDrive/serre_lab/deep-fMRI-dataset/data/ds003020
  c8ac4a72-0900-4d77-83ff-3b4e30ba2967 -- root@openneuro-prod-dataset-worker-0:/datasets/ds003020

  s3-PUBLIC: https://s3.amazonaws.com/openneuro.org/ds003020/derivative/TextGrids/avatar.TextGrid?versionId=Pp0T2GYK3Q4IxtVAEMJ5e17hwIq8P7bA
ok

I didn't get the "[here]". Furthermore, running "python encoding.py --subject UTS03 --feature eng1000" after this didn't work either.

alebel14 commented 9 months ago

It looks like you haven't successfully downloaded the data, which you need to do before you run the models. We have a built-in function for this, described in the README: python load_dataset.py -download_preprocess. You can also do this yourself by running git annex get derivative from the main data directory. Just FYI, this will download a lot of data, so I would make sure you have at least 100 GB of space available.
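Before starting the download, the free-space advice can be checked with a short sketch like the one below. It uses the standard-library shutil.disk_usage; the 100 GB threshold is the figure quoted above, and the function name has_free_space is hypothetical.

```python
import shutil

REQUIRED_GB = 100  # figure quoted in this thread; adjust as needed


def has_free_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    free_gb = shutil.disk_usage(".").free / 1024**3
    print(f"Free space here: {free_gb:.1f} GB")
    print("Enough for the download:", has_free_space("."))
```

Run it from the directory where the data will be stored, since free space is reported per filesystem.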

I hope that helps!

mayala7 commented 8 months ago

Thanks so much for your help! I'm just a bit confused because I did the download without problems and just checked the "avatar.TextGrid" file, and it exists locally with the following content:

File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0.0124716553288
xmax = 754.248299719
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0.0124716553288
        xmax = 754.248299719
        intervals: size = 5171
        intervals [1]:
            xmin = 0.0124716553288
            xmax = 3.62426293878
            text = "ns"
        intervals [2]:
            xmin = 3.62426303855
            xmax = 4.91133786848
            text = "sp"
        intervals [3]:
            xmin = 4.91133786848
            xmax = 5.06099773243
            text = "T"
        intervals [4]:
            xmin = 5.06099773243
            xmax = 5.18072562358
            text = "AA1"
...

But when I run "python encoding/encoding.py --subject UTS03 --feature eng1000" from the correct directory, I still get the "ValueError: File format not recognized" error. I've attached the "avatar.TextGrid" file for reference using this link: https://drive.google.com/file/d/1ItvEsYv5LJxBKAQMVYgZ6VSmsdbWsh5x/view?usp=sharing
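One way to narrow this down is to inspect the raw bytes of the file rather than its appearance in an editor. The sketch below is a hypothetical diagnostic, not the check that textgrid.py performs: it only tests whether the first non-blank line is the Praat header shown above, and it decodes UTF-16 when a byte-order mark is present. Whether encoding played any role here is an assumption and is not confirmed in this thread.

```python
import codecs

EXPECTED_HEADER = 'File type = "ooTextFile"'


def diagnose_textgrid_bytes(raw):
    """Classify raw file bytes as 'empty', 'ok', or 'unexpected-header'."""
    if not raw.strip():
        return "empty"
    # Praat can save UTF-16 files; those start with a byte-order mark.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16", errors="replace")
    else:
        # "utf-8-sig" also drops a UTF-8 byte-order mark if one is present.
        text = raw.decode("utf-8-sig", errors="replace")
    lines = text.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    if first_line.startswith(EXPECTED_HEADER):
        return "ok"
    return "unexpected-header"


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py avatar.TextGrid
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, "->", diagnose_textgrid_bytes(fh.read()))
```

A result other than "ok" for any file in the TextGrids folder would point at that file rather than at the loading code.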

Thanks again!

alebel14 commented 8 months ago

Hm, where is the downloaded data in relation to the encoding folder? Can you walk through how you downloaded it? Could you also share your environment information?

I just walked through the steps in the README on a computer with a clean install and didn't have an issue running encoding.py.
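For the environment information requested above, a small sketch like the following collects the basics in one place. It uses only the standard library; the function name environment_report and the default package list are hypothetical, so pass in the packages from requirements.txt that matter for your setup.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("numpy",)):
    """Return a short text summary of Python, OS, and package versions."""
    lines = [
        f"python: {sys.version.split()[0]}",
        f"platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

The output of git annex version is worth including alongside this, since the download step depends on it.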

mayala7 commented 8 months ago

Here is where the TextGrid files are:

(cs1430) marcs@marcos-mbp-6 TextGrids % ls
adollshouse.TextGrid            gpsformylostidentity.TextGrid   singlewomanseekingmanwich.TextGrid
adventuresinsayingyes.TextGrid  hangtime.TextGrid               sloth.TextGrid
afatherscover.TextGrid          haveyoumethimyet.TextGrid       souls.TextGrid
againstthewind.TextGrid         howtodraw.TextGrid              stagefright.TextGrid
alternateithicatom.TextGrid     ifthishaircouldtalk.TextGrid    stumblinginthedark.TextGrid
avatar.TextGrid                 inamoment.TextGrid              superheroesjustforeachother.TextGrid
... (cut short for brevity by me)

(cs1430) marcs@marcos-mbp-6 TextGrids % pwd
/Users/marcs/Downloads/serre_lab_downloads/combined/serre_lab_complete/deep-fMRI-dataset/encoding/data/ds003020/derivative/TextGrids

For downloading, I followed the README instructions. For environment information, I attached a txt file to this message: environment_info.txt

mayala7 commented 6 months ago

All good! For anyone wondering, please make sure you have enough compute when downloading the data. Thank you for your help!

HuthLab / deep-fMRI-dataset

Issue with textgrid.py or git annex #7