MedARC-AI / MindEyeV2

MIT License

No such file or no access: 'subj01_ncsnr.nii.gz' #28

Closed: Boltzmachine closed this issue 1 week ago

Boltzmachine commented 1 month ago

When running dataset_creation.ipynb, I got the following error:

FileNotFoundError                         Traceback (most recent call last)
Cell In[3], line 46
     43 betas = np.moveaxis(betas,-1,0)
     45 vox_include = copy.deepcopy(nsdgeneral_mask)
---> 46 ncsnr = nib.load(f"{subject}_ncsnr.nii.gz").get_fdata()
     47 ncsnr[ncsnr<.15] = np.nan 
     48 if tar==0: print("voxels left:", len(vox_include[vox_include>0]))

File ~/.conda/envs/mindeye/lib/python3.10/site-packages/nibabel/loadsave.py:92, in load(filename, **kwargs)
     90     stat_result = os.stat(filename)
     91 except OSError:
---> 92     raise FileNotFoundError(f"No such file or no access: '{filename}'")
     93 if stat_result.st_size <= 0:
     94     raise ImageFileError(f"Empty file: '{filename}'")

FileNotFoundError: No such file or no access: 'subj01_ncsnr.nii.gz'

But I could not find where to get this file (I only found ncsnr.nii.gz in the NSD dataset).

DavisMeee commented 1 month ago

Yes, the “ncsnr.nii.gz” files are the original files; we renamed them. They are provided in the original NSD dataset.

Boltzmachine commented 1 month ago

So it is the same for all subjects? I see there is only one such file in the original NSD dataset.

PaulScotti commented 1 month ago

There are multiple ncsnr.nii.gz files, one for every subject, in the original NSD dataset. They have the same filename but are in different folders, which is why we renamed them.

e.g.,

subj01: https://natural-scenes-dataset.s3.amazonaws.com/nsddata_betas/ppdata/subj01/func1pt8mm/betas_fithrf_GLMdenoise_RR/ncsnr.nii.gz

subj02: https://natural-scenes-dataset.s3.amazonaws.com/nsddata_betas/ppdata/subj02/func1pt8mm/betas_fithrf_GLMdenoise_RR/ncsnr.nii.gz

See the NSD Data Manual for more information: https://cvnlab.slite.page/p/channel/CPyFRAyDYpxdkPK6YbB5R1/notes/M3ZvPmfgU3
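A minimal sketch of the download-and-rename step described above. It assumes the URL pattern in the two example links holds for the other subjects, and that the notebook expects files named `{subject}_ncsnr.nii.gz` in the working directory (as the traceback suggests). The helper names `ncsnr_url`, `local_name`, and `fetch_ncsnr` are hypothetical and not part of the MindEyeV2 code:

```python
import urllib.request
from pathlib import Path

# Base path copied from the example URLs above; assumed to hold for every subject.
NSD_BASE = "https://natural-scenes-dataset.s3.amazonaws.com/nsddata_betas/ppdata"
BETAS_DIR = "func1pt8mm/betas_fithrf_GLMdenoise_RR"


def ncsnr_url(subject: str) -> str:
    """Build the URL of one subject's ncsnr.nii.gz (hypothetical helper)."""
    return f"{NSD_BASE}/{subject}/{BETAS_DIR}/ncsnr.nii.gz"


def local_name(subject: str) -> str:
    """File name the notebook tries to load, e.g. 'subj01_ncsnr.nii.gz'."""
    return f"{subject}_ncsnr.nii.gz"


def fetch_ncsnr(subject: str, out_dir: str = ".") -> Path:
    """Download the subject's ncsnr.nii.gz and save it under the renamed file name."""
    dest = Path(out_dir) / local_name(subject)
    if not dest.exists():  # skip the download if the renamed file is already there
        urllib.request.urlretrieve(ncsnr_url(subject), str(dest))
    return dest
```

Calling `fetch_ncsnr("subj01")` would then place `subj01_ncsnr.nii.gz` next to the notebook; copying or symlinking an already downloaded file to the same name should work equally well.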

Boltzmachine commented 1 month ago

Thank you. I just need to confirm one more thing. I saw this code in the script:

betas = nsda.read_betas(subject=subject,
                        session_index=sess,
                        trial_index=[],  # empty list as index means get all for this session
                        data_type='betas_fithrf',  # GLMSingle beta2
                        data_format='func1pt8mm')

Is ncsnr.nii.gz read from the betas_fithrf_GLMdenoise_RR folder or the betas_fithrf folder?

PaulScotti commented 1 month ago

From the betas_fithrf_GLMdenoise_RR folder. This code is not exactly the same as the code used for the paper.

Boltzmachine commented 1 month ago

I see. Do you have any suggestions if I want to reproduce the preprocessing procedure in the paper?

PaulScotti commented 1 month ago

It should be the same if you do proper z-scoring (compute the z-scoring statistics on the training set only and apply them to both the train and test sets) and use the func1pt8mm betas_fithrf_GLMdenoise_RR betas.
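A minimal sketch of the train-only z-scoring described above, assuming the betas are arranged as a (trials, voxels) array and that normalization is done per voxel. The thread does not say whether the statistics are computed per session or across sessions, so this is only an illustration; the function name `zscore_with_train_stats` and the `eps` term are assumptions of this sketch, not the paper's code:

```python
import numpy as np


def zscore_with_train_stats(betas, train_idx, eps=1e-8):
    """Z-score each voxel using mean/std estimated from training trials only.

    betas:     array-like of shape (n_trials, n_voxels)
    train_idx: integer indices or boolean mask selecting the training trials
    """
    betas = np.asarray(betas, dtype=np.float64)
    # Statistics come from the training trials only, so no test information leaks in.
    mean = betas[train_idx].mean(axis=0, keepdims=True)
    std = betas[train_idx].std(axis=0, keepdims=True)
    # eps avoids division by zero for constant voxels (an assumption of this sketch).
    return (betas - mean) / (std + eps)
```

The same `mean` and `std` are applied to every trial, so the test trials are scaled with the training statistics and will generally not have exactly zero mean or unit variance.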

Boltzmachine commented 1 month ago

I see. I changed the corresponding parameters but got an assertion error:

AssertionError                            Traceback (most recent call last)
Cell In[6], line 343
    322         behavior = {
    323             "cocoidx": coco73, #0
    324             "subject": sub+1,                          #1
   (...)
    339             "shared1000": shared1000[int(behav.iloc[jj]['73KID'])-1], #16
    340         }
    342         assert (int(behav.iloc[jj]['SESSION'])-1)*750 + jj >= 0
--> 343         assert (int(behav.iloc[jj]['SESSION'])-1)*750 + jj < 27750
    345         olds_behav_matrix[jjj] = np.array(list(behavior.values()))
    347 behav = globals()[f'behav_ses{sess}']

AssertionError:

PaulScotti commented 1 month ago

You can comment that assertion out or replace 27750 with 30000.

27750 is the maximum possible number of trials before an additional 3 sessions of data were released after the Algonauts challenge ended. We used the full data for MindEye2.
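As a rough check of these bounds: the index formula in the failing assertion, `(SESSION - 1) * 750 + jj`, implies 750 trials per session, so 27750 corresponds to 37 sessions and 30000 to 40. The session counts are inferred from that arithmetic rather than stated in the thread, and the sketch assumes `jj` is a 0-based index within a session. The helper names are hypothetical:

```python
TRIALS_PER_SESSION = 750  # taken from the index formula in the traceback


def max_trials(n_sessions: int) -> int:
    """Upper bound on the global trial index for a given number of sessions."""
    return n_sessions * TRIALS_PER_SESSION


def global_trial_index(session: int, trial_in_session: int) -> int:
    """Global index of a trial; session is 1-based, trial_in_session is 0-based (assumed)."""
    return (session - 1) * TRIALS_PER_SESSION + trial_in_session


# The last trial of session 40 exceeds the old bound but fits under the new one.
last = global_trial_index(40, TRIALS_PER_SESSION - 1)
print(max_trials(37), max_trials(40), last)
```

Under these assumptions, any trial from sessions 38 to 40 trips the `< 27750` assertion, which is consistent with the error appearing once the full data is used.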