zjc062 / mind-vis

Code base for MinD-Vis
https://mind-vis.github.io/
MIT License

Preprocessing scripts to generate the Kamitani .npz files from FigShare #4

Closed · roytu closed this 1 year ago

roytu commented 1 year ago

Hi, thanks for the very cool paper!

I'm trying to reproduce your results, and I noticed that your FigShare link stores the Kamitani dataset as /data/.../.npz files, which differs from the format that the original GOD repository uses (.h5 files in their FigShare). How did you perform the conversion from their .h5 files to your .npz files? Is the code available?

Thanks!

zjc062 commented 1 year ago

Hi, thank you for your interest in our work!

The .npz files we stored on FigShare were downloaded from SelfSuperReconst. The contents of the .h5 and .npz files are exactly the same, except that the authors of SelfSuperReconst rearranged the data and aligned it to the order of the stimulus images.
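
Conceptually, the rearrangement is just an index sort over the trials. A minimal sketch of such a conversion (the file path and key names here are hypothetical, not the actual ones used by SelfSuperReconst):

# Hypothetical .h5 -> .npz conversion sketch; the path and key names
# are illustrative only, not the actual keys used by SelfSuperReconst.
import h5py
import numpy as np

with h5py.File("sub-01.h5", "r") as f:
    responses = f["responses"][()]   # (n_trials, n_voxels), hypothetical key
    image_ids = f["image_ids"][()]   # stimulus label per trial, hypothetical key

order = np.argsort(image_ids)        # align trials to the image order
np.savez("sub-01.npz", responses[order], image_ids[order])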

Thanks!

roytu commented 1 year ago

Hi, thanks so much for the quick reply! (And sorry for my late one / re-opening this).

I'm still having some trouble understanding how to get from the original nii.gz run files to the .h5 and .npz files (probably due to my general lack of knowledge in this space). For example, the perceptionTraining01 session for sub-01 has runs with the following shapes:

sub-01_ses-perceptionTraining01_task-perception_run-01_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-02_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-03_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-04_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-05_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-06_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-07_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-08_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-09_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-10_bold_preproc.nii.gz: (64, 64, 50, 175)

which I interpret as 50 slices of a 64x64 BOLD image over 175 timesteps per run. The authors of the GOD paper mention that they use fmriprep to preprocess the data (link); after some analysis this yields the .h5 files, which are then rearranged into the .npz files. The .npz files have the following shapes:

V1: (4466,)
V2: (4466,)
V3: (4466,)
V4: (4466,)
FFA: (4466,)
PPA: (4466,)
LOC: (4466,)
LVC: (4466,)
HVC: (4466,)
VC: (4466,)
arr_0: (1200, 4466)
arr_1: (1750, 4466)
arr_2: (50, 4466)
arr_3: (1200,)
arr_4: (1750,)
arr_5: (4466,)
arr_6: (4466,)

All of the arrays except arr_0 through arr_6 are binary. So my initial follow-up question is: what do the binary values in V1 through VC represent? (And what do the arr_0 through arr_6 arrays contain?)
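
For reference, this is roughly how I'm dumping the file contents (the path is just wherever the FigShare download lives):

import numpy as np

# Print the name, shape, and dtype of every array stored in the .npz file.
npz = np.load("sub-01.npz")  # illustrative path to the downloaded file
for name in npz.files:
    print(f"{name}: {npz[name].shape} ({npz[name].dtype})")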

Appreciate your patience!

roytu commented 1 year ago

I think I might've figured it out after staring at this file.

The Kamitani test dataset comes from showing 50 images to each subject 35 times, for a total of 1750 fMRI readings. These are split across 10 runs, and each fMRI volume is a 64x64x50 voxel grid. This explains the shapes of these files:

sub-01_ses-perceptionTraining01_task-perception_run-01_bold_preproc.nii.gz: (64, 64, 50, 175)
sub-01_ses-perceptionTraining01_task-perception_run-02_bold_preproc.nii.gz: (64, 64, 50, 175)
...
sub-01_ses-perceptionTraining01_task-perception_run-10_bold_preproc.nii.gz: (64, 64, 50, 175)

In SelfSuperReconst, since only the data within specific ROIs is needed, the masks V1 through VC are first computed over the template brain. The union of these masks covers 4466 voxels in total.

These arrays:

V1: (4466,)
V2: (4466,)
V3: (4466,)
V4: (4466,)
FFA: (4466,)
PPA: (4466,)
LOC: (4466,)
LVC: (4466,)
HVC: (4466,)
VC: (4466,)

are True/False arrays indicating whether each of the 4466 voxels belongs to the corresponding ROI mask or not.

The other arrays correspond to the actual fMRI readings (see this line):

arr_0: (1200, 4466)  # Y
arr_1: (1750, 4466)  # Y_test
arr_2: (50, 4466)  # Y_test_avg
arr_3: (1200,)  # labels_train
arr_4: (1750,)  # test_labels
arr_5: (4466,)  # vox_noise_ceil
arr_6: (4466,)  # vox_snr
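
If that's right, then restricting the recordings to a single ROI should just be boolean indexing; a minimal sketch (with an illustrative path):

import numpy as np

npz = np.load("sub-01.npz")      # illustrative path
Y = npz["arr_0"]                 # (1200, 4466) training responses
V1 = npz["V1"].astype(bool)      # (4466,) ROI mask over the shared voxel set
Y_v1 = Y[:, V1]                  # keep only the V1 voxels
print(Y_v1.shape)                # (1200, number of V1 voxels)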

Let me know if I've got anything wrong here!

zjc062 commented 1 year ago

Hi Roy,

Sorry for the late reply! I forgot to respond after reopening the issue last time.

For the shape of the files: the first three dimensions (64, 64, 50) are the shape of an fMRI volume in Talairach coordinates, and 175 is the time dimension of each scan.
Each fMRI scan contains 55 blocks of visual stimuli: 50 blocks with different images and five randomly interspersed repetition blocks. Each block lasts 9 s, which corresponds to 3 time frames (repetition time = 3 s). The rest of the scan time consists of rest periods at the beginning and end of the run, plus the excluded first 3 volumes.
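
As a sanity check on the per-run bookkeeping, a minimal sketch using the numbers above:

TR = 3                          # repetition time in seconds
blocks_per_run = 55             # 50 unique images + 5 repetition blocks
volumes_per_block = 9 // TR     # 9 s per block = 3 volumes
stimulus_volumes = blocks_per_run * volumes_per_block
print(stimulus_volumes)         # 165 stimulus volumes
print(175 - stimulus_volumes)   # 10 remaining volumes (rest periods, etc.)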

In their experimental setting there are 3 training sessions, giving 24 fMRI scans in total. We skip the five repetition blocks in each scan, which results in 1200 training pairs. The testing sessions are handled the same way, except that the different repetitions of each image are averaged for a higher SNR.
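
In numbers, matching the array shapes you listed:

runs_train = 24                      # 3 training sessions, 24 runs in total
images_per_run = 50                  # the 5 repetition blocks are skipped
print(runs_train * images_per_run)   # 1200 -> arr_0: (1200, 4466)

test_images = 50
repetitions = 35
print(test_images * repetitions)     # 1750 -> arr_1: (1750, 4466)
# Averaging over the 35 repetitions gives arr_2: (50, 4466)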

And your understanding of the masks is correct: True/False indicates whether a voxel is part of the corresponding ROI mask or not.

Let me know if there's any issue! :)