ReedOnePeck / MindDiffuser

MIT License
59 stars 2 forks

About the NSD data. #4

Closed zhangdanfeng888 closed 8 months ago

zhangdanfeng888 commented 8 months ago

Thank you very much for your work, I would like to know which part of NSD data is needed for this project, how to preprocess the NSD data, and how to organize the image stimuli in the dataset into a .npy file with specified dimensions. Is there any code for these questions? I will be eternally grateful.

ReedOnePeck commented 8 months ago

For data downloading and preprocessing, I can offer you two options. The first option is to follow the data downloading and preprocessing process provided by this project (https://github.com/ozcelikfu/brain-diffuser). The code is well-established and has been verified to run smoothly. Both our project and theirs use subjects 1, 2, 5, and 7, and the training and test sets are split in the same way.

If you have difficulties with data downloading and preprocessing, the second option is to download this competition dataset (https://colab.research.google.com/drive/1bLJGP3bAo_hAOwZPHpiSHKlt97X9xsUw?usp=share_link#scrollTo=Cuc-YDrzxWPf), which provides preprocessed NSD data (download form: https://docs.google.com/forms/d/e/1FAIpQLSehZkqZOUNk18uTjRTuLj7UYmRGz-OkdsU25AyO3Wm6iAb0VA/viewform?edit2=2_ABaOnufvGlANXIdmv_QVzx3cuXu8CIb8egL9KL9xIpZV0TN-fbLtlMPjS3mVs4dOvA). However, it only provides the complete training set (you can split off a new test set yourself), and note that the training/test split in this dataset is not consistent with the split in our paper.

ReedOnePeck commented 8 months ago

For the "specified" dimensions, we simply resized the images to (512, 512, 3) and rearranged the dimensions:

from PIL import Image
import numpy as np
import torch
from einops import rearrange

img_dir = target_dir + 'Test/'

# Inside a loop over the stimulus images:
img_name = source_dir + img_names[i]
img = Image.open(img_name)
out = img.resize((512, 512))
Train_Images.append(np.array(out) / 255.)

# After the loop, stack and rearrange to (b, c, h, w):
Trn = rearrange(torch.tensor(np.array(Train_Images), dtype=torch.float32), "b h w c -> b c h w")

zhangdanfeng888 commented 8 months ago


Thanks for the response. Is this the code for preprocessing the NSD data? (https://github.com/styvesg/nsd/blob/master/data_preparation.ipynb)

As described in the MindDiffuser GitHub README ("After preprocessing the NSD data, please organize the image stimuli in the training set into a .npy file with dimensions (8859, 3, 512, 512)"), is this the code (data_preparation.ipynb) you used to preprocess the NSD data?

ReedOnePeck commented 8 months ago

Yes, but that code is very messy. I suggest you refer to this project instead (https://github.com/ozcelikfu/brain-diffuser).

ReedOnePeck commented 8 months ago

As described in the MindDiffuser GitHub README ("After preprocessing the NSD data, please organize the image stimuli in the training set into a .npy file with dimensions (8859, 3, 512, 512)"), is this the code (data_preparation.ipynb) you used to preprocess the NSD data?

No. For the code that processes the stimulus images, please refer to my second reply. It's very simple: read each image in order, resize it, and finally rearrange the dimensions with the rearrange function.

zhangdanfeng888 commented 8 months ago

Okay, is this the code in the project (https://github.com/ozcelikfu/brain-diffuser) that preprocesses the NSD data? After running it, I need to organize the image stimuli in the training set into a .npy file with dimensions ..., right?

ReedOnePeck commented 8 months ago

Yes, that code standardizes the dimensions of the stimulus images to b h w c. You only need to run the preprocessing and then change them to b c h w on your own.
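The b h w c → b c h w conversion mentioned above is a single transpose. A minimal sketch in plain NumPy (the batch size and contents are dummy values for illustration; the original reply uses einops' rearrange, which is equivalent here):

```python
import numpy as np

# Dummy batch in the (b, h, w, c) layout produced by the preprocessing.
batch_bhwc = np.zeros((8, 512, 512, 3), dtype=np.float32)

# Plain-NumPy equivalent of rearrange(x, "b h w c -> b c h w").
batch_bchw = batch_bhwc.transpose(0, 3, 1, 2)
print(batch_bchw.shape)  # (8, 3, 512, 512)
```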

zhangdanfeng888 commented 8 months ago

Ok, I will try it, thanks so much~~~

ReedOnePeck commented 8 months ago

Good luck! If you have any other questions, please feel free to let me know.

zhangdanfeng888 commented 8 months ago

Hello, I have another question, about the COCO datasets. I downloaded the annotations_trainval2017 archive, which includes the "captions_train2017.json" and "captions_val2017.json" files. But I can't find "captions_test2017.json" on the official COCO dataset website. Do you mean "captions_val2017.json" here? Or where is "captions_test2017.json" on the official COCO website?

ReedOnePeck commented 8 months ago

I'm sorry for my carelessness; it's "captions_val2017.json".

zhangdanfeng888 commented 8 months ago

Ok, thanks a lot.

zhangdanfeng888 commented 7 months ago

@ReedOnePeck Sorry to bother you again. I can only find sd-v1-4.ckpt in stable-diffusion-v-1-4-original (https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/tree/main); I can't find v1-inference.yaml on that page, though I can find v1-inference.yaml in stable-diffusion-v1-5 (https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main). Can you give me the download links for sd-v1-4.ckpt and the config file v1-inference.yaml for Stable Diffusion v1-4? Thanks a lot.

zhangdanfeng888 commented 7 months ago

I am not sure if I am right about sd-v1-4.ckpt and v1-inference.yaml.

ReedOnePeck commented 7 months ago

I don't know why v1-inference.yaml isn't available there, but I have just uploaded the file to my project, and you can download it from v1-inference.yaml.