batmanlab / Mammo-CLIP

Official PyTorch implementation of the MICCAI 2024 paper (early accept, top 11%) Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography
https://shantanu-ai.github.io/projects/MICCAI-2024-Mammo-CLIP/
Creative Commons Attribution 4.0 International

VinDr dataset #4

Closed: emrekeles-arch closed this issue 3 months ago

emrekeles-arch commented 3 months ago

Do you have a processed version of the VinDr dataset? I am working on Colab and do not have enough disk space to download and extract it, so I cannot use this dataset. I also looked at the PNG version of the VinDr dataset on Kaggle, but the images do not match the information in the CSV file. If you have a processed version and it won't cause any problems, could you share it via a Drive link?

shantanu-ai commented 3 months ago

Did you try our preprocessing steps for VinDr? You just need to download the files from the official directory and preprocess them with the steps mentioned in the README.

emrekeles-arch commented 3 months ago

The zipped version of the VinDr dataset is 50 GB, but it grows to 340 GB when unzipped. The disk space provided by Colab is limited to 100 GB, so I cannot perform these operations there. I also don't have that much free space on my local computer. If it is not difficult for you to share, it would be very useful for me. If it is a difficult process, I will find an alternative way.

shantanu-ai commented 3 months ago

@emrekeles-arch, all the VinDr PNG images are uploaded here.

emrekeles-arch commented 3 months ago

@shantanu-ai, I can't thank you enough. I am grateful.
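
For readers who hit the same disk limit described in this thread (a 50 GB archive that expands to 340 GB), one possible workaround is to avoid unzipping everything at once and instead extract, process, and delete one file at a time. The following is a minimal sketch using only the Python standard library. It is not the repository's preprocessing code. The function name `process_one_at_a_time`, the `handle` callback, and the `.dicom` file suffix are assumptions made for illustration.

```python
import os
import tempfile
import zipfile
from typing import Callable


def process_one_at_a_time(
    zip_path: str,
    handle: Callable[[str], None],
    suffix: str = ".dicom",  # assumed image extension; adjust to the real archive
) -> int:
    """Extract each matching archive member alone, pass it to `handle`, then delete it.

    Peak extra disk usage is roughly the size of one extracted file,
    not the size of the fully unzipped dataset.
    Returns the number of members processed.
    """
    count = 0
    with zipfile.ZipFile(zip_path) as zf, tempfile.TemporaryDirectory() as tmp:
        for info in zf.infolist():
            # Skip directories and files that are not images.
            if info.is_dir() or not info.filename.lower().endswith(suffix):
                continue
            extracted = zf.extract(info, path=tmp)  # path of the file just written
            try:
                # Hypothetical user step, e.g. convert to PNG and save elsewhere.
                handle(extracted)
            finally:
                os.remove(extracted)  # free the space before the next member
            count += 1
    return count
```

Here `handle` would be whatever per-image step is needed, for example a conversion to PNG that writes its result to a location with enough room. The zip file itself still has to fit on disk, which the 50 GB figure quoted above would allow under a 100 GB limit.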