chaoyi-wu / RadFM

The official code for "Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data".

Image Format Inconsistency between the Codes and the Downloaded Disease Diagnosis Datasets #25

Closed jianghao-zhang closed 5 months ago

jianghao-zhang commented 5 months ago

Hi, thanks for sharing such meaningful work!

I encountered an issue while testing the downloaded disease diagnosis datasets (e.g., Vindr-SpineXR, Vindr-PCXR, and Vindr-Mammo). The images in these datasets are in ".dicom" format, whereas the .csv file you uploaded to Hugging Face references ".png" files. I have attached a screenshot to illustrate this issue:

[screenshot]

Given this, could you provide a short guide or instructions for converting the ".dicom" files to ".png" format? This would help not only me but also other users who run into the same problem.

manuel-tran commented 5 months ago

I have the same question: my dataset consists of ".dcm" images. What is the step-by-step process for converting them to ".png" format? This is crucial for getting the same input distribution.

chaoyi-wu commented 5 months ago

Thanks for the reminder, and sorry for omitting this file. We have added the Python conversion code at https://github.com/chaoyi-wu/RadFM/blob/main/src/Dataset/dataset/dicom_to_png_for_VinDR_sampled_using_mammo.py. The script targets Vindr-Mammo; for the other datasets the code is mostly the same. Hope this helps.
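
For orientation, the general shape of such a conversion can be sketched as follows. This is not the script linked above, only a minimal illustration under stated assumptions: pydicom, NumPy, and Pillow are installed, each file holds a single-frame grayscale image, and plain min-max scaling to 8 bits is acceptable. The linked script may normalize differently, so use it when the exact input distribution matters. The function names `to_uint8` and `dicom_to_png` are hypothetical.

```python
import numpy as np


def to_uint8(pixels, photometric="MONOCHROME2"):
    """Min-max scale a 2D pixel array to the 0..255 range (assumed normalization)."""
    arr = np.asarray(pixels, dtype=np.float64)
    lo, hi = float(arr.min()), float(arr.max())
    if hi > lo:
        arr = (arr - lo) / (hi - lo)
    else:
        # Constant image: avoid division by zero and map everything to 0.
        arr = np.zeros_like(arr)
    # MONOCHROME1 stores low values as white, so invert to the usual convention.
    if photometric == "MONOCHROME1":
        arr = 1.0 - arr
    return np.round(arr * 255.0).astype(np.uint8)


def dicom_to_png(src_path, dst_path):
    """Read one DICOM file and write it as an 8-bit grayscale PNG."""
    # Imported here so that to_uint8 can be used without these packages.
    import pydicom
    from PIL import Image

    ds = pydicom.dcmread(src_path)
    photometric = str(getattr(ds, "PhotometricInterpretation", "MONOCHROME2"))
    Image.fromarray(to_uint8(ds.pixel_array, photometric)).save(dst_path)
```

A call such as `dicom_to_png("example.dicom", "example.png")` (placeholder file names) converts a single image; looping over a dataset folder and writing each output under the file name listed in the .csv would make the paths line up.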

jianghao-zhang commented 5 months ago

Thanks for your efforts!