Something about the data

yangyan22 / Medical-Report-Generation-TriNet

Joint Embedding of Deep Visual and Semantic Features for Medical Image Report Generation

Something about the data #1

Open NorthTree22 opened 1 year ago

NorthTree22 commented 1 year ago

Hello, has the data used in this code already been preprocessed?

yangyan22 commented 1 year ago

Yes, we selected the cases with both frontal and lateral images, and we excluded cases without impression or findings sections. As a result, we obtained 2,939 studies (2,339 for training, 300 for validation and 300 for testing) in IU X-Ray. Similarly, we obtained 78,801 studies (62,801 for training, 8,000 for validation and 8,000 for testing) in MIMIC-CXR.
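
For readers who want to reproduce split sizes like these, here is a minimal sketch. It assumes a seeded random shuffle of study IDs; the reply does not say how the studies were actually assigned, so the shuffle, the seed and the function name `split_studies` are all hypothetical.

```python
import random


def split_studies(study_ids, n_val, n_test, seed=0):
    """Split study IDs into (train, val, test) lists.

    Assumption: a seeded random shuffle. The thread only states the
    resulting sizes, not the method used to assign studies to splits.
    """
    ids = sorted(study_ids)             # sort first so the result is reproducible
    random.Random(seed).shuffle(ids)    # local RNG, global random state untouched
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test


# Sizes quoted above for IU X-Ray: 2,939 studies with 300 val and 300 test.
train, val, test = split_studies(range(2939), n_val=300, n_test=300)
print(len(train), len(val), len(test))
```

With 2,939 input IDs this leaves 2,339 for training, matching the counts quoted for IU X-Ray; the same call with 78,801 IDs and 8,000 for each held-out split gives 62,801 training studies.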

NorthTree22 commented 1 year ago

Ok, thanks a lot!

NorthTree22 commented 1 year ago

Hi, what is 'img2othersFull.pkl'? It appears in TF-IDF/mesh_tag.py: f = open('/media/camlab1/doc_drive/IU_data/images_R2_Ori/img2othersFull.pkl', 'rb')

yangyan22 commented 1 year ago

Already uploaded

NorthTree22 commented 1 year ago

Thanks.

NorthTree22 commented 11 months ago

Hello, I have another question about the experimental data section: "selected the cases with both frontal and lateral images." Could you tell me how to select the frontal and lateral images in the MIMIC-CXR dataset? I have also noticed that MIMIC-CXR contains a large number of images from other views. Finally, why do you only use studies that have both frontal and lateral views? Looking forward to your reply.

yangyan22 commented 11 months ago

Hi, the selection is done according to the metadata provided with the original dataset. There is a file "mimic-cxr-2.0.0-metadata.csv", and you can select images according to the "ViewPosition" column. For the second question, this paper was written in 2019 but officially published in 2023. Before 2019, many works employed two views, so we also used two views. Thank you for your interest in our paper!
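
A minimal sketch of that selection follows; it is not the script used for the paper. It assumes that "PA" and "AP" count as frontal and that "LATERAL" and "LL" count as lateral, and it keeps only the first image of each kind per study. The function names are hypothetical; `dicom_id`, `study_id` and `ViewPosition` are columns of the metadata file.

```python
import csv

# Assumption: which ViewPosition values are treated as frontal or lateral.
FRONTAL_VIEWS = {"PA", "AP"}
LATERAL_VIEWS = {"LATERAL", "LL"}


def select_two_view_studies(rows):
    """Map study_id -> (frontal dicom_id, lateral dicom_id).

    Only studies with at least one frontal and one lateral image are kept.
    If a study has several images of one kind, the first one seen is used.
    """
    frontal, lateral = {}, {}
    for row in rows:
        view = (row.get("ViewPosition") or "").strip().upper()
        study = row["study_id"]
        if view in FRONTAL_VIEWS:
            frontal.setdefault(study, row["dicom_id"])
        elif view in LATERAL_VIEWS:
            lateral.setdefault(study, row["dicom_id"])
        # Other or missing view positions are ignored.
    return {s: (frontal[s], lateral[s]) for s in frontal if s in lateral}


def load_metadata(path="mimic-cxr-2.0.0-metadata.csv"):
    """Read the metadata CSV into a list of dicts, one per image."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

Usage would be `pairs = select_two_view_studies(load_metadata())`, after which `len(pairs)` gives the number of two-view studies before any report-based filtering.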

NorthTree22 commented 11 months ago

Thank you for your reply. I fully understand.