batmanlab / BatmanLabWiki

Documents and Wiki of the lab
Apache License 2.0

[Multi-Modal Image+Text] Provide 2D images from the UPMC dataset with reports containing the 14 disease terms in the CheXNet paper #62

Closed sumedhasingla closed 6 years ago

sumedhasingla commented 6 years ago

Number of subjects with both image+text: 4456/7870
Number of subjects with both image+text and a positive disease label (at least 1 of the 14 diseases): 3890/4456
Number of subjects with both image+text and no disease label: 566/4456
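For anyone verifying these numbers, here is a minimal sketch of how the positive/no-label split could be recomputed from a label table. It assumes one row per subject and one column per disease holding "1" for a positive mention; that layout, the `subject_id` column, and the `count_subjects` helper are hypothetical and may not match the actual CSV. The disease names are the 14 ChestX-ray14 labels used in the CheXNet paper.

```python
import csv
import io

# The 14 ChestX-ray14 disease labels used in the CheXNet paper.
DISEASES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def count_subjects(rows):
    """Return (total, with_positive_label, with_no_label) for row dicts.

    Assumption: each row is one subject, and a disease column equal to "1"
    marks a positive label. Missing columns or other values count as negative.
    """
    total = 0
    positive = 0
    for row in rows:
        total += 1
        if any((row.get(d) or "").strip() == "1" for d in DISEASES):
            positive += 1
    return total, positive, total - positive


# Toy data standing in for the real label file (hypothetical columns).
sample = io.StringIO(
    "subject_id,Atelectasis,Effusion,Hernia\n"
    "s1,1,0,0\n"
    "s2,0,0,0\n"
    "s3,0,1,1\n"
)
counts = count_subjects(csv.DictReader(sample))
print(counts)  # (3, 2, 1)

# The reported split is internally consistent: 3890 + 566 = 4456.
assert 3890 + 566 == 4456
```

To run it on the real file, the `io.StringIO` sample would be replaced with an opened handle to the label CSV, after adjusting the column names to whatever the file actually uses.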

sumedhasingla commented 6 years ago

The image files are saved at location: '/pghbio/dbmi/batmanlab/singla/Image_Text_Project/Data_Image_Text'

sumedhasingla commented 6 years ago

The CSV file containing the label information: '/pghbio/dbmi/batmanlab/singla/Image_Text_Project/RAD-ALL-Findings-Impressions_ChestXLabels.csv'

columns:

sumedhasingla commented 6 years ago

@pyadolla Can you please verify the data and let me know if there are any issues?

sumedhasingla commented 6 years ago

The paths to the raw images corresponding to the reports are saved in the CSV file: '/pghbio/dbmi/batmanlab/Data/radiologyTextDataset2/singla/RAD-ALL-List-ExamPath.csv'

sumedhasingla commented 6 years ago

The source directory for images: '/pghbio/dbmi/batmanlab/Data/UPMC_Lung_Images/ftp.box.com

sumedhasingla commented 6 years ago

GitHub link: https://github.com/sumedhasingla/MultiModalImageText.git

kayhan-batmanghelich commented 6 years ago

Thanks @sumedhasingla for the update.