ttanida / rgrg

Code for the CVPR paper "Interactive and Explainable Region-guided Radiology Report Generation"
MIT License

Comparison in MIMIC-CXR dataset #16

Open Markin-Wang opened 11 months ago

Markin-Wang commented 11 months ago

Hi, thanks for your work.

I have a question about the comparison to previous works on the MIMIC-CXR dataset.

Previous report generation methods used the official MIMIC-CXR data split when reporting their results.

However, your work uses the Chest ImaGenome v1.0.0 data split, which differs from the official MIMIC-CXR split.

Therefore, the report generation results of RGRG do not seem directly comparable to those of previous works?

I would be grateful if you could provide more information on this, and I apologise if I have misunderstood the testing procedure.

ttanida commented 11 months ago

Hi,

Thank you for your question.

You're right in noting that we utilized the Chest ImaGenome v1.0.0 split instead of the MIMIC-CXR split. However, since both splits come from the same underlying dataset, they should inherently have a similar data distribution, ensuring the comparability of our results with previous studies.

Best, Tim
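As a rough way to examine how the two splits relate, one could cross-reference the Chest ImaGenome split files against the official MIMIC-CXR-JPG split file and see where the Chest ImaGenome test images fall in the official split. The sketch below is only illustrative; the file paths and column names (the `silver_dataset/splits/` directory, `dicom_id`, `split`) are assumptions based on the public releases and may need adjusting.

```python
# Hypothetical sketch: where do Chest ImaGenome v1.0.0 test images land in the
# official MIMIC-CXR-JPG split? Paths/columns are assumptions, verify locally.
import pandas as pd

# Official MIMIC-CXR-JPG split: columns dicom_id, study_id, subject_id, split
official = pd.read_csv("mimic-cxr-2.0.0-split.csv.gz")

# Chest ImaGenome silver-standard test split (assumed path and column names)
imagenome_test = pd.read_csv("chest-imagenome/silver_dataset/splits/test.csv")

merged = imagenome_test.merge(official, on="dicom_id", how="left")
print(merged["split"].value_counts(dropna=False))
```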

Markin-Wang commented 11 months ago

Hi Tim,

Thank you for your reply. However, I respectfully disagree with your claim, because the official MIMIC-CXR split does not appear to be random, and the test set's distribution seems to differ somewhat from that of the training and validation sets. For example, the paper releasing the dataset states that "The test set contains all studies for patients who had at least one report labelled in our manual review." In addition, as shown in Table 3 of that paper, only ~69% of patients in the training/validation sets have findings, while this figure is 98.3% in the test set. Moreover, the average report length on the official test set is 66.4, compared with 53 and 53.05 in the training and validation sets, as shown in the paper.

fuying-wang commented 10 months ago

Thanks for the awesome work!

I have also noticed that the splits used by previous works contain lateral-view images, which may also make the data distribution slightly different.
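If one wanted to put the official split on a comparable frontal-only footing, the MIMIC-CXR-JPG metadata file can be used to filter by view. A minimal sketch, assuming the v2.0.0 file names and the `ViewPosition` column; this is an illustration, not part of the paper's pipeline.

```python
# Hypothetical sketch: restrict the official split to frontal (PA/AP) views only.
# File and column names follow MIMIC-CXR-JPG v2.0.0 but should be verified.
import pandas as pd

meta = pd.read_csv("mimic-cxr-2.0.0-metadata.csv.gz")    # has a ViewPosition column
split = pd.read_csv("mimic-cxr-2.0.0-split.csv.gz")

frontal = meta[meta["ViewPosition"].isin(["PA", "AP"])]
frontal_split = split[split["dicom_id"].isin(frontal["dicom_id"])]
print(frontal_split["split"].value_counts())
```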