razorx89 / roco-dataset

Radiology Objects in COntext (ROCO): A Multimodal Image Dataset

Question regarding ROCOv2 #17

Closed gufranSabri closed 3 months ago

gufranSabri commented 7 months ago

I had some questions regarding ROCOv2.

  1. What is the difference between train_concepts_manual.csv and train_concepts.csv?
  2. ROCO had keywords, but ROCOv2 doesn't. What is the best way to generate keywords for ROCOv2?

saviola777 commented 7 months ago

  1. train_concepts_manual.csv contains only the manually curated concepts: modality (all images except combined modalities) and IRMA class (X-ray only). train_concepts.csv additionally contains concepts extracted from the caption by an automated pipeline.
  2. The best way to generate keywords is to take the cui_mapping.csv and replace the CUIs in the *_concepts.csv with the mapped term.
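
A minimal sketch of that replacement step, using only the Python standard library. The column names (`CUI`, `Canonical name`, `ID`, `CUIs`) and the `;` separator between CUIs are assumptions about the CSV layout, not something confirmed in this thread; check the file headers and pass different values if they differ. The function names are hypothetical.

```python
import csv


def load_cui_mapping(path, cui_col="CUI", name_col="Canonical name"):
    """Read the CUI -> term table into a dict (column names are assumed)."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[cui_col]: row[name_col] for row in csv.DictReader(f)}


def cuis_to_keywords(concepts_path, mapping, id_col="ID", cuis_col="CUIs", sep=";"):
    """Return {image_id: [term, ...]} for one *_concepts.csv file.

    A CUI without an entry in the mapping is kept as-is rather than dropped.
    """
    keywords = {}
    with open(concepts_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Split the CUI field and skip empty fragments.
            cuis = [c for c in row[cuis_col].split(sep) if c]
            keywords[row[id_col]] = [mapping.get(c, c) for c in cuis]
    return keywords
```

Typical use would be `mapping = load_cui_mapping("cui_mapping.csv")` followed by `cuis_to_keywords("train_concepts.csv", mapping)`, with the paths adjusted to wherever the dataset was extracted.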

gufranSabri commented 7 months ago

Thank you for your reply! I plan on using ROCOv2 for my research.

How can I cite the dataset in my paper? Should I just cite the ROCO paper (https://link.springer.com/chapter/10.1007/978-3-030-01364-6_20)?

saviola777 commented 7 months ago

The ROCOv2 paper is currently under review and will be released within the next few months. For now, you can use the citation given on Zenodo in addition to the ROCO paper:

Johannes Rückert, Louise Bloch, Raphael Brüngel, Ahmad Idrissi-Yaghir, Henning Schäfer, Cynthia S. Schmidt, Sven Koitka, Obioma Pelka, Asma Ben Abacha, Alba Garcia Seco de Herrera, Henning Müller, Peter A. Horn, Felix Nensa, & Christoph M. Friedrich. (2023). ROCOv2: Radiology Objects in COntext Version 2, An Updated Multimodal Image Dataset (2.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8333645

Note that the license of the dataset was changed from CC BY to CC BY-NC.

saviola777 commented 3 months ago

New citation: https://arxiv.org/abs/2405.10004