baeseongsu / ehrxqa

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
MIT License

the categories for modality-base (Image, Table, Image+Table) and patient-based scope (none, single, group) #6

nooralahzadeh closed 2 months ago

nooralahzadeh commented 2 months ago

Hello,

Thanks for this great work. Are there any identifiers in the dataset that indicate the modality-based category (Image, Table, Image+Table) and the patient-based scope (none, single, group) of each question?

Thanks

baeseongsu commented 2 months ago

Hi @nooralahzadeh,

Currently, the dataset does not include direct identifiers for these categories. However, you can try using the code from these links:

  1. https://github.com/baeseongsu/ehrxqa/blob/724bff13a9a2e430ecd54f32e3ef3789bb7fcdb3/tests/test_utils.py#L45-L67
  2. https://github.com/baeseongsu/ehrxqa/blob/724bff13a9a2e430ecd54f32e3ef3789bb7fcdb3/tests/test_utils.py#L132-L148

You can filter the dataset using the scope argument. The possible values for the scope are:

The scope argument is matched as a prefix, so you can also pass a broader category such as "IMAGE-SINGLE" or "IMAGE-GROUP" (or even "IMAGE") to include all of its subcategories.
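
The prefix matching described above can be sketched roughly as follows. This is only an illustration, not the code behind the links: `filter_by_scope`, the `scope_label` field, and the sample labels are hypothetical names made up for the example, so adapt them to whatever the linked test utilities actually expose.

```python
from typing import Dict, Iterable, List


def filter_by_scope(
    items: Iterable[Dict], scope: str, key: str = "scope_label"
) -> List[Dict]:
    """Keep items whose label starts with the given scope prefix.

    `key` is a hypothetical field name; the real dataset may store
    this information differently (or not at all).
    """
    return [item for item in items if str(item.get(key, "")).startswith(scope)]


# Made-up sample records; the suffixes "-x", "-y", "-z" are placeholders,
# not real subcategory names from the dataset.
samples = [
    {"id": 1, "scope_label": "IMAGE-SINGLE-x"},
    {"id": 2, "scope_label": "IMAGE-SINGLE-y"},
    {"id": 3, "scope_label": "IMAGE-GROUP-z"},
    {"id": 4, "scope_label": "OTHER-x"},
]

# A narrow prefix selects one family of subcategories.
single_ids = [s["id"] for s in filter_by_scope(samples, "IMAGE-SINGLE")]

# A broader prefix includes every subcategory that starts with it.
image_ids = [s["id"] for s in filter_by_scope(samples, "IMAGE")]
```

Because the match is anchored at the start of the label, "IMAGE" covers both the "IMAGE-SINGLE" and "IMAGE-GROUP" families, while a label that merely contains "IMAGE" somewhere later would not match.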

Sorry for the inconvenience.

Best, Seongsu

nooralahzadeh commented 2 months ago

Thanks.