ratschlab / HIRID-ICU-Benchmark

Repository for the HiRID ICU Benchmark (HiB) project
MIT License

common_stage: arterial measurement labels #19

Closed michela-meister closed 1 year ago

michela-meister commented 1 year ago

Hello, we are trying to work with data on oxygen saturation in arterial blood (concept ID 20000800 in the HiRID variable reference spreadsheet). In the appendix of the original HiRID paper, the authors explain that the raw data includes venous measurements that were incorrectly labeled as arterial, and that they had to correct these measurements during preprocessing (which they do with the function change_arterial_to_venous). Do the common_stage matrices include the corrected labels for arterial measurements?

hugoych commented 1 year ago

Hi Michela, Thanks for using our benchmark!

The common stage matrices are not corrected for possible arterial/venous artifacts. The only cleaning applied is the removal of out-of-range values based on the varref.tsv table, as for all other variables. This is because the pipeline up to the common stage aims not to depend on any design/modeling choices, as discussed in the paper. Indeed, the change_arterial_to_venous function from the original repo relies on a given threshold parameter to define mislabeled arterial SO2 measurements. Because this affects fewer than 400 measurements out of more than 3M, we preferred not to include this step. For the same reason, we only removed out-of-range height and weight values and did not include the correct_weight_height function from the original repo. I hope this answers your question!
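For readers who want to apply such a correction themselves on top of the common stage, the difference between the two steps can be sketched roughly as follows. This is a minimal illustration, not the code of either repo: the function names, the `(variable_id, value)` record layout, the bounds, the venous placeholder ID, and the threshold value are all assumptions made for the example. Only the arterial concept ID 20000800 comes from the question above.

```python
ARTERIAL_SO2_ID = 20000800   # concept ID quoted in the question above
VENOUS_SO2_ID = -1           # placeholder: the real venous ID is not given in this thread
ASSUMED_THRESHOLD = 80.0     # hypothetical cut-off in percent, NOT the original repo's value


def remove_out_of_range(records, bounds):
    """Drop records whose value lies outside the (lower, upper) bounds of their variable.

    This step needs no modeling choice beyond the per-variable bounds.
    """
    kept = []
    for variable_id, value in records:
        lower, upper = bounds.get(variable_id, (float("-inf"), float("inf")))
        if lower <= value <= upper:
            kept.append((variable_id, value))
    return kept


def relabel_low_arterial(records, threshold=ASSUMED_THRESHOLD):
    """Relabel arterial SO2 values below the threshold as venous.

    The threshold is a design choice, so this step is optional and downstream.
    """
    return [
        (VENOUS_SO2_ID, value)
        if variable_id == ARTERIAL_SO2_ID and value < threshold
        else (variable_id, value)
        for variable_id, value in records
    ]


# Toy records: one plausible arterial value, one suspiciously low, one impossible.
records = [(ARTERIAL_SO2_ID, 97.0), (ARTERIAL_SO2_ID, 62.0), (ARTERIAL_SO2_ID, 140.0)]
bounds = {ARTERIAL_SO2_ID: (0.0, 100.0)}  # illustrative bounds, not copied from varref.tsv

common_stage_like = remove_out_of_range(records, bounds)   # range filtering only
corrected = relabel_low_arterial(common_stage_like)        # optional extra step
```

In this toy example the out-of-range value is dropped in the first step, while the low but in-range value keeps its arterial label until the threshold-based relabeling is applied. The appropriate threshold depends on your use case and should be checked against the original HiRID preprocessing code.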