While reproducing the preprocessing, I noticed a few severe issues in the provided notebooks.
datamerging.ipynb
- Prescriptions are accidentally dropped completely; afterwards, the table is empty.
outputs.ipynb
- Wrong labels! `outputs_label_list` contains the entries `"Chest Tube"` and `"Jackson Pratt"`, but these never appear as labels; the correct labels are `"Chest Tube #1"` and `"Jackson Pratt #1"`.
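
A mismatch like this can be caught by comparing the list against the labels that actually occur in the table. The following is a minimal sketch, assuming a small stand-in dataframe with a `label` column; the frame contents, the extra `"Foley"` entry, and the helper name `missing_labels` are illustrative and not taken from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the outputs table after item labels are attached.
outputs = pd.DataFrame({
    "label": ["Chest Tube #1", "Jackson Pratt #1", "Foley", "Foley"],
})

# Label list containing the two entries reported above as never matching.
outputs_label_list = ["Chest Tube", "Jackson Pratt", "Foley"]


def missing_labels(wanted, frame, column="label"):
    """Return the entries of `wanted` that never occur in frame[column]."""
    present = set(frame[column].dropna().unique())
    return [name for name in wanted if name not in present]


# Entries listed here would silently select zero rows in an isin() filter.
unmatched = missing_labels(outputs_label_list, outputs)
print(unmatched)
```

Running such a check right after defining the list makes a silently empty selection visible before any filtering happens.
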
prescriptions.ipynb
- missing required filtering
  - rows with non-float `dose_val_rx` are not dropped
  - rows with `NaT` entries in `starttime` are not dropped
inputevents.ipynb
In some cases, the code for adding repeats does not add enough repeats due to a rounding issue. This can be tested via
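
The original test snippet is not included in this text. As an illustration only, and not the notebook's actual code, the sketch below shows how this class of bug can be detected, assuming the shortfall comes from truncating the ratio of duration to interval length. The 30-minute interval, the function names, and the sample durations are all hypothetical.

```python
import math

INTERVAL_S = 1800  # assumed split length (30 minutes); not taken from the notebook


def repeats_rounded_down(duration_s, interval_s=INTERVAL_S):
    """Repeat count when the ratio is truncated (the assumed faulty variant)."""
    return int(duration_s / interval_s)


def repeats_ceil(duration_s, interval_s=INTERVAL_S):
    """Repeat count that always covers the full duration."""
    return max(1, math.ceil(duration_s / interval_s))


def covers(duration_s, n_repeats, interval_s=INTERVAL_S):
    """True if n_repeats intervals span at least the whole duration."""
    return n_repeats * interval_s >= duration_s


durations = [1800, 2700, 3600, 5000]  # hypothetical infusion durations in seconds

too_few = [d for d in durations if not covers(d, repeats_rounded_down(d))]
still_short = [d for d in durations if not covers(d, repeats_ceil(d))]

print(too_few)      # durations whose truncated repeat count falls short
print(still_short)  # empty if the ceiling variant covers every duration
```

The same coverage condition (number of repeats times interval length is at least the duration) can be asserted on the real dataframe once the repeat column has been computed.
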
labevents.ipynb
- Rows with invalid `valuenum` entries are not dropped.
admissions.ipynb
We filter for patients with a single admission; however, the other dataframes are later filtered by `hadm_id` instead of `subject_id`. The issue is that at least one table appears to contain corrupted data, which gives rise to a `hadm_id` with multiple `subject_id` values associated with it. We can test it in datamerging via
Further, the hospital stay is limited to patients with a stay of 2-29 days. However, `charttime` does not agree with this data. Sometimes, `charttime` starts before `admittime`. The longest `charttime` is over 52 years.
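
Both observations can be checked with a few pandas operations. The sketch below is not the original test; it assumes a hypothetical event table with `subject_id`, `hadm_id`, and `charttime` columns and a hypothetical admissions table with `hadm_id` and `admittime`. All values are made up for illustration.

```python
import pandas as pd

# Hypothetical event table; hadm_id 200 is deliberately linked to two subjects.
events = pd.DataFrame({
    "subject_id": [1, 1, 2, 3, 4],
    "hadm_id": [100, 100, 200, 200, 300],
    "charttime": pd.to_datetime([
        "2100-01-02 08:00", "2100-01-03 09:00",
        "2100-02-01 10:00", "2100-02-02 11:00",
        "2100-02-28 12:00",
    ]),
})

# Hypothetical admissions table; hadm_id 300 is admitted after its first chart entry.
admissions = pd.DataFrame({
    "hadm_id": [100, 200, 300],
    "admittime": pd.to_datetime(["2100-01-01", "2100-02-01", "2100-03-01"]),
})

# Distinct subjects per admission id; any value above 1 is inconsistent.
subjects_per_hadm = events.groupby("hadm_id")["subject_id"].nunique()
ambiguous_hadm = subjects_per_hadm[subjects_per_hadm > 1].index.tolist()

# Chart entries recorded before the admission time of the same hadm_id.
merged = events.merge(admissions, on="hadm_id", how="left")
early_hadm = merged.loc[merged["charttime"] < merged["admittime"], "hadm_id"].tolist()

# Time between first and last chart entry per admission.
grouped = events.groupby("hadm_id")["charttime"]
chart_span = grouped.max() - grouped.min()

print(ambiguous_hadm)
print(early_hadm)
print(chart_span.max())
```

Filtering on `subject_id` rather than `hadm_id`, or dropping every `hadm_id` that appears in `ambiguous_hadm`, would be two possible ways to keep the single-admission restriction consistent across tables.
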