[!CAUTION] This is a work-in-progress evaluation package for the MEDS-DEV benchmarking effort.
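The meds-evaluation label schema has five mandatory fields: subject_id, prediction_time, boolean_value, predicted_boolean_value, and predicted_boolean_probability.
When predicting the boolean_value label, models are allowed to use all data about a subject up to and including the prediction_time.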
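To illustrate the inclusive cutoff at prediction_time, here is a minimal sketch. It assumes events are plain (subject_id, time, code) tuples; the function name `visible_events` and this event layout are hypothetical and are not part of meds-evaluation.

```python
import datetime


def visible_events(events, subject_id, prediction_time):
    """Return the events a model may use for one prediction.

    Keeps only events for the given subject whose time is at or before
    prediction_time (the cutoff is inclusive).
    """
    return [
        e for e in events
        if e[0] == subject_id and e[1] <= prediction_time
    ]


# Hypothetical toy data: (subject_id, time, code)
events = [
    (1, datetime.datetime(2020, 1, 1), "ADMISSION"),
    (1, datetime.datetime(2020, 1, 5), "LAB"),
    (1, datetime.datetime(2020, 1, 9), "DISCHARGE"),
    (2, datetime.datetime(2020, 1, 2), "ADMISSION"),
]

allowed = visible_events(events, 1, datetime.datetime(2020, 1, 5))
```

In this toy data, the event stamped exactly at the prediction time is kept, while the later event for the same subject and all events for other subjects are excluded.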
The following pyarrow schema is expected by the meds-evaluation pipeline:
import pyarrow as pa

predicted_labels = pa.schema(
    [
        ("subject_id", pa.int64()),
        ("prediction_time", pa.timestamp("us")),
        ("boolean_value", pa.bool_()),
        ("predicted_boolean_value", pa.bool_()),
        ("predicted_boolean_probability", pa.float64()),
    ]
)
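import datetime
from typing import TypedDict

# The TypedDict name now matches the variable it is assigned to.
# Note: total=False makes every key optional for type checkers.
PredictedLabel = TypedDict("PredictedLabel", {
    "subject_id": int,
    "prediction_time": datetime.datetime,
    "boolean_value": bool,
    "predicted_boolean_value": bool,
    "predicted_boolean_probability": float,
}, total=False)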
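Because a TypedDict is not enforced at runtime, it can help to check rows before handing them to the pipeline. The following is a rough sketch using only the standard library; `EXPECTED_TYPES` and `validate_row` are hypothetical helpers, not part of meds-evaluation, and the check that the probability lies in [0, 1] is an assumption rather than a documented requirement.

```python
import datetime

# Hypothetical mapping mirroring the five schema fields and their Python types.
EXPECTED_TYPES = {
    "subject_id": int,
    "prediction_time": datetime.datetime,
    "boolean_value": bool,
    "predicted_boolean_value": bool,
    "predicted_boolean_probability": float,
}


def validate_row(row):
    """Return a list of problems found in one prediction row (empty if none)."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in row:
            problems.append(f"missing field: {name}")
            continue
        value = row[name]
        # bool is a subclass of int, so reject booleans passed as subject_id.
        if expected is int and isinstance(value, bool):
            problems.append(f"wrong type for {name}: bool")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    # Assumption: a probability should lie in [0, 1]; NaN also fails this test.
    prob = row.get("predicted_boolean_probability")
    if isinstance(prob, float) and not (0.0 <= prob <= 1.0):
        problems.append("predicted_boolean_probability outside [0, 1]")
    return problems


good_row = {
    "subject_id": 1,
    "prediction_time": datetime.datetime(2020, 1, 5),
    "boolean_value": True,
    "predicted_boolean_value": True,
    "predicted_boolean_probability": 0.87,
}
bad_row = {
    "subject_id": 1,
    "prediction_time": datetime.datetime(2020, 1, 5),
    "boolean_value": True,
    "predicted_boolean_probability": 1.5,
}

good_problems = validate_row(good_row)
bad_problems = validate_row(bad_row)
```

Here `good_problems` is empty, while `bad_problems` reports the missing predicted_boolean_value field and the out-of-range probability.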