import pyarrow as pa

predicted_labels = pa.schema(
    [
        # Required
        ("subject_id", pa.int64()),
        ("prediction_time", pa.timestamp("us")),
        # Optional (you must have all three for prediction)
        ("boolean_value", pa.bool_()),
        ("predicted_boolean_value", pa.bool_()),
        ("predicted_boolean_probability", pa.float64()),
        # Optional
        ("embeddings", pa.list_(pa.float64())),
        ("logits_sequence", pa.list_(pa.list_(pa.float64()))),
        ("logits", pa.list_(pa.float64())),
        ("loss", pa.float64()),
    ]
)
You should also validate schemas as described here.
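As a rough illustration of the column rules in the schema above, here is a minimal sketch that checks column names only. The helper `check_prediction_columns` is hypothetical and is not part of this repository or of meds-evaluation. It also assumes that "you must have all three" means the three boolean prediction columns appear together or not at all.

```python
from typing import Iterable, List

# Column names copied from the schema above.
REQUIRED_COLUMNS = ("subject_id", "prediction_time")

# Assumption: these three columns must be present together or not at all.
PREDICTION_COLUMNS = (
    "boolean_value",
    "predicted_boolean_value",
    "predicted_boolean_probability",
)


def check_prediction_columns(column_names: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means the column set looks valid.

    Hypothetical helper: it inspects column names only, not column types.
    """
    present = set(column_names)
    problems = []

    # Every required column must be present.
    missing_required = [c for c in REQUIRED_COLUMNS if c not in present]
    if missing_required:
        problems.append(f"missing required columns: {missing_required}")

    # The prediction columns are all-or-nothing (see the assumption above).
    found = [c for c in PREDICTION_COLUMNS if c in present]
    if found and len(found) != len(PREDICTION_COLUMNS):
        missing = [c for c in PREDICTION_COLUMNS if c not in present]
        problems.append(f"prediction columns must appear together; missing: {missing}")

    return problems
```

With pyarrow, the names could come from `table.column_names`, for example `check_prediction_columns(table.column_names)`. Type checks would still need the real validation procedure referenced above.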
TODOS
[x] Add supervised model support
[x] Add early_fusion support so we can get logits around a prediction time. This enables autoregressive workflows where a user may want to observe logits after a prediction time in the teacher-forcing setting (#30).
Add a general predict script for all models that stores embeddings, logits, loss, and predictions in a meds-evaluation-compliant schema.
The output uses the meds-eval schema, to which I will add optional logits, embeddings, and loss columns.