Add a predict script that generates embeddings

Add a general predict script for all models that stores embeddings, logits, loss, and predictions in a meds-evaluation-compliant schema.

This is the meds-eval schema:

predicted_labels = pa.schema(
    [
        ("subject_id", pa.int64()),
        ("prediction_time", pa.timestamp("us")),
        ("boolean_value", pa.bool_()),
        ("predicted_boolean_value", pa.bool_()),
        ("predicted_boolean_probability", pa.float64()),
    ]
)

I will add optional logits, embeddings, and loss:

predicted_labels = pa.schema(
    [
        # Required
        ("subject_id", pa.int64()),
        ("prediction_time", pa.timestamp("us")),
        # Optional (all three must be present together to store predictions)
        ("boolean_value", pa.bool_()),
        ("predicted_boolean_value", pa.bool_()),
        ("predicted_boolean_probability", pa.float64()),
        # Optional
        ("embeddings", pa.list_(pa.float64())),
        ("logits_sequence", pa.list_(pa.list_(pa.float64()))),
        ("logits", pa.list_(pa.float64())),
        ("loss", pa.float64()),
    ]
)

Also you should validate schemas as described here.

TODOS

[x] Add supervised model support
[x] Add early_fusion support so we can get logits around a prediction time. This enables autoregressive workflows where a user may want to observe logits after a prediction time in the teacher-forcing setting (#30).
[x] Validate Schemas

Oufattole / meds-torch

Add a predict script that generates embeddings #109
