We will use the tensorized data format rather than a MEDS-like format for zero-shot evaluation. ESGPT's labeler implementation provides GPU-accelerated label generation. While a MEDS-like format would offer better interpretability for time-based tasks through Polars dataframes, it requires CPU processing and host-device transfers that would limit our sampling scalability.
To enable this, we should use the ESGPT Labeler abstraction; there is an example here of how to allow user-defined tasks.
The zero-shot script should require the following input args:
- A task following the meds-label schema with the ground-truth binary classification labels (already implemented in the pytorch_dataset class)
- A user-defined labeler
- A pretrained LM checkpoint (already implemented in src/meds_torch/finetune.py)
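The three required inputs could surface as a minimal CLI, sketched below with `argparse`; the flag names (`--task_labels`, `--labeler`, `--checkpoint`) are illustrative stand-ins, not the actual meds_torch entry point.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical argument surface for the zero-shot script; flag names
    # are assumptions, not the real meds_torch CLI.
    parser = argparse.ArgumentParser(description="Zero-shot evaluation")
    parser.add_argument("--task_labels", required=True,
                        help="Path to meds-label-schema data with ground-truth binary labels")
    parser.add_argument("--labeler", required=True,
                        help="Import path of the user-defined Labeler class")
    parser.add_argument("--checkpoint", required=True,
                        help="Path to the pretrained LM checkpoint")
    return parser
```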
The script should:
- [x] Include a TTE labeler class that inherits this and takes a list of codes plus a threshold on the time until one of those codes occurs. Additionally, support a binary cutoff for this. (This should suffice for the mortality and LOS prediction tasks, which are all I care about right now.)
- [x] Test that the prediction script generates the zero-shot predictions.
- [ ] For zero-shot and general generation, we should allow two budgets: (1) a time-based budget and (2) a max_seq_len budget (the latter is already supported).
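The TTE labeler task above could look roughly like this. This is a sketch only: the base class here is a stand-in for the ESGPT Labeler abstraction, and the `(days, code)` trajectory representation is an assumption about what the generated sequences expose.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

@dataclass(frozen=True)
class TTELabeler:
    """Time-to-event labeler: positive iff any target code occurs within
    `threshold_days` of the prediction time. Stand-in for inheriting the
    ESGPT Labeler abstraction; field and method names are illustrative."""
    target_codes: frozenset
    threshold_days: float  # binary cutoff on time-to-event

    def __call__(self, events: Sequence[Tuple[float, str]]) -> Tuple[bool, bool]:
        # `events`: generated trajectory of (days_since_prediction, code) pairs.
        for days, code in events:
            if code in self.target_codes:
                # (event observed, event within the binary time cutoff)
                return True, days <= self.threshold_days
        return False, False  # censored: no target code was generated
```

For mortality or LOS prediction, one would instantiate this with the relevant death/discharge codes and the task's time horizon.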
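The two generation budgets could compose in a single stopping condition, as in the sketch below; `generate_next_event` and its `time_delta_days` attribute are assumed model APIs, not the actual meds_torch interface.

```python
from typing import List, Optional

def generate_trajectory(model, prompt: List, max_seq_len: int,
                        max_time_days: Optional[float] = None) -> List:
    """Autoregressively generate until either budget is exhausted:
    (1) an optional time-based budget on cumulative generated time, or
    (2) the existing max_seq_len budget on sequence length."""
    tokens, elapsed_days = list(prompt), 0.0
    while len(tokens) < max_seq_len:  # sequence-length budget
        event = model.generate_next_event(tokens)  # assumed model API
        elapsed_days += event.time_delta_days
        if max_time_days is not None and elapsed_days > max_time_days:
            break  # time budget exhausted; drop the overshooting event
        tokens.append(event)
    return tokens
```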