Borrow from OpenAI/weak-to-strong

abdurraheemali commented 11 months ago

https://github.com/openai/weak-to-strong converts many of its tasks into binary prediction. We could take the subset of those where the prediction can influence the outcome, load small pretrained models, do full finetuning with the zero-sum training method, and measure what effect it has.

abdurraheemali commented 11 months ago

Some related materials are also in https://github.com/johannestreutlein/scoring-rules-performative

abdurraheemali commented 11 months ago

https://github.com/jcperdomo/performative-prediction/blob/main/experiments/neurips2020/stochastic-credit-simulator.ipynb

abdurraheemali commented 11 months ago

Their training script is invoked like this:

python train_weak_to_strong.py --batch_size 32 --max_ctx 512 --ds_name "sciq" --loss "logconf" --n_docs 1000 --n_test_docs 100 --weak_model_size "gpt2-medium" --strong_model_size "gpt2-large" --seed 42

Our changes:

The weak model will be the same as the strong model.
We will estimate which runs can be trained on 1xH100 on short notice and do only those (I have RunPod credits, but it might be easier to use Modal Labs).
We need to filter down to a subset of the datasets (not clear on the criteria yet).
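
A rough sketch of the first change, assuming we keep their CLI as-is and just pass the same size to both model flags. `build_command` is a hypothetical helper, not something in their repo; the flags and values are copied from the command above, and "gpt2-medium" is only a placeholder until we know what fits on the GPU.

```python
import shlex


def build_command(model_size: str, ds_name: str = "sciq", seed: int = 42) -> list[str]:
    """Build the train_weak_to_strong.py invocation with weak == strong.

    Hypothetical helper: the flags mirror the command quoted in this thread;
    the only change is that both model-size flags receive the same value.
    """
    return [
        "python", "train_weak_to_strong.py",
        "--batch_size", "32",
        "--max_ctx", "512",
        "--ds_name", ds_name,
        "--loss", "logconf",
        "--n_docs", "1000",
        "--n_test_docs", "100",
        "--weak_model_size", model_size,    # same model on both sides
        "--strong_model_size", model_size,
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Print the command for inspection; pass the list to subprocess.run to launch it.
    print(shlex.join(build_command("gpt2-medium")))
```

Printing first makes it easy to eyeball the command before spending GPU time; the dataset filter can later be a loop over `ds_name` values once we know which ones qualify.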

abdurraheemali / historical-action-predictor

Borrow from OpenAI/weak-to-strong #8