openai / weak-to-strong

MIT License

Support LM head finetuning #29

Closed AlexTMallen closed 8 months ago

AlexTMallen commented 8 months ago

If the output of a dataset's formatting function contains a choices key holding a pair of string choices, finetuning will reuse the model's existing LM head rather than train a new classification head from scratch.

The model then returns the LM head's logits at the first token id of each of the two choices, and these two logits are used for binary classification.
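As a rough illustration of the selection step described above, here is a minimal pure-Python sketch. It is not the code from this PR: `first_token_ids`, `binary_logits`, `toy_encode`, and the toy vocabulary are all hypothetical names made up for the example, and `vocab_logits` simply stands in for the LM head's output over the vocabulary at whichever position the model reads out.

```python
from typing import Callable, List, Sequence


def first_token_ids(
    choices: Sequence[str], encode: Callable[[str], List[int]]
) -> List[int]:
    """Return the first token id of each of the two choices."""
    if len(choices) != 2:
        raise ValueError("expected exactly two choices")
    ids = [encode(choice)[0] for choice in choices]
    if ids[0] == ids[1]:
        # Identical first tokens would make the two classes indistinguishable.
        raise ValueError("choices share a first token id")
    return ids


def binary_logits(vocab_logits: Sequence[float], choice_ids: Sequence[int]) -> List[float]:
    """Pick the two vocabulary logits that act as the binary class logits."""
    return [vocab_logits[i] for i in choice_ids]


# Hypothetical toy tokenizer: one token per whitespace-separated word.
TOY_VOCAB = {"No": 0, "Yes": 1, "way": 2}


def toy_encode(text: str) -> List[int]:
    return [TOY_VOCAB[word] for word in text.split()]


# Assumed shape of a formatted example; only the "choices" key comes from the PR text.
example = {"txt": "Is the sky blue?", "choices": ("No", "Yes way")}

vocab_logits = [0.5, 2.0, -1.0]  # stand-in for LM head output over the toy vocabulary
choice_ids = first_token_ids(example["choices"], toy_encode)
class_logits = binary_logits(vocab_logits, choice_ids)
predicted = example["choices"][class_logits.index(max(class_logits))]
```

Only the first token of each choice matters here, so "Yes way" is scored by the logit for "Yes". The sketch rejects choices that share a first token, since the two classes could not be separated in that case; whether the actual implementation performs such a check is not stated in this PR.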

This should make finetuning as natural as possible for a pretrained language model, hopefully aiding weak-to-strong generalization.