Closed dmahan93 closed 2 months ago
Still a bit of a WIP (docs and precomputation of reference logprobs still need to be added), but I figured it'd be good to get it up here now, since it's a fairly big change and may need some discussion.
This also builds on https://github.com/EleutherAI/gpt-neox/pull/1240 and https://github.com/EleutherAI/gpt-neox/pull/1239, since the packing implementations and chat templating are much more useful for DPO.