CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
MIT License
4.51k stars 471 forks

Inference pipeline #555

Open Dahoas opened 1 year ago

Dahoas commented 1 year ago

Implementation of multi-generation RL in trlX

A suggested (but optional) external inference pipeline wrapper can be found here

Dahoas commented 1 year ago

Depends on #529