CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
MIT License
4.51k stars 471 forks

TRLX Environment customization #593

Open heraldiclily opened 7 months ago

heraldiclily commented 7 months ago

I am currently working with the TRLX library for reinforcement learning and have a few questions about customizing the learning process: