HumanSignal / RLHF

A collection of links, tutorials, and best practices for collecting data and building an end-to-end RLHF system to fine-tune generative AI models.

Example/rlhf nb #1

Closed: JimmyWhitaker closed this 1 year ago

JimmyWhitaker commented 1 year ago

Created an RLHF notebook that uses trlx with a dataset from Label Studio.