Liang-Jiaying / RLAIF

MIT License
2 stars 0 forks source link

code demo try to replicate what the paper did #6

Closed Liang-Jiaying closed 1 year ago

Liang-Jiaying commented 1 year ago

Since to fine tune and having human feedbacks for reinforcement learning are not achievable on my computer (I believe also in others' laptop), I am not able to complete this code demo.