pytorch/rl
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License
[Feature Request] multi-turn reward for RLHF
#2271
Open
vmoens opened 4 days ago
vmoens commented 4 days ago
Implement rewards as proposed in https://arxiv.org/pdf/2405.14655