openpsi-project/ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Apache License 2.0
Add the GRPO algorithm.
#31
Closed
garrett4wade closed 2 months ago
garrett4wade commented 2 months ago
Corresponding changes:
- Remove the `num_samples` field in the generation config, since it's not used anywhere else.