NVIDIA / NeMo-Aligner

Scalable toolkit for efficient model alignment
Apache License 2.0
419 stars 45 forks

Add fix for missing extra state in reward model (RM) #214

Closed by gshennvm 2 weeks ago

gshennvm commented 2 weeks ago

Fixes an issue with missing extra state that occurs when a reward model is trained with 24.01 but loaded with 24.05.
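
The diff for this PR is not shown here, so the following is only a minimal sketch of one way such a version mismatch could be handled, not the actual change. It assumes the problem shows up as `*_extra_state` keys (the suffix PyTorch uses for module extra state) that the newer model expects but the older checkpoint lacks. The helper name `fill_missing_extra_state`, the `placeholder` argument, and the example key names are hypothetical, and the state dicts are plain Python dicts so the example runs without any framework installed.

```python
from typing import Any, Dict, Iterable

# Suffix PyTorch appends to state-dict keys that hold module "extra state".
EXTRA_STATE_SUFFIX = "_extra_state"


def fill_missing_extra_state(
    checkpoint_state: Dict[str, Any],
    expected_keys: Iterable[str],
    placeholder: Any = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of ``checkpoint_state`` in which
    every expected ``*_extra_state`` key that is absent is set to
    ``placeholder``.

    Any other missing key is treated as a real error, so that only the
    known extra-state gap is tolerated.
    """
    patched = dict(checkpoint_state)  # shallow copy; the input is not mutated
    missing = [key for key in expected_keys if key not in patched]

    # Refuse to paper over missing weights or biases.
    unexpected = [key for key in missing if not key.endswith(EXTRA_STATE_SUFFIX)]
    if unexpected:
        raise KeyError(f"checkpoint is missing non-extra-state keys: {unexpected}")

    for key in missing:
        patched[key] = placeholder
    return patched


# Usage with made-up key names: the old checkpoint has the weight only.
old_checkpoint = {"rm_head.weight": [0.1, 0.2]}
model_keys = ["rm_head.weight", "rm_head._extra_state"]
patched_checkpoint = fill_missing_extra_state(old_checkpoint, model_keys)
```

Raising on any missing key that is not extra state keeps the workaround narrow: a checkpoint that lacks real parameters still fails loudly instead of loading with silently uninitialized weights.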