Add fix for missing extra state in reward model (RM)

NVIDIA / NeMo-Aligner

Scalable toolkit for efficient model alignment

Apache License 2.0


Closed: gshennvm closed this 2 weeks ago

gshennvm commented 2 weeks ago

Fixes an issue with missing extra state that occurs when a reward model is trained with 24.01 but loaded with 24.05.
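
For readers unfamiliar with this kind of version mismatch, here is a minimal sketch of one way to tolerate missing extra-state entries in an older checkpoint. It is not the code from this PR. The function name `fill_missing_extra_state`, the example key names, and the choice of `None` as a placeholder are all hypothetical. The sketch only assumes that the affected keys end in `_extra_state`, the suffix PyTorch uses for module extra state.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Suffix PyTorch appends to state-dict keys that hold module extra state.
EXTRA_STATE_SUFFIX = "_extra_state"


def fill_missing_extra_state(
    loaded_state: Dict[str, Any], expected_keys: Iterable[str]
) -> Tuple[Dict[str, Any], List[str]]:
    """Hypothetical helper: return a copy of `loaded_state` in which every
    expected extra-state key that is absent gets a `None` placeholder.

    Missing keys that are NOT extra state are left alone, so a strict load
    would still report them as real mismatches.
    """
    patched = dict(loaded_state)  # shallow copy; the input is not mutated
    filled: List[str] = []
    for key in expected_keys:
        if key not in patched and key.endswith(EXTRA_STATE_SUFFIX):
            patched[key] = None
            filled.append(key)
    return patched, filled


# Illustrative data only: an "old" checkpoint without the extra-state entry.
old_checkpoint = {"rm_head.weight": [0.1, 0.2]}
expected = ["rm_head.weight", "rm_head._extra_state", "rm_head.bias"]

patched, filled = fill_missing_extra_state(old_checkpoint, expected)
print(filled)  # keys that were filled with a placeholder
```

Whether a `None` placeholder is acceptable depends on how the receiving module handles its extra state, so treat this as an illustration of the idea rather than a drop-in fix.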