EdoardoPona / predicting-inductive-biases-RL

fork of https://openreview.net/forum?id=mNtmhaDkAr - extending for inductive bias in RL

Default number of steps differs drastically between rl4lm and trl #27

Closed EdoardoPona closed 1 year ago

EdoardoPona commented 1 year ago

This might be the reason for the performance differences between the two.

trl also seems to take much longer on small models.

EdoardoPona commented 1 year ago

The number of steps refers to the total number of steps across the whole training run. In most cases this total isn't reached, or even used. This could be made clearer.
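A minimal sketch of why a large default total may never matter, assuming a run stops at whichever comes first: the end of the data or the configured total. The function and parameter names (`steps_actually_run`, `total_steps_cap`, and so on) are hypothetical and are not taken from rl4lm or trl, and the numbers are made up for illustration.

```python
import math


def steps_actually_run(num_samples, batch_size, num_epochs, total_steps_cap):
    """Number of batches processed when the run ends at the end of the data
    or at the configured total-step cap, whichever comes first.

    All names here are hypothetical; this is not either library's API.
    """
    # Batches needed to pass over the data num_epochs times.
    steps_from_data = math.ceil(num_samples / batch_size) * num_epochs
    # The cap only matters if the data would otherwise run longer than it.
    return min(steps_from_data, total_steps_cap)


# Illustrative numbers only: the data runs out long before the cap is reached.
print(steps_actually_run(num_samples=10_000, batch_size=64,
                         num_epochs=1, total_steps_cap=20_000))  # prints 157
```

Under this assumption, two libraries with very different default totals would run the same number of steps whenever the data is exhausted first, so the default alone would not explain a difference in behaviour.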