Currently, the tuned lens trainer does not error or warn when it runs out of training data before reaching the requested number of steps. This can be seen in our training run on Anthropic/hh-rlhf.
Ideally, it should raise an error before training begins, with a message like "insufficient data for steps requested". At the very least, it should emit a warning at the end of training.
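
A minimal sketch of what such a pre-training check could look like. This is not the trainer's actual code: the function name `check_enough_data` and its parameters (`num_samples`, `samples_per_step`, `num_steps`, `strict`) are hypothetical, and it assumes the dataset size is known up front and that each step consumes a fixed number of samples.

```python
import warnings


def check_enough_data(
    num_samples: int,
    samples_per_step: int,
    num_steps: int,
    strict: bool = True,
) -> int:
    """Return how many full steps the data supports (hypothetical helper).

    Raises ValueError (strict=True) or emits a RuntimeWarning (strict=False)
    when the data cannot cover the requested number of steps.
    """
    if samples_per_step <= 0:
        raise ValueError("samples_per_step must be positive")

    # Only full steps count; a trailing partial batch is ignored here.
    available_steps = num_samples // samples_per_step

    if available_steps < num_steps:
        msg = (
            "insufficient data for steps requested: "
            f"{num_steps} steps requested, but {num_samples} samples at "
            f"{samples_per_step} per step only cover {available_steps}"
        )
        if strict:
            raise ValueError(msg)
        warnings.warn(msg, RuntimeWarning)

    return available_steps
```

Calling this before the training loop would give the hard error; passing `strict=False` would give the softer warning-only behavior.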