What does this PR do?
Enable multiple validation datasets for reward model training. Metrics are computed individually on each dataset, and logged in separate wandb tabs.
Usage
You can add validation sets as new keys in `data_prefix`. A key must start with "validation" to be picked up as an additional, optional validation set.
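
The key-naming rule can be illustrated with a minimal sketch. It assumes `data_prefix` behaves like a plain dictionary; the paths, the extra key names (`validation_helpfulness`, `validation_safety`), and the helper `select_validation_keys` are hypothetical and only show the prefix convention, not the actual code in this PR.

```python
# Hypothetical data_prefix mapping: paths and extra key names are made up
# for illustration and are not taken from the PR.
data_prefix = {
    "train": ["/path/to/train.jsonl"],
    "validation": ["/path/to/val.jsonl"],
    "validation_helpfulness": ["/path/to/val_helpfulness.jsonl"],
    "validation_safety": ["/path/to/val_safety.jsonl"],
    "test": ["/path/to/test.jsonl"],
}


def select_validation_keys(prefix_map):
    """Return the keys that would count as validation sets.

    Assumption: a key qualifies if and only if it starts with "validation".
    """
    return [key for key in prefix_map if key.startswith("validation")]


validation_keys = select_validation_keys(data_prefix)
print(validation_keys)
# ['validation', 'validation_helpfulness', 'validation_safety']
```

Under this convention, each selected key would get its own metrics and its own wandb tab, while keys such as `train` and `test` are left alone.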