Open p-ferreira opened 1 year ago
The tokenizer has to match the reward model and so this cannot be changed in isolation. It's basically the same as just changing the entire reward model.
We are using specific hyperparams in the tokenizer for things like padding and max sequence length. These could be investigated further.
```python
# mirror_neuron/sources/reward.py
encodings_dict = self.tokenizer(
    sub_samples,
    truncation=False,
    max_length=550,
    padding="max_length",
    return_tensors="pt",
)
```
We should make these configurable from the config file rather than requiring source code changes.
We want the reward model tokenizer to be driven by the config file, with the path to the tokenizer set in the `.yml` config alongside the other reward model settings.
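A minimal sketch of what this could look like: a small config object holding the tokenizer path and hyperparams, populated from the parsed `.yml` reward section, with a helper that builds the kwargs currently hardcoded in `reward.py`. All names here (`RewardTokenizerConfig`, the field names, the example tokenizer path) are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass

@dataclass
class RewardTokenizerConfig:
    """Hypothetical config section for the reward model tokenizer.

    Field names are illustrative; the real schema would live alongside
    the other reward model configs in the .yml file.
    """
    tokenizer_path: str = "EleutherAI/gpt-j-6b"  # placeholder default
    truncation: bool = False
    max_length: int = 550
    padding: str = "max_length"

    @classmethod
    def from_dict(cls, d: dict) -> "RewardTokenizerConfig":
        # d is the dict produced by parsing the .yml reward section
        return cls(**d)

def tokenizer_kwargs(cfg: RewardTokenizerConfig) -> dict:
    """Build the kwargs currently hardcoded in reward.py from the config."""
    return dict(
        truncation=cfg.truncation,
        max_length=cfg.max_length,
        padding=cfg.padding,
        return_tensors="pt",
    )

# Example: values as they might appear after parsing the .yml config.
cfg = RewardTokenizerConfig.from_dict({
    "tokenizer_path": "my-org/reward-tokenizer",  # hypothetical path
    "truncation": False,
    "max_length": 550,
    "padding": "max_length",
})
kwargs = tokenizer_kwargs(cfg)
```

The reward source would then load the tokenizer from `cfg.tokenizer_path` (e.g. via `AutoTokenizer.from_pretrained`) and unpack `tokenizer_kwargs(cfg)` into the call, so swapping reward models only requires editing the `.yml` file.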