Open kldzj opened 3 weeks ago
Try using `[/INST]` as the response part; that is what I am currently doing. I am not sure whether this works for multi-turn chats, though.
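from unsloth.chat_templates import train_on_responses_only

# Mask everything except the assistant replies when computing the loss
trainer = train_on_responses_only(
    base_trainer,
    instruction_part='[INST] ',
    response_part='[/INST] '
)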
tokenizer.decode(trainer.train_dataset[0]["input_ids"])
'[INST] Very long content… [/INST] Assistant answer…'
space = tokenizer(' ', add_special_tokens=False).input_ids[0]
tokenizer.decode([space if x == -100 else x for x in trainer.train_dataset[0]["labels"]])
' Assistant answer…'
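On the multi-turn question, here is a rough sketch of the masking idea in plain Python. It is not the library's implementation: `find_all`, `mask_non_responses`, and the toy token ids are made up for illustration, and it assumes each marker maps to a fixed run of token ids. What it shows is that every assistant span can stay unmasked if the response marker is matched once per turn.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def find_all(haystack, needle):
    """Return every start index at which `needle` occurs contiguously in `haystack`."""
    n = len(needle)
    return [i for i in range(len(haystack) - n + 1) if haystack[i:i + n] == needle]


def mask_non_responses(input_ids, instruction_ids, response_ids):
    """Build labels that keep only the tokens after each response marker,
    up to the next instruction marker (or the end of the sequence)."""
    labels = [IGNORE_INDEX] * len(input_ids)
    instruction_starts = find_all(input_ids, instruction_ids)
    for start in find_all(input_ids, response_ids):
        begin = start + len(response_ids)
        # The reply runs until the next instruction marker, if there is one.
        end = next((i for i in instruction_starts if i >= begin), len(input_ids))
        labels[begin:end] = input_ids[begin:end]
    return labels


# Toy ids (assumption): 1 stands for [INST], 2 for [/INST], the rest is content.
two_turns = [1, 10, 11, 2, 20, 21, 1, 12, 2, 22, 23, 24]
labels = mask_non_responses(two_turns, instruction_ids=[1], response_ids=[2])
print(labels)
# [-100, -100, -100, -100, 20, 21, -100, -100, -100, 22, 23, 24]
```

Both replies (20, 21 and 22, 23, 24) survive here, so the same spaced-decode check as above should show two answers on a two-turn sample if the real markers match in the same way.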
Thanks a lot for your input! :)
I'll give it another shot, but I think I tried this combination before. It's crucial that it works with a multi-turn dataset.
`train_on_responses_only` expects `instruction_part` and `response_part`, which does not seem to work with the Mistral chat template.

Whenever I try some kind of `[INST]` combination, the spaced decode is always just a newline for me. Perhaps I'm doing it wrong.

Is it possible to train on responses only with the Mistral chat template? If so, could you kindly provide an example? :)
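
One possible cause of an empty spaced decode, not confirmed in this thread, is that the marker string tokenizes differently on its own than it does inside the full prompt, so its ids never appear in `input_ids` and every label ends up masked. A small check along these lines might help narrow it down. The helper names and the toy ids are hypothetical; with a real tokenizer the ids would come from `tokenizer(text, add_special_tokens=False).input_ids`.

```python
def contains(haystack, needle):
    """True if `needle` occurs as a contiguous run inside `haystack`."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def unmasked_count(labels, ignore_index=-100):
    """Number of label positions that still contribute to the loss."""
    return sum(1 for x in labels if x != ignore_index)


# Toy ids (assumption): 1 = [INST], 2 = [/INST], 99 = a standalone space token.
prompt_ids = [1, 10, 11, 2, 30, 31]   # the trailing space merged into token 30
marker_with_space = [2, 99]           # '[/INST] ' tokenized in isolation
marker_stripped = [2]                 # '[/INST]' without the trailing space

print(contains(prompt_ids, marker_with_space))  # False
print(contains(prompt_ids, marker_stripped))    # True

# A fully masked sample has no tokens left to train on.
print(unmasked_count([-100] * len(prompt_ids)))  # 0
```

If the first check comes back `False` on real data, trying the marker without the trailing space (or with the exact ids taken from a tokenized sample) would be the next thing to test.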