Hi,
First off, thanks for providing training code for mamba use cases.
I was looking at how the training for mamba-chat is done, and something I'm unclear on is the "preprocess" function used in the "ChatDataset" class (in "/trainer/data.py"). Why does it return a dictionary containing only the input ids (also passed as the labels) rather than separate labels?
dict(input_ids=all_input_ids, labels=all_input_ids)
I'm a little confused: wouldn't we need the data of both the user and the assistant to train a chatbot? I noticed this same pattern a few other times in the training code, so I wanted to ask.
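To make the question concrete, here is a toy sketch of how I currently read it. This is only my guess, not the repo's actual code: the token ids and the user/assistant split are made up, the helper names are hypothetical, and I'm assuming the loss shifts the labels by one position, as causal language model training usually does. The first print shows the (input, target) pairs I think you get when labels are a plain copy of the input ids; the second shows what I expected instead, with the user tokens masked out of the loss using -100 (the default ignore_index of torch.nn.CrossEntropyLoss).

```python
# Toy illustration only: ids and role spans are invented, not taken from trainer/data.py.
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def next_token_pairs(input_ids, labels):
    """Pair each input position with the label one step ahead (the usual causal LM shift),
    skipping positions whose target is masked out."""
    return [
        (inp, tgt)
        for inp, tgt in zip(input_ids[:-1], labels[1:])
        if tgt != IGNORE_INDEX
    ]


def mask_user_tokens(input_ids, is_assistant):
    """Hypothetical alternative: keep labels only where the assistant is speaking."""
    return [tok if keep else IGNORE_INDEX for tok, keep in zip(input_ids, is_assistant)]


# Hypothetical conversation: three user tokens followed by three assistant tokens.
input_ids = [11, 12, 13, 21, 22, 23]
is_assistant = [False, False, False, True, True, True]

# Same pattern as dict(input_ids=all_input_ids, labels=all_input_ids).
labels_copy = list(input_ids)
# What I expected: loss only on the assistant's tokens.
labels_masked = mask_user_tokens(input_ids, is_assistant)

print(next_token_pairs(input_ids, labels_copy))
print(next_token_pairs(input_ids, labels_masked))
```

If that reading is right, the copied labels already cover both the user and the assistant tokens as next-token targets, and the only difference from what I expected is that the user turns also contribute to the loss. Is that intended?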
Thanks!