jimwu6 closed this 4 months ago
Where do you see this needed? I'm pretty sure finetuning just uses the eos from the tokenizer.
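For reference, here is a minimal sketch of what "uses the eos from the tokenizer" looks like in practice. The tokenizer name is illustrative (GPT-NeoX-20B is the tokenizer family the MPT models use); any Hugging Face tokenizer exposes its EOS the same way:

```python
from transformers import AutoTokenizer

# Illustrative tokenizer choice; any HF tokenizer exposes eos the same way.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# The finetuning path can read the EOS directly off the tokenizer,
# so no separate eos_token_id entry is strictly required for it.
print(tokenizer.eos_token, tokenizer.eos_token_id)
```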
It looks like it's one of the things passed into the superclass. I think there are some cases where omitting this causes an error, e.g.
```
[rank2]: ValueError: sequence_id is a required argument when MPT is configured with attn_uses_sequence_id=True and the model is in train mode.
```
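For context on why that error fires: with `attn_uses_sequence_id=True`, the model needs `sequence_id` to keep packed examples from attending to each other. A rough sketch of that masking idea in plain PyTorch (not llm-foundry's actual implementation):

```python
import torch

# sequence_id labels which packed example each token belongs to, e.g.
# three examples of lengths 3, 2, 3 packed into one row of 8 tokens.
sequence_id = torch.tensor([[0, 0, 0, 1, 1, 2, 2, 2]])

# Tokens may only attend to tokens from the same original example:
# True entries mark allowed (i attends to j) pairs.
same_sequence = sequence_id.unsqueeze(-1) == sequence_id.unsqueeze(-2)

# Combine with the usual causal mask.
causal = torch.tril(torch.ones(8, 8, dtype=torch.bool))
allowed = same_sequence & causal
print(allowed[0].int())
```

Without `sequence_id` there is no way to build `same_sequence`, hence the hard error in train mode.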
@milocress That should only apply to the pretraining style; the finetuning style handles packing and `sequence_id` on its own, e.g. https://github.com/mosaicml/llm-foundry/blob/fb9a2259e880b0baa3d3523ff42def9ea6c29ce3/llmfoundry/data/packing.py#L155
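To illustrate what "handles packing and sequence_id on its own" means, here is a hedged sketch of the bookkeeping a packing collator does. `pack_examples` is a hypothetical helper for illustration, not foundry's actual `BinPackCollator` API:

```python
import torch

def pack_examples(examples: list[torch.Tensor], max_seq_len: int):
    """Hypothetical helper: greedily concatenate tokenized examples into
    one row and emit the matching sequence_id, as a packing collator might.
    Assumes at least one example fits within max_seq_len."""
    tokens, seq_ids = [], []
    used = 0
    for idx, ex in enumerate(examples):
        if used + ex.numel() > max_seq_len:
            break
        tokens.append(ex)
        seq_ids.append(torch.full((ex.numel(),), idx))
        used += ex.numel()
    return torch.cat(tokens), torch.cat(seq_ids)

input_ids, sequence_id = pack_examples(
    [torch.tensor([5, 6, 7]), torch.tensor([8, 9]), torch.tensor([10, 11, 12])],
    max_seq_len=8,
)
print(sequence_id)  # tensor([0, 0, 0, 1, 1, 2, 2, 2])
```

Because the finetuning dataloader emits `sequence_id` alongside the packed `input_ids`, the pretraining-only code path that raised the ValueError above shouldn't be hit there.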
This is needed to allow the finetuning dataset to be constructed correctly.