jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0

proper way to pad prompts #159

Closed HassanJbara closed 6 months ago

HassanJbara commented 7 months ago

I'm trying to fine-tune the model, and for that I need to pad the dataset prompts. However, when I use left padding with the EOS token, the outputs devolve into gibberish. Am I doing the padding wrong? What is the solution here?

(screenshot of the gibberish outputs)

I'm using TinyLlama-1.1B-Chat-v1.0
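
For reference, this is roughly the padding setup I mean (a minimal sketch using Hugging Face transformers; my actual fine-tuning code is longer and the prompts here are just placeholders):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tokenizer.pad_token = tokenizer.eos_token   # Llama tokenizers ship without a dedicated pad token
tokenizer.padding_side = "left"             # left padding for decoder-only generation

batch = tokenizer(
    ["Write a haiku about padding.", "Hi"],
    padding=True,
    return_tensors="pt",
)
# batch contains input_ids and attention_mask; the mask marks the padded positions.
```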

jzhang38 commented 6 months ago

https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/examples/tiny-llama

You can have a look at this fine-tuning framework, which provides examples for TinyLlama.

HassanJbara commented 6 months ago

> https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/examples/tiny-llama
>
> You can have a look at this fine-tuning framework, which provides examples for TinyLlama.

I now realize the problem was that I wasn't passing the attention mask generated by the tokenizer. I have seen models with designated padding tokens that work just fine without attention masks, but I guess Llama models need it.
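
For anyone hitting the same issue, here is a minimal sketch of the fix (the model and tokenizer names follow this thread; the prompts and generation arguments are only illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

batch = tokenizer(["Hello, who are you?", "Hi"], padding=True, return_tensors="pt")

outputs = model.generate(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],  # omitting this is what produced the gibberish
    max_new_tokens=64,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```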