Open · psinger opened this issue 8 months ago
I believe this is necessary for getting the best results when fine-tuning Llama 3, although there seems to be some confusion (https://huggingface.co/meta-llama/Meta-Llama-3-8B/discussions/9).
Just as a follow-up: I implemented a hacky version of this to help with training Llama 3, and adding BOS tokens to prompts and answers when fine-tuning the Llama 3 8B base model did lower my loss by a small but significant margin (I tried many different seeds to make sure it was reproducible).
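For concreteness, here is a minimal sketch of what "adding BOS tokens to prompts and answers" can look like with a Hugging Face tokenizer; the helper name and the prompt/answer split are assumptions, not the exact patch used here:

```python
# Hedged sketch: prepend the BOS id (and append EOS) when building input_ids
# for a prompt/answer pair. Assumes a Hugging Face tokenizer whose
# bos_token_id is set, as is the case for Meta-Llama-3-8B.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

def encode_example(prompt: str, answer: str) -> list[int]:
    # Tokenize without special tokens so BOS/EOS placement is fully explicit.
    prompt_ids = tokenizer.encode(prompt, add_special_tokens=False)
    answer_ids = tokenizer.encode(answer, add_special_tokens=False)
    return [tokenizer.bos_token_id] + prompt_ids + answer_ids + [tokenizer.eos_token_id]
```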
You can always just hardcode the BOS token string into the prompt separator (a rough sketch of that workaround is below).
That said, I am personally not convinced it has a big impact for fine-tuning.
We should still add an option for it.
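For reference, a rough sketch of the hardcoding workaround mentioned above; the BOS string shown is the Llama 3 one, and the template itself is purely illustrative:

```python
# Hedged sketch: bake the literal BOS string into the prompt template so it is
# present even without native support. The BOS string is model-specific
# ("<|begin_of_text|>" for Llama 3, "<s>" for Llama 2), and this template is
# only an example.
BOS = "<|begin_of_text|>"

prompt_template = BOS + "### Instruction:\n{instruction}\n\n### Response:\n"
prompt = prompt_template.format(instruction="Summarize the article below.")
```

Note that this only works if the tokenizer maps the literal BOS string back to the special token id rather than splitting it into sub-tokens.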
Yes, good point... it still seems desirable to have native support in the UX, though.
🚀 Feature
Similar to the EOS token, we should offer an option to add a BOS token at the beginning of the sequence. This might be useful for models like Gemma.
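A possible shape for such an option, mirroring the EOS handling; the helper and the `add_bos`/`add_eos` parameter names are assumptions, not the library's existing API:

```python
# Hedged sketch of a tokenize helper with an opt-in BOS flag.
def tokenize(tokenizer, text: str, add_bos: bool = False, add_eos: bool = True) -> list[int]:
    ids = tokenizer.encode(text, add_special_tokens=False)
    if add_bos and tokenizer.bos_token_id is not None:
        ids = [tokenizer.bos_token_id] + ids
    if add_eos and tokenizer.eos_token_id is not None:
        ids = ids + [tokenizer.eos_token_id]
    return ids
```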