magpie-align / magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
https://magpie-align.github.io/
MIT License

Update model_configs.json #18

Closed andresuribe87 closed 1 month ago

andresuribe87 commented 1 month ago

This updates the model config to fix what I believe were typos.

@fly-dust I'm curious where the definition of the stop_tokens comes from. I tried searching online but couldn't figure it out :(

fly-dust commented 1 month ago

Hi, thank you! This is indeed a typo.

The stop tokens are passed to vLLM. The vLLM docs describe them in detail (see the stop parameter): https://docs.vllm.ai/en/v0.5.0/dev/sampling_params.html

We put so many stop tokens into the generation pipeline because the LLM sometimes does not generate <|eot_id|> after generating an instruction. So we list all the special tokens here to give generation a higher chance of stopping :)
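
To make the effect concrete, here is a minimal pure-Python sketch of what a stop list does to a generation. It is an illustration only, not vLLM's implementation: the helper truncate_at_stop, the STOP_TOKENS list, and the sample text are hypothetical, and the real lists live in model_configs.json. It assumes the default vLLM behaviour where the matched stop string is not included in the output.

```python
def truncate_at_stop(text: str, stop: list[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop string.

    Hypothetical helper that mimics the visible effect of a `stop` list;
    the matched stop string itself is dropped from the result.
    """
    cut = len(text)
    for s in stop:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Assumed, illustrative stop list using Llama-3-style special tokens.
STOP_TOKENS = [
    "<|eot_id|>",
    "<|end_of_text|>",
    "<|start_header_id|>",
    "<|end_header_id|>",
]

# A generation where the model skipped <|eot_id|> and ran on into the
# next turn's header instead of stopping after the instruction.
raw = (
    "Write a haiku about autumn."
    "<|start_header_id|>assistant<|end_header_id|>\n\nLeaves fall"
)

# With only <|eot_id|> as a stop string, nothing matches and the
# run-on text survives in full.
only_eot = truncate_at_stop(raw, ["<|eot_id|>"])

# With all special tokens listed, the text is cut at the first one seen.
instruction = truncate_at_stop(raw, STOP_TOKENS)
print(instruction)
```

With the full list, the run-on generation is cut back to just the instruction, which is why listing every special token is a cheap safeguard.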