l294265421 / alpaca-rlhf

Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat
MIT License

A question about setting tokens #12

Open hepj987 opened 1 year ago

hepj987 commented 1 year ago

Why set `tokenizer.pad_token_id = 0`? In the LLaMA vocabulary, the pad token `"<0x00>"` has id 3, while the unk token `"<unk>"` has id 0. Why not set it to 3 here? I think it should be `tokenizer.pad_token_id = 3`. I hope someone can answer this for me, thanks.

l294265421 commented 12 months ago

> Why set `tokenizer.pad_token_id = 0`? In the LLaMA vocabulary, the pad token `"<0x00>"` has id 3, while the unk token `"<unk>"` has id 0. Why not set it to 3 here? I think it should be `tokenizer.pad_token_id = 3`. I hope someone can answer this for me, thanks.

`tokenizer.pad_token_id = 0` comes from the alpaca-lora project and works well in practice. But `tokenizer.pad_token_id = 3` may be more reasonable.
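A minimal sketch of why either id works in practice (plain Python, no model download; the token ids in the batch are made up for illustration): what matters is that padded positions are marked 0 in the attention mask, so the model ignores them regardless of whether the pad id is 0 (`<unk>`) or 3 (`<0x00>`).

```python
def pad_batch(sequences, pad_token_id):
    """Right-pad variable-length id sequences to the same length
    and build the matching attention mask (1 = real token, 0 = pad)."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_token_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

batch = [[1, 306, 4966], [1, 3969]]              # hypothetical token ids
ids0, mask0 = pad_batch(batch, pad_token_id=0)   # pad with unk id, as in alpaca-lora
ids3, mask3 = pad_batch(batch, pad_token_id=3)   # pad with "<0x00>" id

# The attention masks are identical either way; only the (masked-out)
# filler values in input_ids differ.
print(ids0)   # [[1, 306, 4966], [1, 3969, 0]]
print(ids3)   # [[1, 306, 4966], [1, 3969, 3]]
print(mask0 == mask3)   # True
```

The loss computation in training code typically masks labels at padded positions as well (e.g. setting them to -100 for Hugging Face models), which is the other reason the specific pad id value is not critical.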