Closed · victorShawFan closed this 3 months ago
Thank you for your feedback, we will reproduce and fix it as soon as possible.
Please set `use_fast=False` in `openrlhf/utils/utils.py`:
`tokenizer = AutoTokenizer.from_pretrained(pretrain, trust_remote_code=True, use_fast=False, **sp_tokens)`
Now you can use the `--disable_fast_tokenizer` option in `train_dpo.py`.
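For reference, here is a minimal sketch of how such a flag could map to the slow tokenizer. The flag name comes from this thread; the argparse wiring around it is an assumption for illustration, not OpenRLHF's exact code:

```python
import argparse

# Hypothetical wiring: map --disable_fast_tokenizer to the use_fast
# argument, mirroring the fix described above (illustrative, not
# OpenRLHF's exact code).
parser = argparse.ArgumentParser()
parser.add_argument("--disable_fast_tokenizer", action="store_true")
args = parser.parse_args(["--disable_fast_tokenizer"])

use_fast = not args.disable_fast_tokenizer
print(use_fast)  # False -> the slow (SentencePiece-based) tokenizer is used

# The tokenizer would then be created roughly as in utils.py:
# tokenizer = AutoTokenizer.from_pretrained(
#     pretrain, trust_remote_code=True, use_fast=use_fast, **sp_tokens)
```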
It still doesn't work, same bug. What could the problem be?
Use `--disable_fast_tokenizer`.
Problem solved, it works now. Appreciated!
BTW, is this loss normal?
It makes more sense to look at acc mean
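For context, the "acc mean" in DPO is typically the fraction of preference pairs where the implicit reward of the chosen response beats that of the rejected one. A minimal sketch (function and variable names are illustrative; the margin values are made-up, not real training data):

```python
# Pairwise DPO accuracy: fraction of (chosen, rejected) pairs where
# the chosen response's implicit reward margin is higher.
def dpo_accuracy(chosen_rewards, rejected_rewards):
    hits = sum(c > r for c, r in zip(chosen_rewards, rejected_rewards))
    return hits / len(chosen_rewards)

# Illustrative log-prob-ratio margins for four pairs.
chosen = [0.8, -0.1, 1.2, 0.3]
rejected = [0.2, 0.4, 0.5, -0.6]
print(dpo_accuracy(chosen, rejected))  # 0.75 -> 3 of 4 pairs ranked correctly
```

An acc mean trending above 0.5 and rising is the usual sign that training is working, even when the raw loss curve looks noisy.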
When I use `train_dpo_llama_34b.sh` to run DPO on Yi-34B-Chat, I get an "Array out of bounds" kind of error.
I'm using the Hugging Face Yi-34B-Chat checkpoint and tokenizer, and I didn't modify any tokenizer-related code. Please help.