OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0
1.73k stars, 164 forks

When DPO Yi-34B Assertion `srcIndex < srcSelectDimSize` failed #240

Closed victorShawFan closed 3 months ago

victorShawFan commented 3 months ago

When I use train_dpo_llama_34b.sh to run DPO on Yi-34B-Chat, I hit an "array out of bounds" type of error.

[three WeCom screenshots attached]

I use the Hugging Face Yi-34B-Chat checkpoint and tokenizer, and I didn't modify any tokenizer-related code. Please help.
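For context, `srcIndex < srcSelectDimSize` is the bounds check in PyTorch's CUDA index-select kernel, and during training it commonly fires when a token id is at or beyond the number of rows in the embedding table. A minimal sketch for checking a batch before it reaches the GPU, assuming the ids are available as plain lists of ints; `find_out_of_range_ids` and the vocabulary size used here are hypothetical illustrations, not part of OpenRLHF:

```python
from typing import List, Tuple


def find_out_of_range_ids(
    batch: List[List[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, token_id) for every id outside [0, vocab_size)."""
    bad = []
    for row, ids in enumerate(batch):
        for col, token_id in enumerate(ids):
            if token_id < 0 or token_id >= vocab_size:
                bad.append((row, col, token_id))
    return bad


# Illustrative values only: a 64000-row embedding table and one stray id.
batch = [[1, 5, 63999], [2, 64000, 7]]
problems = find_out_of_range_ids(batch, vocab_size=64000)
print(problems)
```

Any entry in the result points at a token the embedding layer cannot look up, which usually means the tokenizer and the checkpoint disagree about the vocabulary.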

hijkzzz commented 3 months ago

Thank you for your feedback, we will reproduce and fix it as soon as possible.

hijkzzz commented 3 months ago

Please set use_fast=False in openrlhf/utils/utils.py:

tokenizer = AutoTokenizer.from_pretrained(pretrain, trust_remote_code=True, use_fast=False, **sp_tokens)

hijkzzz commented 3 months ago

Now you can use the "--disable_fast_tokenizer" option in train_dpo.py.
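A rough sketch of how a flag like this could map onto the `use_fast` argument, using only `argparse`. This is an assumption about the wiring, not the actual code in train_dpo.py; `build_parser` and `tokenizer_kwargs` are hypothetical names:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Off by default, so the fast tokenizer is used unless the flag is passed.
    parser.add_argument("--disable_fast_tokenizer", action="store_true", default=False)
    return parser


def tokenizer_kwargs(args: argparse.Namespace) -> dict:
    # Hypothetical helper: invert the flag to get the use_fast value.
    return {"trust_remote_code": True, "use_fast": not args.disable_fast_tokenizer}


args = build_parser().parse_args(["--disable_fast_tokenizer"])
kwargs = tokenizer_kwargs(args)
print(kwargs)
# The dict would then be forwarded as AutoTokenizer.from_pretrained(pretrain, **kwargs).
```

Because the flag is a `store_true` switch, it has to appear on the command line to take effect; leaving it out keeps `use_fast=True`.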

victorShawFan commented 3 months ago

It still doesn't work; same bug. What could the problem be?

hijkzzz commented 3 months ago

> It still doesn't work; same bug. What could the problem be?

Use the "--disable_fast_tokenizer" option.

victorShawFan commented 3 months ago

Problem solved, it works now. Much appreciated.

[screenshot attached]

By the way, is this loss normal?

hijkzzz commented 3 months ago

It makes more sense to look at the acc mean (mean accuracy) than at the loss.
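To make the comparison concrete: in the standard DPO formulation the per-pair loss is -log(sigmoid(margin)), where the margin is the chosen implicit reward minus the rejected one, and the accuracy is the fraction of pairs whose chosen reward is higher. A minimal sketch in plain Python, assuming per-sequence log-probabilities are already computed; the function name, the beta value, and the numbers are illustrative and not taken from OpenRLHF's trainer:

```python
import math
from typing import List, Tuple


def dpo_loss_and_acc(
    policy_chosen: List[float],
    policy_rejected: List[float],
    ref_chosen: List[float],
    ref_rejected: List[float],
    beta: float = 0.1,
) -> Tuple[float, float]:
    """Return (mean loss, mean accuracy) over a batch of preference pairs."""
    losses = []
    correct = 0
    for pc, pr, rc, rr in zip(policy_chosen, policy_rejected, ref_chosen, ref_rejected):
        chosen_reward = beta * (pc - rc)      # implicit reward of the chosen answer
        rejected_reward = beta * (pr - rr)    # implicit reward of the rejected answer
        margin = chosen_reward - rejected_reward
        # -log(sigmoid(margin)) written as log(1 + exp(-margin));
        # fine for the small margins here, may overflow for very negative ones.
        losses.append(math.log1p(math.exp(-margin)))
        correct += int(chosen_reward > rejected_reward)
    n = len(losses)
    return sum(losses) / n, correct / n


# Policy identical to the reference: margin is 0, so the loss is ln 2.
start_loss, start_acc = dpo_loss_and_acc([-10.0], [-12.0], [-10.0], [-12.0])
# Policy has moved toward the chosen answer and away from the rejected one.
later_loss, later_acc = dpo_loss_and_acc([-10.0], [-12.0], [-11.0], [-11.0])
print(start_loss, start_acc, later_loss, later_acc)
```

Under these assumptions a loss near ln 2 (about 0.693) only says the policy has not yet separated from the reference, whereas the accuracy shows directly how often the chosen answer is being preferred.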