Open · ritwickchaudhry opened this issue 3 months ago
Same problem here.
I encountered the same type of error when training with the Phi-3 mini-4k model. I then changed the following lines in train.py and conversations.py, respectively, and it seemed to work well.
# def preprocess_phi3( ... in train.py
-    else:
-        round_len -= 2
-        instruction_len -= 2
+    else:
+        round_len += 1
+        instruction_len += 1
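If it helps, here is an illustrative way to see what the updated tokenizer does with the role markers and BOS, which is what these length corrections have to account for. This is not code from this repo, just a quick check you can run against your tokenizer version:

```python
# Illustrative check (not repo code): inspect how the current Phi-3 tokenizer
# handles the role markers and BOS, which the round_len / instruction_len
# corrections in preprocess_phi3 have to account for.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

for text in ["<|user|>", "<|assistant|>", "<|end|>", "\n<|user|>\n"]:
    ids = tok(text).input_ids
    print(repr(text), ids, tok.convert_ids_to_tokens(ids))

# Also compare lengths with and without special tokens; any per-call
# difference is what the +1 / -2 corrections compensate for.
print(len(tok("hello", add_special_tokens=True).input_ids),
      len(tok("hello", add_special_tokens=False).input_ids))
```

The matching template change in conversations.py is below.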
# conv_phi3_instruct = Conversation( ... in conversations.py
-    roles=("\n<|user|>\n", "\n<|assistant|>\n"),
+    roles=("<|user|>", "<|assistant|>"),
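To double-check the role markers themselves, you can render a short dialogue with the tokenizer's own chat template (again just an illustrative snippet, not from this repo) and compare it with the string that conv_phi3_instruct builds:

```python
# Illustrative: print the prompt format the updated Phi-3 tokenizer itself
# expects, then compare it with the conversation string used for training.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
messages = [
    {"role": "user", "content": "What is in the image?"},
    {"role": "assistant", "content": "A cat."},
]
print(tok.apply_chat_template(messages, tokenize=False))
```

If the newlines around `<|user|>` / `<|assistant|>` differ between the two, the round and instruction lengths computed in preprocess_phi3 drift by a token or two, which is typically what triggers the tokenization mismatch warning.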
Hi! Thanks for the amazing work. I am trying to use the Phi-3 Mini 128K model. Unfortunately, I get a tokenization mismatch error (relevant code), and it errors even with the 4K model. Can you please advise on why the issue exists and/or what changes to the preprocessing code are needed to support these models? I think it mainly has to do with the change to the Phi-3 models made in July.