Hi @Luo-Z13,

Thank you for your interest in our work. Could you please confirm which base LLM and --version value you are using?

This issue arises when you are either using the wrong base LLM or have set a different value for --version. Please make sure to use meta-llama/Meta-Llama-3-8B-Instruct as the base model for LLaMA-3 based trainings, and microsoft/Phi-3-mini-4k-instruct as the base model for Phi-3 based trainings.

Let me know if the issue persists. Thank you!
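For illustration, a quick sanity check along these lines can catch a mismatched pairing before launching a run. This is a minimal hypothetical sketch, not code from the repo: the helper name and the mapping are assumptions based on this thread, where only llama3 is confirmed as a --version value.

```python
# Hypothetical helper: verify that the base checkpoint matches the --version
# template before training. Only the "llama3" value is confirmed in this
# thread; the "phi3" key is an assumption for the Phi-3 recipe.
EXPECTED_BASE = {
    "llama3": "meta-llama/Meta-Llama-3-8B-Instruct",
    "phi3": "microsoft/Phi-3-mini-4k-instruct",
}

def check_base_model(version: str, model_name_or_path: str) -> None:
    expected = EXPECTED_BASE.get(version)
    if expected is None:
        raise ValueError(f"Unknown --version value: {version!r}")
    if model_name_or_path != expected:
        raise ValueError(
            f"--version {version!r} expects base model {expected!r}, "
            f"but got {model_name_or_path!r}"
        )

check_base_model("llama3", "meta-llama/Meta-Llama-3-8B-Instruct")  # passes
```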
Thank you for the reminder, I have found the reason: LLaMA-3 once updated its tokenizer_config.json file. I had downloaded the version from April 15th, but I have now updated it to the latest version and everything is working fine. Once again, I really appreciate your patient response!
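For anyone who hits the same stale-cache problem, one way to refresh the files is to force a re-download past the local Hugging Face cache. This is a minimal sketch of a generic transformers pattern, not something prescribed by the repo:

```python
from transformers import AutoTokenizer

# Bypass the local cache so an outdated tokenizer_config.json (e.g. the
# April 15th snapshot) is replaced by the latest files from the Hub.
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    force_download=True,
)
print(type(tokenizer).__name__)
```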
Hi @Luo-Z13,

The pip dependency warning can be ignored. The TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType error occurs during LLaMA-3 based model training. LLaMA-3 does not actually use any pad token; however, during LLaVA-LLaMA-3 training we need one. The workaround is to add a special token and resize the embeddings. This is done at https://github.com/mbzuai-oryx/LLaVA-pp/blob/b93d9c8d8539e794fc79a867aae08c4d7b3b6de7/LLaMA-3-V/train.py#L1015.

Please make sure that the baseline official LLaVA code is working properly, and then copy all the LLaMA-3 related files into the corresponding directory. Lastly, please note that to run LLaMA-3 based training you need to pass --version llama3.

I hope this helps solve the issue. Good luck.
Originally posted by @mmaaz60 in https://github.com/mbzuai-oryx/LLaVA-pp/issues/8#issuecomment-2088868773
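For reference, the add-a-pad-token-and-resize pattern described above typically looks like the following. This is a minimal sketch of the general idea, not the repo's exact code: the `<pad>` token string is an assumption, and train.py may use a different token and initialize the new embedding rows differently (LLaVA-style code often mean-initializes them).

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

seqs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5])]

# LLaMA-3's tokenizer ships with pad_token_id == None, so this would raise:
# TypeError: pad_sequence(): argument 'padding_value' (position 3) must be
# float, not NoneType
# pad_sequence(seqs, batch_first=True, padding_value=tokenizer.pad_token_id)

# Workaround: register a pad token, then grow the embedding matrix so the
# new token id has a corresponding row.
num_added = tokenizer.add_special_tokens({"pad_token": "<pad>"})  # assumed token string
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))

# Now pad_token_id is a valid int and batching works.
padded = pad_sequence(seqs, batch_first=True, padding_value=tokenizer.pad_token_id)
```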
Thank you very much! My previously reported TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType issue has been resolved after copying the right train.py file; thanks for your advice on that matter.

However, I still encounter a tokenization mismatch issue during training. My current environment:

And the beginning of the training output is as follows: