mbzuai-oryx / LLaVA-pp

πŸ”₯πŸ”₯ LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Using Phi-3 with LLaVA, but some fields of the Phi-3 network are not supported #7

Closed hellangleZ closed 6 months ago

hellangleZ commented 6 months ago

Hi team:

I followed every step in your guide.

I used the same dataset and the newest version of this repo, and also updated LLaVA to the newest version.

I also copied every new Python file into the LLaVA folder.

But it still reports this error:

[screenshots of the error]

It occurs not only in the pretraining process but also in the FT process, and after FT the model cannot be used.

My script:

#!/bin/bash

deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --model_name_or_path /data2/phi3-instruct/ \
    --version plain \
    --data_path ./playground/data/blip_laion_cc_sbu_558k.json \
    --image_folder ./playground/data/images \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --tune_mm_mlp_adapter True \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --bf16 True \
    --output_dir ./checkpoints/llava-v1.5-phi3-mini-pretrain_v2 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 24000 \
    --save_total_limit 1 \
    --learning_rate 1e-3 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to "tensorboard"

hanoonaR commented 6 months ago

Hi @hellangleZ ,

Regarding the warnings and errors you're encountering:

1) Warnings about using a model of a different type (e.g., phi3 for llava_phi) can typically be ignored if the model is functioning as expected. ('You are using a model of type phi3 to instantiate a model of type llava_phi. This is not supported for all configurations of models and can yield errors.' and 'you should train this model on downstream task..')

2) Concerning the error with some weights not being initialized from the checkpoint: it appears there might be a mix-up in the model class being initialized. Please ensure that you are initializing the LlavaPhiForCausalLM instead of LlavaLlamaForCausalLM with the checkpoint microsoft/Phi-3-mini-4k-instruct. The __init__.py should be set correctly to import LlavaPhiForCausalLM. You can do this by executing:

cp Phi-3-V/main__init__.py LLaVA/llava/__init__.py

Please double-check these details, and if the issue persists, we can investigate further.
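
A quick way to confirm that the copy took effect and that the Phi class is the one being picked up (a minimal sketch; it assumes you run it from the LLaVA directory with the copied files in place):

# Sketch: verify the patched package exposes the Phi class. An ImportError
# here means llava/__init__.py or llava/model/__init__.py was not replaced.
from llava.model import LlavaPhiForCausalLM

print(LlavaPhiForCausalLM.__module__)  # expected to point at the llava_phi3 module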

Luo-Z13 commented 6 months ago

Hello @hanoonaR , there are tokenization mismatch warnings when finetuning LLaMA3-V:

WARNING: tokenization mismatch: 240 vs. 243. (ignored)
WARNING: tokenization mismatch: 452 vs. 455. (ignored)
WARNING: tokenization mismatch: 439 vs. 442. (ignored)

I am using transformers==4.41.0.dev0 and tokenizers==0.19.1. Is this issue related to the tokenizers version, as in https://github.com/haotian-liu/LLaVA/issues/661 ?

hellangleZ commented 6 months ago

Hi @hellangleZ ,

Regarding the warnings and errors you're encountering:

  1. Warnings about using a model of a different type (e.g., phi3 for llava_phi) can typically be ignored if the model is functioning as expected. ('You are using a model of type phi3 to instantiate a model of type llava_phi. This is not supported for all configurations of models and can yield errors.' and 'you should train this model on downstream task..')
  2. Concerning the error with some weights not being initialized from the checkpoint: it appears there might be a mix-up in the model class being initialized. Please ensure that you are initializing the LlavaPhiForCausalLM instead of LlavaLlamaForCausalLM with the checkpoint microsoft/Phi-3-mini-4k-instruct. The __init__.py should be set correctly to import LlavaPhiForCausalLM. You can do this by executing:

cp Phi-3-V/main__init__.py LLaVA/llava/__init__.py

Please double-check these details, and if the issue persists, we can investigate further.

Hi @hanoonaR

It's still like this: [screenshot of the same error]

And actually my __init__.py already supports LlavaPhi: [screenshot of __init__.py]

hanoonaR commented 6 months ago

Hi @Luo-Z13 ,

Based on your description, it sounds like there might be a configuration issue with the tokenizer or model.

Here are a few steps to resolve the issue:

1) Ensure that you are using the correct conversation template, specifically conv_llama3, by setting --version llama3, as mismatches in templates can lead to tokenization issues.

2) Confirm that you are using the correct model, meta-llama/Meta-Llama-3-8B-Instruct, for this particular task. Using an incompatible model can also result in tokenization mismatches.

3) Could you please verify whether pinning your libraries specifically to tokenizers 0.19.1 and transformers 4.41.0.dev0 resolves the issue? Sometimes even minor version differences can cause unexpected behavior.

Please try these suggestions and let us know if the problem persists.
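
For point 3, a quick way to check which versions are actually installed in the training environment (a minimal check):

# Print the installed versions to compare against the suggested ones
# (transformers 4.41.0.dev0 and tokenizers 0.19.1).
import tokenizers
import transformers

print("transformers:", transformers.__version__)
print("tokenizers:", tokenizers.__version__)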

hanoonaR commented 6 months ago

@hellangleZ

Can you please show the full error message, not just the parameter names? What does the error message say in "Some weights of ..."?

hanoonaR commented 6 months ago

@hellangleZ ,

Can you try and see if this solves the issue?

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

hellangleZ commented 6 months ago

@hellangleZ

Can you please show the full error message, not just the parameter names? What does the error message say in "Some weights of ..."?

Hi @hanoonaR, the snapshot is the error message, and the full error log is below:

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2/2 [00:00<00:00, 2.09it/s] Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint at /data2/phi3-instruct/ and are newly initialized: ['model.layers.0.mlp.gate_proj.weight', 'model.layers.0.mlp.up_proj.weight', 'model.layers.0.self_attn.k_proj.weight', 'model.layers.0.self_attn.q_proj.weight', 'model.layers.0.self_attn.v_proj.weight', 'model.layers.1.mlp.gate_proj.weight', 'model.layers.1.mlp.up_proj.weight', 'model.layers.1.self_attn.k_proj.weight', 'model.layers.1.self_attn.q_proj.weight', 'model.layers.1.self_attn.v_proj.weight', 'model.layers.10.mlp.gate_proj.weight', 'model.layers.10.mlp.up_proj.weight', 'model.layers.10.self_attn.k_proj.weight', 'model.layers.10.self_attn.q_proj.weight', 'model.layers.10.self_attn.v_proj.weight', 'model.layers.11.mlp.gate_proj.weight', 'model.layers.11.mlp.up_proj.weight', 'model.layers.11.self_attn.k_proj.weight', 'model.layers.11.self_attn.q_proj.weight', 'model.layers.11.self_attn.v_proj.weight', 'model.layers.12.mlp.gate_proj.weight', 'model.layers.12.mlp.up_proj.weight', 'model.layers.12.self_attn.k_proj.weight', 'model.layers.12.self_attn.q_proj.weight', 'model.layers.12.self_attn.v_proj.weight', 'model.layers.13.mlp.gate_proj.weight', 'model.layers.13.mlp.up_proj.weight', 'model.layers.13.self_attn.k_proj.weight', 'model.layers.13.self_attn.q_proj.weight', 'model.layers.13.self_attn.v_proj.weight', 'model.layers.14.mlp.gate_proj.weight', 'model.layers.14.mlp.up_proj.weight', 'model.layers.14.self_attn.k_proj.weight', 'model.layers.14.self_attn.q_proj.weight', 'model.layers.14.self_attn.v_proj.weight', 'model.layers.15.mlp.gate_proj.weight', 'model.layers.15.mlp.up_proj.weight', 'model.layers.15.self_attn.k_proj.weight', 'model.layers.15.self_attn.q_proj.weight', 'model.layers.15.self_attn.v_proj.weight', 'model.layers.16.mlp.gate_proj.weight', 'model.layers.16.mlp.up_proj.weight', 'model.layers.16.self_attn.k_proj.weight', 'model.layers.16.self_attn.q_proj.weight', 'model.layers.16.self_attn.v_proj.weight', 'model.layers.17.mlp.gate_proj.weight', 'model.layers.17.mlp.up_proj.weight', 'model.layers.17.self_attn.k_proj.weight', 'model.layers.17.self_attn.q_proj.weight', 'model.layers.17.self_attn.v_proj.weight', 'model.layers.18.mlp.gate_proj.weight', 'model.layers.18.mlp.up_proj.weight', 'model.layers.18.self_attn.k_proj.weight', 'model.layers.18.self_attn.q_proj.weight', 'model.layers.18.self_attn.v_proj.weight', 'model.layers.19.mlp.gate_proj.weight', 'model.layers.19.mlp.up_proj.weight', 'model.layers.19.self_attn.k_proj.weight', 'model.layers.19.self_attn.q_proj.weight', 'model.layers.19.self_attn.v_proj.weight', 'model.layers.2.mlp.gate_proj.weight', 'model.layers.2.mlp.up_proj.weight', 'model.layers.2.self_attn.k_proj.weight', 'model.layers.2.self_attn.q_proj.weight', 'model.layers.2.self_attn.v_proj.weight', 'model.layers.20.mlp.gate_proj.weight', 'model.layers.20.mlp.up_proj.weight', 'model.layers.20.self_attn.k_proj.weight', 
'model.layers.20.self_attn.q_proj.weight', 'model.layers.20.self_attn.v_proj.weight', 'model.layers.21.mlp.gate_proj.weight', 'model.layers.21.mlp.up_proj.weight', 'model.layers.21.self_attn.k_proj.weight', 'model.layers.21.self_attn.q_proj.weight', 'model.layers.21.self_attn.v_proj.weight', 'model.layers.22.mlp.gate_proj.weight', 'model.layers.22.mlp.up_proj.weight', 'model.layers.22.self_attn.k_proj.weight', 'model.layers.22.self_attn.q_proj.weight', 'model.layers.22.self_attn.v_proj.weight', 'model.layers.23.mlp.gate_proj.weight', 'model.layers.23.mlp.up_proj.weight', 'model.layers.23.self_attn.k_proj.weight', 'model.layers.23.self_attn.q_proj.weight', 'model.layers.23.self_attn.v_proj.weight', 'model.layers.24.mlp.gate_proj.weight', 'model.layers.24.mlp.up_proj.weight', 'model.layers.24.self_attn.k_proj.weight', 'model.layers.24.self_attn.q_proj.weight', 'model.layers.24.self_attn.v_proj.weight', 'model.layers.25.mlp.gate_proj.weight', 'model.layers.25.mlp.up_proj.weight', 'model.layers.25.self_attn.k_proj.weight', 'model.layers.25.self_attn.q_proj.weight', 'model.layers.25.self_attn.v_proj.weight', 'model.layers.26.mlp.gate_proj.weight', 'model.layers.26.mlp.up_proj.weight', 'model.layers.26.self_attn.k_proj.weight', 'model.layers.26.self_attn.q_proj.weight', 'model.layers.26.self_attn.v_proj.weight', 'model.layers.27.mlp.gate_proj.weight', 'model.layers.27.mlp.up_proj.weight', 'model.layers.27.self_attn.k_proj.weight', 'model.layers.27.self_attn.q_proj.weight', 'model.layers.27.self_attn.v_proj.weight', 'model.layers.28.mlp.gate_proj.weight', 'model.layers.28.mlp.up_proj.weight', 'model.layers.28.self_attn.k_proj.weight', 'model.layers.28.self_attn.q_proj.weight', 'model.layers.28.self_attn.v_proj.weight', 'model.layers.29.mlp.gate_proj.weight', 'model.layers.29.mlp.up_proj.weight', 'model.layers.29.self_attn.k_proj.weight', 'model.layers.29.self_attn.q_proj.weight', 'model.layers.29.self_attn.v_proj.weight', 'model.layers.3.mlp.gate_proj.weight', 'model.layers.3.mlp.up_proj.weight', 'model.layers.3.self_attn.k_proj.weight', 'model.layers.3.self_attn.q_proj.weight', 'model.layers.3.self_attn.v_proj.weight', 'model.layers.30.mlp.gate_proj.weight', 'model.layers.30.mlp.up_proj.weight', 'model.layers.30.self_attn.k_proj.weight', 'model.layers.30.self_attn.q_proj.weight', 'model.layers.30.self_attn.v_proj.weight', 'model.layers.31.mlp.gate_proj.weight', 'model.layers.31.mlp.up_proj.weight', 'model.layers.31.self_attn.k_proj.weight', 'model.layers.31.self_attn.q_proj.weight', 'model.layers.31.self_attn.v_proj.weight', 'model.layers.4.mlp.gate_proj.weight', 'model.layers.4.mlp.up_proj.weight', 'model.layers.4.self_attn.k_proj.weight', 'model.layers.4.self_attn.q_proj.weight', 'model.layers.4.self_attn.v_proj.weight', 'model.layers.5.mlp.gate_proj.weight', 'model.layers.5.mlp.up_proj.weight', 'model.layers.5.self_attn.k_proj.weight', 'model.layers.5.self_attn.q_proj.weight', 'model.layers.5.self_attn.v_proj.weight', 'model.layers.6.mlp.gate_proj.weight', 'model.layers.6.mlp.up_proj.weight', 'model.layers.6.self_attn.k_proj.weight', 'model.layers.6.self_attn.q_proj.weight', 'model.layers.6.self_attn.v_proj.weight', 'model.layers.7.mlp.gate_proj.weight', 'model.layers.7.mlp.up_proj.weight', 'model.layers.7.self_attn.k_proj.weight', 'model.layers.7.self_attn.q_proj.weight', 'model.layers.7.self_attn.v_proj.weight', 'model.layers.8.mlp.gate_proj.weight', 'model.layers.8.mlp.up_proj.weight', 'model.layers.8.self_attn.k_proj.weight', 'model.layers.8.self_attn.q_proj.weight', 
'model.layers.8.self_attn.v_proj.weight', 'model.layers.9.mlp.gate_proj.weight', 'model.layers.9.mlp.up_proj.weight', 'model.layers.9.self_attn.k_proj.weight', 'model.layers.9.self_attn.q_proj.weight', 'model.layers.9.self_attn.v_proj.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. /data22/llava/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
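
For context on why exactly those parameters are reported as newly initialized: Phi-3-mini stores its attention and MLP weights as fused qkv_proj and gate_up_proj tensors, so when the checkpoint is loaded into the Llama-style class the separate q/k/v and gate/up projections have nothing to load from. A quick way to inspect the checkpoint's actual parameter names (a sketch; it downloads the model):

# List layer-0 parameter names in the Phi-3 checkpoint. With the fused
# qkv_proj / gate_up_proj layout, the q_proj/k_proj/v_proj and
# gate_proj/up_proj names in the warning have no counterpart to load.
from transformers import AutoModelForCausalLM

phi = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
print([name for name, _ in phi.named_parameters() if name.startswith("model.layers.0.")])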

hellangleZ commented 6 months ago

@hellangleZ ,

Can you try and see if this solves the issue?

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

After running the environment variable command, nothing changed.

[screenshot of the same error]

hellangleZ commented 6 months ago

@hellangleZ ,

Can you try and see if this solves the issue?

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

I can share my environment:

python 3.10.14
torch 2.1.2
torchvision 0.16.2
ninja 1.11.1.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.535.108
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
deepspeed 0.14.2
transformers 4.41.0.dev0
triton 2.1.0
tokenizers 0.19.1

I think the packages above are the ones relevant to this project.

hanoonaR commented 6 months ago

Hi @hellangleZ ,

Please run the following command:

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

Then run bash LLaMA3-V_pretrain.sh or your training command.


If this does not solve the problem, please provide the detailed steps you have followed to run the training (from cloning to running the script). Thank you.
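
A quick check that the path change is actually being picked up (a sketch; run it from the LLaVA directory after exporting PYTHONPATH):

# Confirm Python resolves the local, patched package rather than an
# older pip-installed copy of llava.
import llava

print(llava.__file__)  # expected to point at ./llava/__init__.py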

hellangleZ commented 6 months ago

It seems LLaVA still uses llava_llama and LlavaLlamaForCausalLM. I tried to find and change the source code so that it picks up the Phi class instead, but it still does not work.

hellangleZ commented 6 months ago

Hi @hellangleZ ,

Please run the following command:

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

Then run bash LLaMA3-V_pretrain.sh or your training command.

If this does not solve the problem, please provide the detailed steps you have followed to run the training (from cloning to running the script). Thank you.

Still the same.

1-
git clone https://github.com/mbzuai-oryx/LLaVA-pp.git
cd LLaVA-pp
git submodule update --init --recursive

2-
pip install git+https://github.com/huggingface/transformers@a98c41798cf6ed99e1ff17e3792d6e06a2ff2ff3

3-

cp Phi-3-V/train.py LLaVA/llava/train/train.py
cp Phi-3-V/llava_phi3.py LLaVA/llava/model/language_model/llava_phi3.py
cp Phi-3-V/builder.py LLaVA/llava/model/builder.py
cp Phi-3-V/model__init__.py LLaVA/llava/model/__init__.py
cp Phi-3-V/main__init__.py LLaVA/llava/__init__.py
cp Phi-3-V/conversation.py LLaVA/llava/conversation.py

cp scripts/Phi3-V_pretrain.sh LLaVA/Vi-phi3_pretrain.sh
cp scripts/Phi3-V_finetune_lora.sh LLaVA/Vi-phi3_finetune_lora.sh

4-
[screenshot of the training run]

All the steps follow the step-by-step instructions provided by the project.
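
A quick way to double-check that the copied files are the ones actually imported, and whether the imported train.py references the Phi class (a sketch; run from the LLaVA directory with PYTHONPATH set as suggested above):

# Inspect the train module that Python actually loads. If the second print
# shows False, the loaded train.py is still wired to the Llama class only.
import inspect

import llava.train.train as train_mod

print(train_mod.__file__)
print("LlavaPhiForCausalLM" in inspect.getsource(train_mod))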

Luo-Z13 commented 6 months ago

Hi @Luo-Z13 ,

Based on your description, it sounds like there might be a configuration issue with the tokenizer or model.

Here are a few steps to resolve the issue:

  1. Ensure that you are using the correct conversation template, specifically conv_llama3, by setting --version llama3, as mismatches in templates can lead to tokenization issues.
  2. Confirm that you are using the correct model, meta-llama/Meta-Llama-3-8B-Instruct, for this particular task. Using an incompatible model can also result in tokenization mismatches.
  3. Could you please verify whether pinning your libraries specifically to tokenizers 0.19.1 and transformers 4.41.0.dev0 resolves the issue? Sometimes even minor version differences can cause unexpected behavior.

Please try these suggestions and let us know if the problem persists.

Thank you for your suggestions. For the conversation template, I keep --version llama3 in the script. Do I need to change the conversations in the instruction-tuning dataset? Currently I keep the LLaVA-Instruct format as:

[ { "from": "human", 
"value": "<image>\n...k?" }, 
{ "from": "gpt", "value": "To ...." } ]

Do I need to change this format?

hellangleZ commented 6 months ago

@hellangleZ ,

Can you try and see if this solves the issue?

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

@hanoonaR

Could someone help look through the new code, e.g. train.py?

I saw that the train method still chooses LlavaLlamaForCausalLM. Am I right?

[screenshot of train.py]

hanoonaR commented 6 months ago

Hi @hellangleZ ,

You are right. We have just made a commit to fix the issue. Please check this. Thank you for bringing this to our attention. Apologies for the inconvenience and oversight.
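
For anyone reading along before pulling that commit, the change essentially makes train.py instantiate the Phi class when a Phi-3 checkpoint is passed, instead of always falling back to LlavaLlamaForCausalLM. A rough sketch of that kind of dispatch (illustrative only, using the class names from this thread; see the actual commit for the real change):

# Illustrative sketch only, not the committed fix. Loading a Phi-3 checkpoint
# into LlavaLlamaForCausalLM is what produces the "newly initialized"
# q/k/v and gate/up projection warnings shown earlier in this thread.
from llava.model import LlavaLlamaForCausalLM, LlavaPhiForCausalLM


def build_causal_lm(model_name_or_path, **from_pretrained_kwargs):
    """Pick the LLaVA model class that matches the base checkpoint."""
    if "phi" in model_name_or_path.lower():
        model_cls = LlavaPhiForCausalLM   # e.g. microsoft/Phi-3-mini-4k-instruct
    else:
        model_cls = LlavaLlamaForCausalLM
    return model_cls.from_pretrained(model_name_or_path, **from_pretrained_kwargs)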

hanoonaR commented 6 months ago

Hi @Luo-Z13 , Based on your description, it sounds like there might be a configuration issue with the tokenizer or model. Here are a few steps to resolve the issue:

  1. Ensure that you are using the correct conversation template, specifically conv_llama3, by setting --version llama3, as mismatches in templates can lead to tokenization issues.
  2. Confirm that you are using the correct model, meta-llama/Meta-Llama-3-8B-Instruct, for this particular task. Using an incompatible model can also result in tokenization mismatches.
  3. Could you please verify whether pinning your libraries specifically to tokenizers 0.19.1 and transformers 4.41.0.dev0 resolves the issue? Sometimes even minor version differences can cause unexpected behavior.

Please try these suggestions and let us know if the problem persists.

Thank you for your suggestions. For the conversation template, I keep --version llama3 in the script. Do I need to change the conversations in the instruction-tuning dataset? Currently I keep the LLaVA-Instruct format as:

[ { "from": "human", 
"value": "<image>\n...k?" }, 
{ "from": "gpt", "value": "To ...." } ]

Do I need to change this format?

No, you do not need to change the format of the instructions; using --version llama3 is sufficient. Please let us know whether this, or fixing the transformers and tokenizers versions, solves the issue.

hellangleZ commented 6 months ago

Hi @hellangleZ ,

You are right. We have just made a commit to fix the issue. Please check this. Thank you for bringing this to our attention. Apologies for the inconvenience and oversight.

Hi, @hanoonaR

Changing that alone does not fix it yet. I just tested again; maybe some Python file still uses the old method, but I can't find it. Please help check and fix it.

Thanks

Luo-Z13 commented 6 months ago

Hi @Luo-Z13 , Based on your description, it sounds like there might be a configuration issue with the tokenizer or model. Here are a few steps to resolve the issue:

  1. Ensure that you are using the correct conversation template, specifically conv_llama3, by setting --version llama3, as mismatches in templates can lead to tokenization issues.
  2. Confirm that you are using the correct model, meta-llama/Meta-Llama-3-8B-Instruct, for this particular task. Using an incompatible model can also result in tokenization mismatches.
  3. Could you please verify whether pinning your libraries specifically to tokenizers 0.19.1 and transformers 4.41.0.dev0 resolves the issue? Sometimes even minor version differences can cause unexpected behavior.

Please try these suggestions and let us know if the problem persists.

Thank you for your suggestions. For the conversation template, I keep --version llama3 in the script. Do I need to change the conversations in the instruction-tuning dataset? Currently I keep the LLaVA-Instruct format as:

[ { "from": "human", 
"value": "<image>\n...k?" }, 
{ "from": "gpt", "value": "To ...." } ]

Do I need to change this format?

No, you do not need to change the format of the instructions; using --version llama3 is sufficient. Please let us know whether this, or fixing the transformers and tokenizers versions, solves the issue.

OK, it seems that adjusting the versions of transformers and tokenizers doesn't solve this issue, and there were warnings when I installed them, as follows:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llava 1.2.2.post1 requires accelerate==0.21.0, but you have accelerate 0.27.2 which is incompatible.
llava 1.2.2.post1 requires tokenizers==0.15.1, but you have tokenizers 0.19.1 which is incompatible.
llava 1.2.2.post1 requires transformers==4.37.2, but you have transformers 4.41.0.dev0 which is incompatible.

Luo-Z13 commented 6 months ago

Would setting use_fast=True solve this issue? Like https://github.com/haotian-liu/LLaVA/issues/661#issuecomment-1798034258

mmaaz60 commented 6 months ago

Hi Everyone,

Please refer to the comment at https://github.com/mbzuai-oryx/LLaVA-pp/issues/8#issuecomment-2085863255. This will be helpful. Thank You and Good Luck !

hellangleZ commented 6 months ago

Hi Everyone,

Please refer to the comment at #8 (comment). This will be helpful. Thank You and Good Luck !

Hi @mmaaz60 and @hanoonaR

Thanks for the support.

It's working well now.