dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0

inference error #27

Closed liziming5353 closed 5 months ago

liziming5353 commented 6 months ago

When loading the provided 13b-full model for evaluation, an error is reported:

```
[2023-12-24 23:39:32,495] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "llamavid/model_msvd_qa.py", line 161, in <module>
    run_inference(args)
  File "llamavid/model_msvd_qa.py", line 69, in run_inference
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, args.model_max_length)
  File "llamavid/llamavid/model/builder.py", line 56, in load_pretrained_model
    tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
  File "/home/miniconda3/envs/llamavid/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 718, in from_pretrained
    tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
  File "/home/miniconda3/envs/llamavid/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 663, in __getitem__
    model_type = self._reverse_config_mapping[key.name]
KeyError: 'LlavaConfig'
```

How can I solve this?

yanwei-li commented 6 months ago

Hi, I tried the 13B model but did not hit any error. Could you please provide more details, such as the command and the model used for inference?

liziming5353 commented 6 months ago

```shell
for IDX in $(seq 0 $((CHUNKS-1))); do
    CUDA_VISIBLE_DEVICES=${GPULIST[$IDX]} nohup python model_msvd_qa.py \
        --model-path work_dirs/llama-vid-13b-full-224-video-fps-1 \
        --video_dir eval_datasets/msvd/videos \
        --gt_file eval_datasets/msvd/test_qa.json \
        --output_dir ./results \
        --output_name msvd_office_13b \
        --num-chunks $CHUNKS \
        --chunk-idx $IDX \
        --conv-mode vicuna_v1 > ./logs/msvd_office13b${IDX} 2>&1 &
done
```

yanwei-li commented 6 months ago

I tried the inference code and it runs well on my side. Do you have transformers==4.31.0 installed in your environment? A `KeyError: 'LlavaConfig'` from `AutoTokenizer` usually means the installed transformers version does not match the one the repo's custom `LlavaConfig` was registered against.
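A quick way to verify the pin before re-running inference (a minimal sketch; the `REQUIRED` value of 4.31.0 comes from the comment above, and the helper names here are illustrative, not part of the repo):

```python
# Minimal sketch: check the installed transformers version against the
# pin suggested above (assumption: the repo expects exactly 4.31.0).
from importlib.metadata import version, PackageNotFoundError

REQUIRED = "4.31.0"

def status(installed: str, required: str = REQUIRED) -> str:
    """Compare an installed version string against the required pin."""
    if installed == required:
        return f"OK: transformers=={installed}"
    return f"Mismatch: found {installed}, expected {required}"

def check_transformers(required: str = REQUIRED) -> str:
    """Look up the installed transformers distribution and report the match."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "transformers is not installed"
    return status(installed, required)

if __name__ == "__main__":
    print(check_transformers())
```

If it reports a mismatch, `pip install transformers==4.31.0` inside the `llamavid` conda environment should resolve the `KeyError`.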