NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0

Evaluation of AWQ models #150

Open surya00060 opened 1 week ago

surya00060 commented 1 week ago

When I try to evaluate the quantized AWQ models using the video evaluation script, I get a FileNotFoundError.

FileNotFoundError: No such file or directory: "/hfhub/hub/models--Efficient-Large-Model--VILA1.5-3b-AWQ/snapshots/f18f59ccac0b45f92e70a490e6f88ab5ebadef23/llm/model-00001-of-00002.safetensors"

Is there another way to run the AWQ models to get accuracy numbers?

Lyken17 commented 3 days ago

The checkpoint path was not set properly; it either points to a wrong location or the model was not downloaded. Please attach more details.
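
For example, a quick sketch (assuming huggingface_hub is installed) that lists the files the downloaded snapshot actually contains, to confirm whether the llm/model-*.safetensors shards the loader expects are present:

import os
from huggingface_hub import snapshot_download

# Download (or reuse from the local cache) the AWQ snapshot and list its files,
# to check whether llm/model-*.safetensors actually exists in that repo.
snapshot = snapshot_download("Efficient-Large-Model/VILA1.5-3b-AWQ")
for root, _, files in os.walk(snapshot):
    for name in files:
        print(os.path.relpath(os.path.join(root, name), snapshot))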

surya00060 commented 3 days ago

Thanks for the reply.

Following the instructions found here, the command below works: it downloads the model from Hugging Face and starts the video evaluation run (inference).

./scripts/v1_5/eval/video_chatgpt/run_all.sh Efficient-Large-Model/VILA1.5-3b VILA1.5-3b vicuna_v1

When I change the model to AWQ,

./scripts/v1_5/eval/video_chatgpt/run_all.sh Efficient-Large-Model/VILA1.5-3b-AWQ VILA1.5-3b-AWQ vicuna_v1

I get the following error:

Fetching 16 files:   0%|          | 0/16 [00:00<?, ?it/s]
Fetching 16 files: 100%|██████████| 16/16 [00:00<00:00, 12082.98it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/depot/araghu/data/selvams/VILA/llava/eval/model_vqa_video.py", line 213, in <module>
    eval_model(args)
  File "/depot/araghu/data/selvams/VILA/llava/eval/model_vqa_video.py", line 126, in eval_model
    tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, model_name, args.model_base)
  File "/depot/araghu/data/selvams/VILA/llava/model/builder.py", line 151, in load_pretrained_model
    model = LlavaLlamaModel(config=config, low_cpu_mem_usage=True, **kwargs)
  File "/depot/araghu/data/selvams/VILA/llava/model/language_model/llava_llama.py", line 43, in __init__
    return self.init_vlm(config=config, *args, **kwargs)
  File "/depot/araghu/data/selvams/VILA/llava/model/llava_arch.py", line 76, in init_vlm
    self.llm, self.tokenizer = build_llm_and_tokenizer(llm_cfg, config, *args, **kwargs)
  File "/depot/araghu/data/selvams/VILA/llava/model/language_model/builder.py", line 71, in build_llm_and_tokenizer
    llm = AutoModelForCausalLM.from_pretrained(
  File "/depot/araghu/data/selvams/vila-env/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "/depot/araghu/data/selvams/vila-env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3697, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/depot/araghu/data/selvams/vila-env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4080, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "/depot/araghu/data/selvams/vila-env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 497, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
FileNotFoundError: No such file or directory: "/scratch/gilbreth/selvams/hfhub/hub/models--Efficient-Large-Model--VILA1.5-3b-AWQ/snapshots/f18f59ccac0b45f92e70a490e6f88ab5ebadef23/llm/model-00001-of-00002.safetensors"
./scripts/v1_5/eval/video_chatgpt/run_qa_msrvtt.sh: line 44: runs/eval/VILA1.5-3b-AWQ/MSRVTT_Zero_Shot_QA/merge.jsonl: No such file or directory
./scripts/v1_5/eval/video_chatgpt/run_qa_msrvtt.sh: line 48: runs/eval/VILA1.5-3b-AWQ/MSRVTT_Zero_Shot_QA/merge.jsonl: No such file or directory

I observed a similar error with the AWQ models when performing quick inference as well.

Lyken17 commented 3 days ago

AWQ support is separate from the main repo, which means functions such as training and evaluation do not come with AWQ support -- you have to use BF16 precision for those.
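
In other words, to get accuracy numbers from the evaluation scripts you point them at the non-quantized checkpoint. A minimal sketch of what that amounts to, assuming the load_pretrained_model signature shown in the traceback above (passing None for model_base is an assumption here, matching args.model_base left unset):

from llava.model.builder import load_pretrained_model

# Evaluate with the BF16 checkpoint; the AWQ repo does not ship the
# llm/*.safetensors shards this loader looks for.
model_path = "Efficient-Large-Model/VILA1.5-3b"  # not the -AWQ variant
model_name = "VILA1.5-3b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, model_name, None  # model_base=None is an assumption
)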