hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All
https://hpcaitech.github.io/Open-Sora/
Apache License 2.0

llava image captioning issue - llava model build failed #671

Closed: BountyMage closed this issue 1 month ago

BountyMage commented 1 month ago

Describe the issue

Issue: I was running image captioning with LLaVA, following the data processing guide, and hit the error "AttributeError: 'NoneType' object has no attribute 'is_loaded'". Command:

torchrun --nproc_per_node 1 --standalone -m tools.caption.caption_llava xxx/dataset/image.csv --tp-size 1 --dp-size 1 --bs 16 --prompt image-3ex --model-path xxx/llava-v1.6-mistral-7b

Log:

Traceback (most recent call last):
  File "/opt/conda/envs/opensora-llava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/opensora-llava/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/envs/opensora-llava/lib/python3.10/site-packages/tools/caption/caption_llava.py", line 345, in <module>
    main(args)
  File "/opt/conda/envs/opensora-llava/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/opensora-llava/lib/python3.10/site-packages/tools/caption/caption_llava.py", line 84, in main
    tokenizer, model, image_processor, context_len = load_pretrained_model(
  File "/data/workspace_zj/code/LLaVA-main/llava/model/builder.py", line 156, in load_pretrained_model
    if not vision_tower.is_loaded:
AttributeError: 'NoneType' object has no attribute 'is_loaded'


BountyMage commented 1 month ago

From a similar issue on the LLaVA project, I found that there are no vision_tower-related config items in the original model config file. How do I get around this problem? https://github.com/haotian-liu/LLaVA/issues/15

BountyMage commented 1 month ago

Solved: I had downloaded the wrong model weights.
Downloaded: https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf
Should have downloaded: https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b
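
For anyone hitting the same error: one quick check before launching the captioning job is to look in the model directory's config.json for vision-tower entries, since their absence is what the linked LLaVA issue points to. Below is a minimal sketch, assuming the original LLaVA checkpoints store such an entry (for example a key like mm_vision_tower) at the top level of config.json. The helper name find_vision_tower_keys is hypothetical and not part of LLaVA or Open-Sora.

```python
import json
from pathlib import Path


def find_vision_tower_keys(model_dir):
    """Return the top-level config.json keys that mention a vision tower.

    Hypothetical helper for illustration only. It assumes the checkpoint
    directory contains a config.json and that a usable LLaVA checkpoint
    lists its vision tower under a key containing "vision_tower".
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # Only top-level keys are inspected; nested sections are ignored.
    return sorted(key for key in config if "vision_tower" in key)


if __name__ == "__main__":
    import sys

    # Usage: python check_llava_config.py xxx/llava-v1.6-mistral-7b
    keys = find_vision_tower_keys(sys.argv[1])
    if keys:
        print("vision tower entries found:", keys)
    else:
        print("no vision_tower entries in config.json; "
              "this may not be an original LLaVA checkpoint")
```

If the helper returns an empty list for the directory passed to --model-path, the checkpoint is probably not in the format that load_pretrained_model expects, which would be consistent with vision_tower being None in the traceback above.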