Open bryanhughes opened 7 months ago
Hi @bryanhughes, this was fixed in commit https://github.com/dusty-nv/jetson-containers/commit/b9163b196cf41125c2f67b9964c4fcfccad2691c, and I recently rebuilt all the local_llm container tags, so if you do a sudo docker pull $(./autotag local_llm), it should update your container with this fix in it.
Hi @dusty-nv, unfortunately this is still broken for me.
$ sudo docker pull $(./autotag local_llm)
Namespace(packages=['local_llm'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
-- L4T_VERSION=36.2.0 JETPACK_VERSION=6.0 CUDA_VERSION=12.2.140
-- Finding compatible container image for ['local_llm']
dustynv/local_llm:r36.2.0
r36.2.0: Pulling from dustynv/local_llm
Digest: sha256:7ce9384b20d5aa8dea9557fa4841b2e684e225f2082fff6fbc80753b8020876f
Status: Image is up to date for dustynv/local_llm:r36.2.0
docker.io/dustynv/local_llm:r36.2.0
Still getting the KeyError failure:
23:30:44 | INFO | loading /data/models/huggingface/models--liuhaotian--llava-v1.5-7b/snapshots/12e054b30e8e061f423c7264bc97d4248232e965 with MLC
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/local_llm/local_llm/agents/video_query.py", line 115, in <module>
agent = VideoQuery(**vars(args)).run()
File "/opt/local_llm/local_llm/agents/video_query.py", line 22, in __init__
self.llm = ProcessProxy((lambda **kwargs: ChatQuery(model, drop_inputs=True, **kwargs)), **kwargs)
File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 31, in __init__
raise RuntimeError(f"subprocess has an invalid initialization status ({init_msg['status']})")
RuntimeError: subprocess has an invalid initialization status (<class 'KeyError'>)
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 62, in run_process
raise error
File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 59, in run_process
plugin = factory(**kwargs)
File "/opt/local_llm/local_llm/agents/video_query.py", line 22, in <lambda>
self.llm = ProcessProxy((lambda **kwargs: ChatQuery(model, drop_inputs=True, **kwargs)), **kwargs)
File "/opt/local_llm/local_llm/plugins/chat_query.py", line 63, in __init__
self.model = LocalLM.from_pretrained(model, **kwargs)
File "/opt/local_llm/local_llm/local_llm.py", line 72, in from_pretrained
model = MLCModel(model_path, **kwargs)
File "/opt/local_llm/local_llm/models/mlc.py", line 43, in __init__
super(MLCModel, self).__init__(model_path, **kwargs)
File "/opt/local_llm/local_llm/local_llm.py", line 147, in __init__
self.model_config = AutoConfig.from_pretrained(model_path)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1064, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 761, in __getitem__
raise KeyError(key)
KeyError: 'llava'
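For context on the failure above: AutoConfig.from_pretrained reads the "model_type" string from the model's config.json and looks it up in a registry of config classes; the transformers build inside the container has no entry for "llava", so the lookup raises KeyError. A minimal sketch of that lookup pattern (plain Python mimicking the behavior, not the actual transformers internals; the registry contents here are illustrative):

```python
# Illustrative registry mapping "model_type" strings to config classes,
# mimicking transformers' CONFIG_MAPPING. The installed build in the
# container has no "llava" key, hence the KeyError in the traceback.
CONFIG_MAPPING = {
    "llama": "LlamaConfig",
    "gpt2": "GPT2Config",
    # no "llava" entry
}

def resolve_config_class(config_dict):
    # Mirrors the failing line: CONFIG_MAPPING[config_dict["model_type"]]
    return CONFIG_MAPPING[config_dict["model_type"]]

try:
    resolve_config_class({"model_type": "llava"})
except KeyError as err:
    print(f"KeyError: {err}")  # same error class as in the traceback
```

This is why rewriting "model_type" to "llama" in config.json (as suggested below in the thread) sidesteps the error: the lookup then hits a key the library does know.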
You mentioned that I could also change the config.json file, but I'm not 100% clear on where to find it. I searched through the repo, and it seems the model is pulled from Hugging Face?
@bryanhughes thanks for reporting that it still isn't working - can you do sudo docker images | grep local_llm? The image should be from 2/16 (but I will probably try reproducing it again and may end up pushing an updated image again).
Patch the model's config.json that was downloaded under data/models/huggingface/models--liuhaotian--llava-v1.5-13b/snapshots/* and change

"model_type": "llava",

to

"model_type": "llama",

Then re-run the command above - the quantization tools will then treat it like a Llama model (which it is).
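The edit above can be scripted. A hedged sketch, assuming the Hugging Face cache layout shown in the traceback (patch_model_type is a hypothetical helper; the snapshot hash, the 7b vs 13b model name, and the data mount point will differ on your machine):

```python
import glob
import json

def patch_model_type(snapshot_glob, old="llava", new="llama"):
    """Rewrite "model_type" in each config.json matching the glob so that
    AutoConfig resolves the checkpoint as a Llama model."""
    for path in glob.glob(snapshot_glob):
        with open(path) as f:
            cfg = json.load(f)
        if cfg.get("model_type") == old:
            cfg["model_type"] = new
            with open(path, "w") as f:
                json.dump(cfg, f, indent=2)
            print(f"patched {path}")

# Path adapted from the thread; adjust the model name and mount for your setup.
patch_model_type(
    "data/models/huggingface/models--liuhaotian--llava-v1.5-13b/snapshots/*/config.json"
)
```

Only "model_type" is touched; the rest of the config (vocab size, hidden dims, etc.) is left intact, which is what lets the quantization tools treat the checkpoint as a standard Llama model.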
OK, this should be fixed again in https://github.com/dusty-nv/jetson-containers/commit/8fc4f936b3057e1a8c96f0225d1ef317f5ad49d0 and I pushed the rebuilt dustynv/local_llm:r36.2.0 if you can pull that again.
Running the example: it seems to successfully pull the image, but it still fails...