h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0

KeyError: images_num_max #1680

Closed: jaysunl closed this 3 months ago

jaysunl commented 3 months ago

Hello, I seem to have pulled the latest repo version, but when I go to use a model, I get the following error:

Traceback (most recent call last):
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/queueing.py", line 566, in process_events
    response = await route_utils.call_process_api(
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/blocks.py", line 1788, in process_api
    result = await self.call_function(
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 595, in async_iteration
    return await iterator.__anext__()
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 588, in __anext__
    return await anyio.to_thread.run_sync(
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 571, in run_sync_iterator_async
    return next(iterator)
  File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 754, in gen_wrapper
    response = next(iterator)
  File "/workspace/src/gradio_funcs.py", line 983, in bot
    for res in get_response(fun1, history, chatbot_role1, speaker1, tts_language1, roles_state1,
  File "/workspace/src/gradio_funcs.py", line 521, in get_response
    yield from _get_response(fun1, history, chatbot_role1, speaker1, tts_language1, roles_state1, tts_speed1,
  File "/workspace/src/gradio_funcs.py", line 652, in _get_response
    for output_fun in fun1():
  File "/workspace/src/gen.py", line 4293, in evaluate
    images_num_max = images_num_max or chosen_model_state['images_num_max']
KeyError: 'images_num_max'
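For anyone reading the trace: the failure is at src/gen.py line 4293, where the per-model settings dict (`chosen_model_state`) is subscripted for a key it does not contain. Below is a minimal sketch of the failure mode and a defensive alternative; the dict contents and the fallback value of 0 are assumptions for illustration, not h2oGPT's actual defaults:

```python
# Sketch of the failure mode; chosen_model_state contents are hypothetical.
images_num_max = None
chosen_model_state = {"base_model": "HuggingFaceH4/zephyr-7b-beta"}  # no 'images_num_max' key

try:
    # Mirrors the failing line from the traceback (src/gen.py:4293):
    images_num_max = images_num_max or chosen_model_state['images_num_max']
except KeyError as err:
    print(f"KeyError: {err}")  # -> KeyError: 'images_num_max'

# A defensive variant falls back to a default instead of raising
# (0 is an assumed placeholder default, not necessarily h2oGPT's):
images_num_max = images_num_max or chosen_model_state.get('images_num_max', 0)
```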

Does this have something to do with the latest change (looking at the past PRs)? I had not seen this error until now. I am running the following command:

docker run --runtime=nvidia -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -u 23764:60856 -v /lsc/scratch/automation/UserData:/workspace/UserData -v /lsc/scratch/automation/h2ogpt/user_path:/workspace/user_path h2ogpt:latest /workspace/generate.py --base_model=HuggingFaceH4/zephyr-7b-beta --pre_load_embedding_model=True --embedding_gpu_id=cpu --cut_distance=10000 --hf_embedding_model=BAAI/bge-base-en-v1.5 --score_model=None --enable_tts=False --enable_stt=False --enable_transcriptions=False --max_seq_len=2048 --chunk_size=128 --top_k_docs=3 --langchain_mode=UserData --load_4bit=True --share=True --use_gpu_id=0 --user_path=/workspace/user_path

I have also tried loading a different model, but it still shows this error.

pseudotensor commented 3 months ago

Hi, I ran your command but don't see the issue if I just ask a question.

mkdir UserData ; docker run --runtime=nvidia -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -u 23764:60856 -v UserData:/workspace/UserData -v user_path:/workspace/user_path gcr.io/vorvan/h2oai/h2ogpt-runtime:0.2.1 /workspace/generate.py --base_model=HuggingFaceH4/zephyr-7b-beta --pre_load_embedding_model=True --embedding_gpu_id=cpu --cut_distance=10000 --hf_embedding_model=BAAI/bge-base-en-v1.5 --score_model=None --enable_tts=False --enable_stt=False --enable_transcriptions=False --max_seq_len=2048 --chunk_size=128 --top_k_docs=3 --langchain_mode=UserData --load_4bit=True --share=True --use_gpu_id=0 --user_path=/workspace/user_path

I changed minor things (local paths, and used the released Docker image).

I do see that error in one specific case: changing the model via "load model" in the UI. Is that what you were doing instead?
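For context, a common way to guard against this class of bug when new per-model settings are introduced is to merge defaults into the rebuilt state dict so that keys added in newer releases are always present. This is only a hedged sketch with assumed names and default values, not necessarily how h2oGPT resolved it:

```python
# Hypothetical sketch: ensure keys introduced in newer releases are always
# present when the model-state dict is rebuilt by "load model".
MODEL_STATE_DEFAULTS = {"images_num_max": 0}  # assumed default value

def with_defaults(state: dict) -> dict:
    """Return a copy of `state` with any missing default keys filled in."""
    merged = dict(MODEL_STATE_DEFAULTS)
    merged.update(state)
    return merged

chosen_model_state = with_defaults({"base_model": "TheBloke/Llama-2-7B-GGUF"})
print(chosen_model_state["images_num_max"])  # prints 0 instead of raising KeyError
```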

jaysunl commented 3 months ago

I received this error at first with HuggingFaceH4/zephyr-7b-beta. So I tried TheBloke/Llama-2-7B-GGUF from the h2oGPT list of models using "load model", and it still gave the error. I will try again and see if I get the same error.

pseudotensor commented 3 months ago

Got it, so yes, you are clicking "load model" rather than just asking a question. The fix above will address it. I'll rebuild the Docker image.

I merged a PR after spot checks but before exhaustive nightly testing; this error was caught in the nightly run.

Thanks.