Closed: jaysunl closed this 3 months ago
Hi, I ran your command but don't see the issue if I just ask a question.
mkdir UserData ; docker run --runtime=nvidia -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -u 23764:60856 -v UserData:/workspace/UserData -v user_path:/workspace/user_path gcr.io/vorvan/h2oai/h2ogpt-runtime:0.2.1 /workspace/generate.py --base_model=HuggingFaceH4/zephyr-7b-beta --pre_load_embedding_model=True --embedding_gpu_id=cpu --cut_distance=10000 --hf_embedding_model=BAAI/bge-base-en-v1.5 --score_model=None --enable_tts=False --enable_stt=False --enable_transcriptions=False --max_seq_len=2048 --chunk_size=128 --top_k_docs=3 --langchain_mode=UserData --load_4bit=True --share=True --use_gpu_id=0 --user_path=/workspace/user_path
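Side note on the volume mounts: with a bare source like `UserData:`, `docker run` creates a named volume rather than bind-mounting the directory made by `mkdir`. Below is a minimal sketch using absolute host paths instead (it assumes `UserData` and `user_path` sit in the current directory); the remaining `generate.py` flags are unchanged from the command above.

```bash
# Sketch only: same command, but the host directories are given as absolute
# paths so docker bind-mounts them instead of creating named volumes.
mkdir -p UserData user_path
docker run --runtime=nvidia \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  -u 23764:60856 \
  -v "$(pwd)/UserData":/workspace/UserData \
  -v "$(pwd)/user_path":/workspace/user_path \
  gcr.io/vorvan/h2oai/h2ogpt-runtime:0.2.1 /workspace/generate.py \
  --base_model=HuggingFaceH4/zephyr-7b-beta \
  --langchain_mode=UserData \
  --user_path=/workspace/user_path
  # remaining generate.py flags as in the command above
```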
I changed minor things.
I do see that error in the specific case of changing the model via Load Model. Is that what you were doing instead?
I received this error at first when using HuggingFaceH4/zephyr-7b-beta. So I tried TheBloke/Llama-2-7B-GGUF from the h2oGPT list of models using Load Model, and it still gave the error. I will try again and see if I get the same error.
Got it, so yes, you are clicking "Load Model", not just asking a question. The fix above will address it. I'll rebuild the Docker image.
I merged a PR after spot checks but before exhaustive nightly testing, and this error was caught in the nightly run.
Thanks.
Hello, I seem to have pulled the latest repo version, but when I go to use a model, I get the following error:
Does this have something to do with the latest change (looking at the recent PRs)? I had not received this error until now. I am running the following command:
Also, I have tried loading a different model, but it still shows this error.