rohitnanda1443 closed this 5 days ago
As in the instructions, "base_model" has to be the same name the model is served under in Ollama, i.e.:
python generate.py --guest_name='' --base_model=mistral:v0.3 --max_seq_len=8094 --enable_tts=False --enable_stt=False --enable_transcriptions=False --use_gpu_id=False --inference_server=vllm_chat:http://localhost:11434/v1/ --prompt_type=openai_chat
You can ignore errors about not finding the tokenizer, etc.
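If you are unsure of the exact name, you can ask Ollama directly. A minimal sketch (not h2oGPT code), assuming Ollama is on its default port 11434 and your Ollama version exposes the OpenAI-compatible /v1/models route:

# List the model names Ollama serves; each "id" is the exact string
# to pass as --base_model, e.g. "mistral:v0.3".
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/v1/models") as resp:
    payload = json.load(resp)

for model in payload["data"]:
    print(model["id"])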
For more accurate tokenization, specify the tokenizer and an HF token (the mistralai repos are gated on Hugging Face):
python generate.py --guest_name='' --base_model=mistral:v0.3 --tokenizer_base_model=mistralai/Mistral-7B-Instruct-v0.3 --max_seq_len=8094 --enable_tts=False --enable_stt=False --enable_transcriptions=False --use_gpu_id=False --inference_server=vllm_chat:http://localhost:11434/v1/ --prompt_type=openai_chat --use_auth_token=<token>
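For reference, those two flags roughly correspond to loading the gated tokenizer from Hugging Face yourself. A minimal sketch, assuming a recent transformers version; this is an illustration, not h2oGPT's exact internal code path:

# Load the gated mistralai tokenizer with an HF access token.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    token="<token>",  # HF access token, needed because the repo is gated
)

# With the real tokenizer, token counts against --max_seq_len are accurate.
print(len(tokenizer("How many tokens is this?")["input_ids"]))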
Hi All,
I am trying to run an inference server with Ollama using the command below:
ollama run mistral:v0.3
Then I run h2oGPT using the command below:
python generate.py --guest_name='' --base_model=mistralai/Mistral-7B-Instruct-v0.3 --max_seq_len=8094 --enable_tts=False --enable_stt=False --enable_transcriptions=False --use_gpu_id=False --inference_server=vllm_chat:http://localhost:11434/v1/ --prompt_type=openai_chat &
The issue I am facing is that the model is not found in the h2oGPT UI. What is the correct model name for the Ollama mistral:v0.3 model to pass on the CLI to h2oGPT?
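For completeness, here is a minimal way to test the endpoint and model name directly, outside h2oGPT. A sketch, assuming the openai Python package (v1+) is installed and Ollama is running:

# Send one chat request to Ollama's OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1/",
    api_key="ollama",  # Ollama ignores the key, but the client requires one
)

response = client.chat.completions.create(
    model="mistral:v0.3",  # must match the name Ollama serves (see `ollama list`)
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)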