Open PieBru opened 1 day ago
@PieBru
LLM_CONFIG_OLLAMA = {
    "llm_type": "ollama",
    "base_url": "http://10.4.0.100:33821",  # Ensure this URL is correct
    "model_name": "research-phi:latest",  # Updated model name
    "temperature": 0.7,
    "top_p": 0.9,
    "n_ctx": 2048,  # Adjust as needed
    "context_length": 2048,  # Adjust as needed
    "stop": ["User:", "\n\n"]
}
Hello, have you tried updating your model name correctly? Does it work with the proper name?
I tested with Phi-3, not 3.5, but yes, as the other helpful person pointed out, you need to change the model name in the llm_config.py file to the one you intend to use with Ollama. You also need to create a custom model, as I laid out in the GitHub instructions, so it has enough context to work properly!
I created the custom model with ollama create research-phi3 -f MODELFILE, where the MODELFILE content is:

FROM phi3:3.8b-mini-128k-instruct-q6_K
PARAMETER num_ctx 38000
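To confirm the Modelfile parameters were actually baked into the custom model, a check against the stock Ollama /api/show endpoint should work. This is only a rough sketch: the URL and model name are the ones from my setup, everything else is illustrative.

import requests

OLLAMA_URL = "http://10.4.0.100:33821"  # same base_url as in llm_config.py

# Ask Ollama to describe the custom model; the response includes the Modelfile
# and the effective parameters (recent Ollama versions use the "model" key,
# older ones used "name").
resp = requests.post(f"{OLLAMA_URL}/api/show", json={"model": "research-phi3"})
resp.raise_for_status()
info = resp.json()

# "parameters" should contain the line "num_ctx 38000" if the Modelfile was applied.
print(info.get("parameters"))

If num_ctx 38000 does not show up there, generation requests will fall back to Ollama's default context length.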
As shown in my screenshot, phi3:3.8b-mini-128k-instruct-q6_K is available to the WebUI, so it is runnable by Ollama.
@hafeezhmha I used "model_name": "research-phi3", which should be legit for Ollama. There is no difference using "model_name": "research-phi3:latest". I never used or created "research-phi:latest" without the version number.
These are the phi models available on my Ollama server:

ollama list | grep phi
research-phi3:latest                 283806c9c3e0    3.1 GB    41 hours ago
phi3:3.8b-mini-128k-instruct-q6_K    90771235c599    3.1 GB    41 hours ago
phi3.5:3.8b-mini-instruct-q6_K       64777e5c6803    3.1 GB    2 days ago
Please suggest any test I can do to narrow down this blocking problem.
As the context length definition is a bit confusing to me, I tried 38000 in the config, the value suggested for the MODELFILE, but I'm still stuck.
# LLM settings for Ollama
LLM_CONFIG_OLLAMA = {
"llm_type": "ollama",
"base_url": "http://10.4.0.100:33821", # default Ollama server URL
"model_name": "research-phi3:latest", # Replace with your Ollama model name
"temperature": 0.7,
"top_p": 0.9,
"n_ctx": 38000, #55000,
"num_ctx": 38000,
"context_length": 38000, #55000,
"stop": ["User:", "\n\n"]
}
At program start, nvtop on the Ollama server showed a little activity, but after a short while the run terminates with the 500 error.
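One way to narrow down whether the 500 comes from Ollama itself or from the application would be to send a single generation request directly to the server with the same context size. This is a rough sketch against the standard Ollama /api/generate endpoint; the URL and model name are from my config, the prompt is just illustrative.

import requests

OLLAMA_URL = "http://10.4.0.100:33821"  # same base_url as in llm_config.py

# One request straight to Ollama, asking for the same 38000-token context the
# config uses; if this also fails with a 500, the problem is on the Ollama side
# (the server log often shows an out-of-memory error at large num_ctx values).
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={
        "model": "research-phi3:latest",
        "prompt": "Say hello in one short sentence.",
        "stream": False,
        "options": {"num_ctx": 38000},
    },
    timeout=600,
)
print(resp.status_code)
print(resp.json() if resp.ok else resp.text)

If this succeeds but the project still fails, the next suspect would be which of the n_ctx / num_ctx / context_length keys the project actually reads.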
@hafeezhmha I tried with a wrong model name and the error is immediately the expected 404.
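The immediate 404 with a wrong name suggests the requests do reach the server and name lookup works; listing exactly which names the server will accept can be done with the standard /api/tags endpoint (small sketch, same assumptions as above).

import requests

OLLAMA_URL = "http://10.4.0.100:33821"  # same base_url as in llm_config.py

# /api/tags returns every model this server can serve; the "name" field is what
# model_name in llm_config.py has to match exactly.
for m in requests.get(f"{OLLAMA_URL}/api/tags").json().get("models", []):
    print(m["name"])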
Hi, awesome project! I'm on the doorstep of my first query, but I'm stuck.
This is the Ollama server API endpoint: (screenshot)
This is the error: (screenshot)
And this is my llm_config.py: (the config posted above)
These are the phi3 models installed on this Ollama server, as seen from the Open-WebUI: (screenshot)
Thank you, Piero