Open PieBru opened 1 day ago
@PieBru
LLM_CONFIG_OLLAMA = {
    "llm_type": "ollama",
    "base_url": "http://10.4.0.100:33821",  # Ensure this URL is correct
    "model_name": "research-phi:latest",  # Updated model name
    "temperature": 0.7,
    "top_p": 0.9,
    "n_ctx": 2048,  # Adjust as needed
    "context_length": 2048,  # Adjust as needed
    "stop": ["User:", "\n\n"]
}
Hello, have you tried updating your model name correctly? Does it work with the proper name?
I tested with Phi-3, not 3.5, but yes, as the other helpful person pointed out, you need to change the model name in the llm_config.py file to the one you intend to use with Ollama. You also need to create a custom model, as I laid out in the GitHub instructions, so it has enough context to work properly!
I created the custom model with ollama create research-phi3 -f MODELFILE, where the MODELFILE content is:

FROM phi3:3.8b-mini-128k-instruct-q6_K
PARAMETER num_ctx 38000
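To confirm the Modelfile parameters were actually baked into the custom model, a check against the stock Ollama /api/show endpoint should work. This is only a rough sketch: the URL and model name are the ones from my setup, everything else is illustrative.

import requests

OLLAMA_URL = "http://10.4.0.100:33821"  # same base_url as in llm_config.py

# Ask Ollama to describe the custom model; the response includes the Modelfile
# and the effective parameters (recent Ollama versions use the "model" key,
# older ones used "name").
resp = requests.post(f"{OLLAMA_URL}/api/show", json={"model": "research-phi3"})
resp.raise_for_status()
info = resp.json()

# "parameters" should contain the line "num_ctx 38000" if the Modelfile was applied.
print(info.get("parameters"))

If num_ctx 38000 does not show up there, generation requests will fall back to Ollama's default context length.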
As shown in my screenshot, phi3:3.8b-mini-128k-instruct-q6_K is available to the WebUI, so it is runnable by Ollama.
@hafeezhmha I used "model_name": "research-phi3", which should be legit for Ollama. There is no difference using "model_name": "research-phi3:latest". I never used or created "research-phi:latest" without the version number.
These are the phi models available on my Ollama server:

ollama list | grep phi
research-phi3:latest                 283806c9c3e0    3.1 GB    41 hours ago
phi3:3.8b-mini-128k-instruct-q6_K    90771235c599    3.1 GB    41 hours ago
phi3.5:3.8b-mini-instruct-q6_K       64777e5c6803    3.1 GB    2 days ago
Please suggest any test I can do to narrow down this blocking problem.
As the context length definition is a bit confusing to me, I tried 38000 in the config, the value suggested for the MODELFILE, but I'm still stuck.
# LLM settings for Ollama
LLM_CONFIG_OLLAMA = {
"llm_type": "ollama",
"base_url": "http://10.4.0.100:33821", # default Ollama server URL
"model_name": "research-phi3:latest", # Replace with your Ollama model name
"temperature": 0.7,
"top_p": 0.9,
"n_ctx": 38000, #55000,
"num_ctx": 38000,
"context_length": 38000, #55000,
"stop": ["User:", "\n\n"]
}
At program start, nvtop on the Ollama server showed a little activity, but after a short while the run terminates with the 500 error.
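One way to narrow down whether the 500 comes from Ollama itself or from the application would be to send a single generation request directly to the server with the same context size. This is a rough sketch against the standard Ollama /api/generate endpoint; the URL and model name are from my config, the prompt is just illustrative.

import requests

OLLAMA_URL = "http://10.4.0.100:33821"  # same base_url as in llm_config.py

# One request straight to Ollama, asking for the same 38000-token context the
# config uses; if this also fails with a 500, the problem is on the Ollama side
# (the server log often shows an out-of-memory error at large num_ctx values).
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={
        "model": "research-phi3:latest",
        "prompt": "Say hello in one short sentence.",
        "stream": False,
        "options": {"num_ctx": 38000},
    },
    timeout=600,
)
print(resp.status_code)
print(resp.json() if resp.ok else resp.text)

If this succeeds but the project still fails, the next suspect would be which of the n_ctx / num_ctx / context_length keys the project actually reads.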
@hafeezhmha I tried with a wrong model name and the error is immediately the expected 404.
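The immediate 404 with a wrong name suggests the requests do reach the server and name lookup works; listing exactly which names the server will accept can be done with the standard /api/tags endpoint (small sketch, same assumptions as above).

import requests

OLLAMA_URL = "http://10.4.0.100:33821"  # same base_url as in llm_config.py

# /api/tags returns every model this server can serve; the "name" field is what
# model_name in llm_config.py has to match exactly.
for m in requests.get(f"{OLLAMA_URL}/api/tags").json().get("models", []):
    print(m["name"])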
Hi, awesome project! I'm on the doorstep of my first query, but I'm stuck.
This is the Ollama server API endpoint: (screenshot)
This is the error: (screenshot)
And this is my llm_config.py: (the config posted above)
These are the phi3 models installed on this Ollama server, as seen from the Open-WebUI: (screenshot)
Thank you, Piero