vale46n1 opened 7 months ago
Would like to hear more about your use case. If you want to mess around locally, you'd just change this line. That's still going to pass gpt-3.5-turbo etc. as the model name; to make this work generically we would need a uniform way to get a list of what models are available. This is essentially what I'm doing with the Ollama integration.
I've been thinking about adding support for tools like Replicate or Together.ai, which would make using open source models much simpler / faster. Are you just running a llama.cpp model independent of Ollama?
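For concreteness, here's a rough sketch of what "changing that line" amounts to, assuming the standard openai Python client is used; the env var names and the local URL below are just illustrative assumptions, not OpenUI's actual code:

from openai import OpenAI
import os

# Illustrative only: point the OpenAI client at a local OpenAI-compatible server.
client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY", "xxx"),                         # local servers usually ignore the key
    base_url=os.getenv("OPENAI_BASE_URL", "http://localhost:1234/v1"),  # assumed local endpoint
)

# The model name is still passed through verbatim (e.g. "gpt-3.5-turbo"),
# so the local server has to accept or ignore whatever name it receives.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)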
I'm using LM Studio for the same test. openrouter.ai is another good and cheap alternative to use (and, I would say, worth integrating).
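openrouter.ai also speaks the OpenAI API, so it would follow the same pattern; a rough sketch (the base URL is OpenRouter's OpenAI-compatible endpoint, and unlike a local server it does need a real key):

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],  # OpenRouter requires a real key
    base_url="https://openrouter.ai/api/v1",   # OpenRouter's OpenAI-compatible endpoint
)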
to make this work generically we would need a uniform way to get a list of what models are available.
The LM Studio server only works when a model is already loaded, so it's not like the Ollama server, which can run without a model. We just need a connection, and the user can change the model from LM Studio or Ooba, etc. I changed the base_url value to match LM Studio, but it still connects to Ollama.
Which is the best local LLM to run OpenUI?
@vanpelt mentioned LLaVA, so try one of the v1.6 7B, 13B, or 34B models.
to make this work generically we would need a uniform way to get a list of what models are available.
According to the LM Studio official docs (https://lmstudio.ai/docs/local-server), you can check which models are currently loaded:
curl http://localhost:1234/v1/models
Response (following OpenAI's format)
{
  "data": [
    {
      "id": "TheBloke/phi-2-GGUF/phi-2.Q4_K_S.gguf",
      "object": "model",
      "owned_by": "organization-owner",
      "permission": [
        {}
      ]
    },
    {
      "id": "lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q4_k_m.gguf",
      "object": "model",
      "owned_by": "organization-owner",
      "permission": [
        {}
      ]
    }
  ],
  "object": "list"
}
In this case both TheBloke/phi-2-GGUF and lmstudio-ai/gemma-2b-it-GGUF are loaded
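Since the endpoint follows OpenAI's format, the same list can be fetched with the regular openai client; a small sketch, assuming LM Studio's default port 1234 (the key value is just a placeholder, LM Studio ignores it):

from openai import OpenAI

# Only the base_url matters for LM Studio's local server.
client = OpenAI(api_key="lm-studio", base_url="http://localhost:1234/v1")

# models.list() hits the same GET /v1/models endpoint as the curl above.
for model in client.models.list().data:
    print(model.id)  # e.g. TheBloke/phi-2-GGUF/phi-2.Q4_K_S.gguf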
Can we add a way to use a local API as the LLM? The Python code would be something like:
from openai import OpenAI

client = OpenAI(
    api_key="",  # local servers generally don't check the key
    base_url="http://localhost:1234/v1",  # change the API base URL to the local inference API (e.g. LM Studio's default)
)
It would be similar to what is already provided with Ollama.
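For completeness, a rough sketch of a request against such a local endpoint, continuing from the client above (the model id is one of the ids /v1/models reported earlier, used here only as an example):

response = client.chat.completions.create(
    model="lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q4_k_m.gguf",  # any id returned by /v1/models
    messages=[{"role": "user", "content": "Generate a simple HTML button"}],
)
print(response.choices[0].message.content)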