langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Trying local OpenAI API failed (Ollama / LM Studio) #1886

Closed. Namec999 closed this issue 10 months ago.

Namec999 commented 10 months ago

Self Checks

Description of the new feature / enhancement

Hello, I'm trying the new 0.4.1 with local Ollama and LiteLLM, using both the OpenAI provider under Dify's models and the new OpenAI-API-compatible tab, but each time I get an error: not connected and/or no model found, while my Ollama is working and serving fine.

Is there any way to use tools like Ollama and/or LM Studio for local inference?

best

Scenario when this would be used?

Is there any way to use tools like Ollama and/or LM Studio for local inference?

Supporting information

No response

takatost commented 10 months ago

Ollama doesn't seem to have implemented an OpenAI-compatible API yet: https://github.com/jmorganca/ollama/issues/305. Duplicate of #1725.
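For anyone hitting this, a quick way to see which API surface a local Ollama actually exposes is to probe both its native API and the OpenAI-compatible path (the latter was only added in later Ollama releases). A minimal sketch, assuming the requests package is installed and Ollama is on its default port 11434:

```python
# Sketch (assumptions: the `requests` package is installed and Ollama is on its
# default port 11434). Checks whether the local Ollama build exposes only its
# native API or also the OpenAI-compatible /v1 endpoints (added in later releases).
import requests

BASE = "http://127.0.0.1:11434"

for path in ("/api/tags", "/v1/models"):
    try:
        code = requests.get(BASE + path, timeout=3).status_code
    except requests.RequestException as exc:
        code = type(exc).__name__
    print(f"{BASE}{path} -> {code}")
```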

takatost commented 10 months ago

And I tried LM Studio's Local Inference Server and it worked well after configuring it to be OpenAI-API-compatible.

ewebgh33 commented 10 months ago

I tried the OpenAI-compatible API from text-generation-webui (Oobabooga) and this also did not work.

I get: An error occurred during credentials validation: HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /v2/api/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f26a912f940>: Failed to establish a new connection: [Errno 111] Connection refused'))

Supposedly this API mimics the OpenAI API standard though? It should be running on http://127.0.0.1:5000 (or http://127.0.0.1:5000/v1, I tried both).

So how can we run this with Ollama or text-generation-webui please?

Their docs: https://github.com/oobabooga/text-generation-webui/blob/main/docs/12%20-%20OpenAI%20API.md
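One way to narrow this down is to check whether the OpenAI-compatible server answers at all from the place you are testing, before blaming Dify's provider settings. A minimal sketch, assuming the requests package is installed and that the server exposes the usual /v1/models listing (the port 5000 base URL is taken from the error above):

```python
# Minimal sketch (not from the thread): verify an OpenAI-compatible server is
# reachable before configuring it in Dify. Assumes the `requests` package is
# installed and that the server exposes the standard /v1/models listing endpoint.
import requests

BASE_URL = "http://127.0.0.1:5000/v1"  # text-generation-webui default port, per the error above

try:
    resp = requests.get(f"{BASE_URL}/models", timeout=5)
    resp.raise_for_status()
    print("Server reachable, models:", [m.get("id") for m in resp.json().get("data", [])])
except requests.ConnectionError as exc:
    # Same failure mode as the error above: nothing is listening at this host/port
    # from the place the check is run.
    print("Connection refused - the server is not reachable from here:", exc)
```

If this succeeds on the host machine but the same URL fails when run from inside the Dify container, the problem is the Docker networking issue discussed below rather than the server itself.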

StreamlinedStartup commented 10 months ago

Using LiteLLM with Ollama is working.

takatost commented 10 months ago

> I tried the OpenAI-compatible API from text-generation-webui (Oobabooga) and this also did not work. [...] So how can we run this with Ollama or text-generation-webui please?

How did you deploy Dify? If you're using Docker Compose to deploy it, then the base URL cannot be 127.0.0.1; it needs to be the IP address of the host machine, which is usually 172.17.0.1.

ewebgh33 commented 10 months ago

Yes my local version is the docker version.

Why is the IP of the host machine 172.17.0.1? The host machine is this machine. Sorry, honest question.

Oobabooga is not in Docker, so the IP is the IP. Why would running Dify in Docker turn the Ooba API URL into something else? My Ollama is via WSL for Windows, so that's yet another thing.

So I've got one "regular" app (conda environment), one WSL app, and one Docker app. The Docker one (Dify) needs the backend running the LLM from either text-generation-webui (conda) or Ollama (WSL).

- Background: I'm not a docker pro, I started using docker about a month ago because I've been testing LLM apps and a bunch of them use docker. I've done some coding - personally I've used python for a couple of years on and off, and I know other web stuff (HTML, CSS, JS, PHP, before the web went to react and stuff) but literally never had to use docker until LLMs.

But a situation like this, with all these LLM apps running in completely different architectures yet still needing to talk to each other, is new territory for me.

- We may be getting off topic here though :)

takatost commented 10 months ago

> Why is the IP of the host machine 172.17.0.1? The host machine is this machine. Sorry, honest question.

Docker creates a virtual network for containers to enable isolation and easy communication between them. When a container runs, it gets its own IP address on this virtual network. The address 172.17.0.1 is typically the default gateway for Docker containers, which routes the traffic to the Docker host.

Using 172.17.0.1 instead of 127.0.0.1 (localhost) is crucial because 127.0.0.1 inside a container refers to the container itself, not the Docker host. So, to access services running on the Docker host from a container, 172.17.0.1 is used to correctly route the traffic outside the container to the host.
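To see this in practice, here is an illustrative sketch that probes a few candidate host addresses from inside the container. Assumptions: it is run inside the Dify api container (which ships with Python), the requests package is available there, and the local model server listens on port 5000 on the Docker host:

```python
# Illustrative sketch (assumptions: run from inside the Dify api container, the
# `requests` package is available there, and the local model server listens on
# port 5000 on the Docker host). It shows why 127.0.0.1 fails inside a container
# while the bridge gateway (or host.docker.internal, where supported) works.
import requests

PORT = 5000
CANDIDATES = [
    "127.0.0.1",             # the container itself - usually nothing listening here
    "172.17.0.1",            # default Docker bridge gateway -> the Docker host
    "host.docker.internal",  # host alias on Docker Desktop and some newer setups
]

for host in CANDIDATES:
    url = f"http://{host}:{PORT}/v1/models"
    try:
        requests.get(url, timeout=3)
        print(f"{url}: reachable")
    except requests.RequestException as exc:
        print(f"{url}: {type(exc).__name__}")
```

If you deployed with Dify's docker-compose.yaml, something like docker compose exec api python probe.py should run it in the right network context (the api service name and the probe.py filename are assumptions based on the default compose file and on saving the snippet above).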

takatost commented 10 months ago

btw, v0.4.6 has been released, which supports Ollama.

ewebgh33 commented 10 months ago

Very awesome, thank you!

FarVision2 commented 9 months ago

Depending on the age of your Windows workstation development environment, other virtualization products installed, and a few Docker upgrades along the way, your virtual Ethernet adapters may not be the defaults. You can type ipconfig at the command prompt to get the vEthernet listings. I don't see a way of getting this from Docker itself, but it's easy enough to suss out.

There should be the default one and then another for WSL.

Mine were 172.21.208.1 and 172.24.240.1.

Both of them worked in the settings window:

http://172.24.240.1:11434/ returns "Ollama is running".
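Rather than eyeballing the ipconfig output, you can also probe the candidate addresses for Ollama's banner. A small sketch, assuming the requests package is installed and Ollama is on its default port 11434 (the addresses below are just the examples from this thread):

```python
# Sketch (assumptions: `requests` is installed and Ollama is on its default
# port 11434). Paste in the vEthernet/gateway addresses found via ipconfig and
# see which of them actually answer with Ollama's "Ollama is running" banner.
import requests

CANDIDATES = ["172.17.0.1", "172.21.208.1", "172.24.240.1"]  # example addresses from this thread

for host in CANDIDATES:
    url = f"http://{host}:11434/"
    try:
        resp = requests.get(url, timeout=3)
        status = "OK" if "Ollama is running" in resp.text else f"HTTP {resp.status_code}"
    except requests.RequestException as exc:
        status = type(exc).__name__
    print(f"{url} -> {status}")
```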

csningli commented 7 months ago

I ran into the same issue with Dify hosted in Docker and have solved it now. The solution is easy, and as someone mentioned above, from inside Docker you should connect to host.docker.internal in order to reach the Docker host. In summary, add an OpenAI-API-compatible model provider in Dify's model providers and set the API address to something like:

http://host.docker.internal:1234/v1

Here I use LM Studio to provide local access to llama2-7b-chat, and its default port is 1234.
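To sanity-check the same LM Studio endpoint outside of Dify, a minimal sketch with the openai Python client (v1+) is below. Assumptions: use http://localhost:1234/v1 when running on the host itself and http://host.docker.internal:1234/v1 from inside a container, LM Studio ignores the API key but the client requires one, and the model id is a placeholder for whatever /v1/models reports:

```python
# Sketch (assumptions: the `openai` Python package v1+ is installed, LM Studio's
# local server is running on its default port 1234, and the model identifier is
# a placeholder - use whatever LM Studio reports via /v1/models).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # from inside a container use http://host.docker.internal:1234/v1
    api_key="lm-studio",                  # LM Studio does not check the key, but the client requires one
)

print([m.id for m in client.models.list().data])  # confirm the loaded model is visible

reply = client.chat.completions.create(
    model="llama-2-7b-chat",  # placeholder model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(reply.choices[0].message.content)
```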