langchain-ai / langchain

šŸ¦œšŸ”— Build context-aware reasoning applications
https://python.langchain.com
MIT License

OllamaLLM Connection refused from within docker container while OllamaEmbeddings works. The base_url is custom and the same for both. #25022

Open yogesh-bansal opened 1 month ago

yogesh-bansal commented 1 month ago


Example Code

from langchain_community.embeddings import OllamaEmbeddings
from langchain_ollama import OllamaLLM

embeddings_model = OllamaEmbeddings(base_url="http://192.168.11.98:9000", model="nomic-embed-text:v1.5", num_ctx=4096)
embeddings_model.embed_query("Test")

## LLM Model
llm_model = OllamaLLM(base_url="http://192.168.11.98:9000", model="llama3.1:8b", num_ctx=2048)
llm_model.invoke("Test")

Dockerfile:

FROM ubuntu

# Install Prerequisites
RUN apt-get update && apt-get install -y build-essential cmake gfortran libcurl4-openssl-dev libssl-dev libxml2-dev python3-dev python3-pip python3-venv
RUN pip install langchain langchain-core langchain-community langchain-experimental langchain-chroma langchain_ollama pandas --break-system-packages
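
A quick way to separate container networking from client behavior is to hit the Ollama HTTP API directly from Python inside the container. The sketch below is a hypothetical diagnostic (it assumes the standard Ollama /api/tags endpoint and reuses the server address from the report, neither of which appears in the original script); if it fails too, the problem is the container network rather than langchain-ollama:

# Hypothetical sanity check, independent of LangChain: ask the server which
# models it has. Assumes the standard Ollama /api/tags endpoint.
import json
import urllib.request

with urllib.request.urlopen("http://192.168.11.98:9000/api/tags", timeout=5) as resp:
    tags = json.load(resp)

# Prints the model names the server reports, e.g. llama3.1:8b
print([m["name"] for m in tags.get("models", [])])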

Error Message and Stack Trace (if applicable)

>>> from langchain_community.embeddings import OllamaEmbeddings
>>> from langchain_ollama import OllamaLLM
>>> embeddings_model = OllamaEmbeddings(base_url="http://192.168.11.98:9000", model="nomic-embed-text:v1.5", num_ctx=4096)
>>> embeddings_model.embed_query("Test")
[0.8171377182006836, 0.7424322366714478, -3.6913845539093018, -0.5350275635719299, 1.98311185836792, -0.08007726818323135, 0.7974349856376648, -0.5946609377861023, 1.4877475500106812, -0.8044648766517639, 0.38856828212738037, 1.0630642175674438, 0.6806553602218628, -0.9530377984046936, -1.4606661796569824, -0.2956351637840271, -0.9512965083122253]

LLM Model

>>> llm_model = OllamaLLM(base_url="http://192.168.11.98:9000", model="llama3.1:8b", num_ctx=2048)
>>> llm_model.invoke("Test")
Traceback (most recent call last):
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_transports/default.py", line 233, in handle_request
    resp = self._pool.handle_request(req)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 346, in invoke
    self.generate_prompt(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 703, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 882, in generate
    output = self._generate_helper(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 740, in _generate_helper
    raise e
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 727, in _generate_helper
    self._generate(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_ollama/llms.py", line 268, in _generate
    final_chunk = self._stream_with_aggregation(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_ollama/llms.py", line 236, in _stream_with_aggregation
    for stream_resp in self._create_generate_stream(prompt, stop, **kwargs):
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/langchain_ollama/llms.py", line 186, in _create_generate_stream
    yield from ollama.generate(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/ollama/_client.py", line 79, in _stream
    with self._client.stream(method, url, **kwargs) as r:
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_client.py", line 870, in stream
    response = self.send(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_transports/default.py", line 232, in handle_request
    with map_httpcore_exceptions():
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/root/.virtualenvs/aaveLLM/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused
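
One detail stands out in the trace: the request is issued by the module-level ollama.generate() helper (langchain_ollama/llms.py, line 186). The ollama Python client's default module-level client falls back to http://127.0.0.1:11434 when no host is configured, which would produce exactly this [Errno 111] against localhost inside a container. A hedged way to test that hypothesis from the same interpreter, assuming the client's Client(host=...) constructor:

# If base_url is being dropped, an explicit ollama client pointed at the
# real server should succeed where OllamaLLM fails from the same container.
import ollama

client = ollama.Client(host="http://192.168.11.98:9000")
print(client.generate(model="llama3.1:8b", prompt="Test")["response"])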

Description

I am trying to run the following code in a Python script inside a Docker container. While the embedding model works fine, the LLM model returns Connection refused.

Both work fine from outside the container, though, and the server is reachable from inside the container as well when called through, say, curl:

root@1fec10f8d40e:/# curl http://192.168.11.98:9000/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Test",
  "stream": false
}'
{"model":"llama3.1:8b","created_at":"2024-08-04T03:49:46.282365097Z","response":"It looks like you want to test me. I'm happy to play along!\n\nHow would you like to proceed? Would you like to:\n\nA) Ask a simple question\nB) Provide a statement and ask for feedback\nC) Engage in a conversation on a specific topic\nD) Something else (please specify)\n\nLet me know, and we can get started!","done":true,"done_reason":"stop","context":[128006,882,128007,271,2323,128009,128006,78191,128007,271,2181,5992,1093,499,1390,311,1296,757,13,358,2846,6380,311,1514,3235,2268,4438,1053,499,1093,311,10570,30,19418,499,1093,311,1473,32,8,21069,264,4382,3488,198,33,8,40665,264,5224,323,2610,369,11302,198,34,8,3365,425,304,264,10652,389,264,3230,8712,198,35,8,25681,775,320,31121,14158,696,10267,757,1440,11,323,584,649,636,3940,0],"total_duration":2073589200,"load_duration":55691013,"prompt_eval_count":11,"prompt_eval_duration":32157000,"eval_count":76,"eval_duration":1943850000}

I have checked the model names and they are correct, and the same request works outside the Python LangChain environment. The issue appears only when OllamaLLM is run inside the container environment.

I have attached the Dockerfile above, trimmed down for reproducing the issue. I attach to the container with docker run -it image bash, run the Python code, and the error appears.

System Info

pip freeze | grep langchain
langchain==0.2.12
langchain-chroma==0.1.2
langchain-community==0.2.11
langchain-core==0.2.28
langchain-experimental==0.0.64
langchain-ollama==0.1.1
langchain-text-splitters==0.2.2

sbtkd85 commented 1 month ago

I'm seeing similar issues with langchain-core v0.2.29 and langchain-ollama v0.1.1. Looking at the library, I don't see the base_url parameter being honored anywhere, and I've confirmed that curl works for my deployment as well.

I also tried setting OLLAMA_API_URL environment variable with no luck.

This issue may be related as well, if the root cause is how the base_url parameter is handled: https://github.com/langchain-ai/langchain/issues/25160
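
If the underlying ollama Python client is the one choosing the host, the environment variable it reads is OLLAMA_HOST, not OLLAMA_API_URL. Below is a hedged workaround sketch under that assumption; the variable has to be set before ollama (or langchain_ollama) is first imported, because the module-level client is created at import time:

# Hedged workaround: the ollama Python client's default host comes from
# OLLAMA_HOST, falling back to 127.0.0.1:11434. Setting it before the first
# import may route requests correctly even while base_url is dropped.
# This is an assumption to try, not a confirmed fix.
import os
os.environ["OLLAMA_HOST"] = "http://192.168.11.98:9000"

from langchain_ollama import OllamaLLM

llm_model = OllamaLLM(model="llama3.1:8b", num_ctx=2048)
print(llm_model.invoke("Test"))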

danielorp commented 3 weeks ago

Also happening to me. I was able to do a sanity check against the langchain_community code:

from langchain_community.llms.ollama import Ollama
model = Ollama(model="tinyllama", base_url="http://ollama:11434")
model.invoke("Hi there")

This works perfectly fine (as does curl http://ollama:11434), while OllamaLLM refuses my connection with the same parameters.

pf3r commented 2 weeks ago

> Also happening to me. I was able to do a sanity check against the langchain_community code:
>
> from langchain_community.llms.ollama import Ollama
> model = Ollama(model="tinyllama", base_url="http://ollama:11434")
> model.invoke("Hi there")
>
> This works perfectly fine (as does curl http://ollama:11434), while OllamaLLM refuses my connection with the same parameters.

Thank you very much! I wasted many hours on this. I confirm that it works.

ntelo007 commented 2 weeks ago

I am using ChatOllama and am having the same issue. What would you recommend I do? In base_url I provide http://ollama:11434.
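
Until base_url handling is fixed in langchain-ollama, the analogous fallback for chat models would be the community implementation, which does honor base_url. This is a sketch by analogy with the Ollama LLM workaround above (the model name is an example, not taken from this comment):

# Community ChatOllama accepts base_url, mirroring the community Ollama LLM.
from langchain_community.chat_models import ChatOllama

chat_model = ChatOllama(model="llama3.1:8b", base_url="http://ollama:11434")
print(chat_model.invoke("Hi there").content)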

ntelo007 commented 2 weeks ago

In addition, this workaround doesn't include the bind_tools functionality.
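
If tool calling is required, one hedged option is to stay on langchain_ollama.ChatOllama, which does implement bind_tools, and route it with the OLLAMA_HOST variable discussed above. Again, this is an assumption to try, not a confirmed fix:

# Sketch: keep bind_tools by using langchain_ollama.ChatOllama while
# OLLAMA_HOST (set before the first import) supplies the server address.
import os
os.environ["OLLAMA_HOST"] = "http://ollama:11434"

from langchain_core.tools import tool
from langchain_ollama import ChatOllama

@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

llm_with_tools = ChatOllama(model="llama3.1:8b").bind_tools([add])
print(llm_with_tools.invoke("What is 2 + 3?").tool_calls)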