cause: HeadersTimeoutError: Headers Timeout Error

smalik2043 commented 7 months ago

I keep getting cause: HeadersTimeoutError: Headers Timeout Error when I send a request to the homellm model:

cause: HeadersTimeoutError: Headers Timeout Error
       at Timeout.onParserTimeout [as callback] (node:internal/deps/undici/undici:8228:32)
       at Timeout.onTimeout [as _onTimeout] (node:internal/deps/undici/undici:6310:17)
       at listOnTimeout (node:internal/timers:573:17)
       at processTimers (node:internal/timers:514:7) {
     code: 'UND_ERR_HEADERS_TIMEOUT'
   }

Sometimes I get this instead:

 Error: Expected a completed response.
     at Ollama.processStreamableRequest (/usr/src/app/node_modules/ollama/dist/shared/ollama.a247cdd6.cjs:211:15)
     at processTicksAndRejections (node:internal/process/task_queues:95:5)
     at OllamaService.responseFromOllama (/usr/src/app/src/ollama/ollama.service.ts:21:24)
     at AppGateway.pubSubMessageAI (/usr/src/app/src/gateway/app.gateway.ts:98:18)
     at AppGateway.<anonymous> (/usr/src/app/node_modules/@nestjs/websockets/context/ws-proxy.js:11:32)
     at WebSocketsController.pickResult (/usr/src/app/node_modules/@nestjs/websockets/web-sockets-controller.js:91:24)

// Inside OllamaService.responseFromOllama(data), per the stack trace above.
try {
  const { customerId, message } = data;
  // Fall back to the default assistant name when none is supplied.
  const systemPromptName = data.systemPromptName || 'Al';

  // The Ollama server runs on the Docker host and is reached from inside the container.
  const ollama = new Ollama({ host: 'http://host.docker.internal:11434' });

  // Non-streaming request: nothing comes back until the whole response is ready.
  const response = await ollama.generate({
    model: 'homellm:latest',
    prompt: message,
    format: 'json',
    stream: false,
    system: `You are 'Al', a helpful AI Assistant that controls the devices in a house. Complete the following task as instructed with the information provided only.
    Services: light.turn_off(), light.turn_on(brightness,rgb_color), fan.turn_on(), fan.turn_off()
    Devices:
    light.office 'Office Light' = on;80%
    fan.office 'Office fan' = off
    light.kitchen 'Kitchen Light' = on;80%;red
    light.bedroom 'Bedroom Light' = off`,
  });
  return response;
} catch (e) {
  console.log(e);
}

I was getting responses before, but now I always get either the Headers Timeout error or "Expected a completed response."

muditjaju commented 7 months ago

Facing a similar issue

dustinnnnnn commented 7 months ago

Similar issue here. In a month of usage I had not encountered this until recently. Any known solutions?

TypeError: fetch failed
      at Object.fetch (node:internal/deps/undici/undici:14062:11) {
    cause: HeadersTimeoutError: Headers Timeout Error
        at Timeout.onParserTimeout [as callback] (...\node_modules\undici\lib\client.js:1059:28)
        at Timeout.onTimeout [as _onTimeout] (...\node_modules\undici\lib\timers.js:20:13)
        at listOnTimeout (node:internal/timers:564:17)
        at process.processTimers (node:internal/timers:507:7) {
      code: 'UND_ERR_HEADERS_TIMEOUT'
    }
}

knoopx commented 7 months ago

This happens to me when format is set to json, and only with some models.

bralca commented 6 months ago

Happens to me as well! I am using llama3 and it only happens when I send a long message.

This happens only when I use the npm package. I have llama3 installed on my machine, and when I run it directly with ollama the same message works.

How can I solve this?

isbkch commented 6 months ago

Googling the problem got me here. Same issue using Ollama3 on an M2 Ultra Mac Studio

OB42 commented 6 months ago

same with llama3

pelletier197 commented 6 months ago

Same issue. I have investigated a bit and it seems like this may be an issue with Ollama itself. I checked the server logs.

I can see the request to /api/pull:

May 17 15:40:36 pop-os ollama[746628]: [GIN] 2024/05/17 - 15:40:36 | 500 |          5m0s |       127.0.0.1 | POST     "/api/pull"

It stops at exactly 5 minutes, which cannot be a coincidence. Either the client times out, or the server times out.

After digging some more, I found this

https://github.com/ollama/ollama/blob/7e1e0086e7d18c943ff403a7ca5c2d9ce39f3f4b/server/routes.go#L57C5-L57C27

The session duration in Ollama is 5 minutes. Sooooo... I don't believe this is an issue with this library per se. Either this library handles a retry, or we ask Ollama to increase this session time. Whichever is easier.

pelletier197 commented 6 months ago

I've opened this issue on their side. Let's see what they say.

CliffHan commented 5 months ago

It looks like Ollama looks up the environment variable OLLAMA_KEEP_ALIVE and uses it as the default duration: https://github.com/ollama/ollama/blob/7e1e0086e7d18c943ff403a7ca5c2d9ce39f3f4b/server/routes.go#L317

Changing OLLAMA_KEEP_ALIVE might work.

programminghoch10 commented 4 months ago

This error is client-side; it has nothing to do with any session timeout on the Ollama server. It comes from the fetch function itself, which times out because it did not receive any response from the requested server within 5 minutes.

"fetch() is a standard. The standard doesn't say anything about timeouts. It's up to implementers to pick reasonable defaults." (source)
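Since the timeout surfaces as a TypeError: fetch failed whose cause carries code 'UND_ERR_HEADERS_TIMEOUT' (see the traces earlier in this thread), calling code can tell it apart from other failures. A minimal sketch, assuming the error keeps that shape; isHeadersTimeout is a hypothetical helper name, not part of ollama-js or undici:

```typescript
// Error code taken from the stack traces posted in this thread.
const HEADERS_TIMEOUT_CODE = "UND_ERR_HEADERS_TIMEOUT";

// Hypothetical helper: returns true when the error, or one of its nested
// causes, carries the undici headers-timeout code.
function isHeadersTimeout(err: unknown): boolean {
  let current: unknown = err;
  // Walk a few levels of the `cause` chain; the depth cap guards against cycles.
  for (let depth = 0; depth < 5 && current != null; depth++) {
    if (typeof current !== "object") return false;
    const { code, cause } = current as { code?: unknown; cause?: unknown };
    if (code === HEADERS_TIMEOUT_CODE) return true;
    current = cause;
  }
  return false;
}
```

This only identifies the failure; whether to retry, switch to streaming, or change the fetch implementation is a separate decision.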

In my case this issue only appeared when not streaming, or when other requests had to be processed first, since without streaming the server doesn't send anything until the entire response is ready.
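If that is the cause, requesting a stream and joining the chunks yourself should get response headers back early while still giving you one final string. A rough sketch, assuming each streamed chunk from generate() exposes a response string; collectStream and GenerateChunk are hypothetical names, and this does not help when the wait is caused by other queued requests:

```typescript
// Minimal shape of a streamed chunk (assumption: generate() chunks carry
// their text in a `response` field).
interface GenerateChunk {
  response: string;
}

// Hypothetical helper: concatenates the text of every streamed chunk.
async function collectStream(chunks: AsyncIterable<GenerateChunk>): Promise<string> {
  let text = "";
  for await (const chunk of chunks) {
    text += chunk.response;
  }
  return text;
}

// Intended use (not executed here; needs a reachable Ollama server):
//   const stream = await ollama.generate({ model: "llama3", prompt, stream: true });
//   const text = await collectStream(stream);
```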

I only tested it on Node.js; browsers probably behave differently.

An easy solution is to just use another fetch implementation.

I tested node-fetch and undici with a request taking over 5 minutes to complete. undici also raises the exact same error, while node-fetch doesn't have this timeout (or it is way higher).

import { Ollama } from "ollama"
import fetch from "node-fetch"

const ollama = new Ollama({
  fetch: fetch as any
})

The "as any" cast is only necessary in TypeScript, and it has to be there because the fetch method signatures don't match exactly.

To test I started a big request on an underpowered machine and logged start and end time:

start 2024-06-29T20:06:47.856Z
end 2024-06-29T20:24:20.325Z

That is about 18 minutes of request time on Node.js, without streaming and without a Headers Timeout error.

Be advised that in this case, switching to node-fetch broke streaming requests.
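An alternative that keeps undici's fetch is to raise its timeout instead of replacing it. The sketch below is untested against this issue and rests on assumptions: that the undici package is installed, that its Agent accepts headersTimeout and bodyTimeout in milliseconds, and that undici's fetch honours a dispatcher option. withDispatcher and FetchLike are hypothetical names used for illustration:

```typescript
// Loose fetch signature so the helper works with any fetch implementation.
type FetchLike = (input: unknown, init?: Record<string, unknown>) => Promise<unknown>;

// Hypothetical helper: returns a fetch that adds `dispatcher` to every call,
// keeping whatever options the caller passed.
function withDispatcher(baseFetch: FetchLike, dispatcher: unknown): FetchLike {
  return (input, init) => baseFetch(input, { ...init, dispatcher });
}

// Intended use (not executed here; assumes `undici` and `ollama` are installed):
//   import { Agent, fetch as undiciFetch } from "undici";
//   import { Ollama } from "ollama";
//   const thirtyMinutes = 30 * 60 * 1000;
//   const agent = new Agent({ headersTimeout: thirtyMinutes, bodyTimeout: thirtyMinutes });
//   const ollama = new Ollama({ fetch: withDispatcher(undiciFetch as any, agent) as any });
```

Pairing undici's own fetch with its own Agent avoids mixing an Agent from the npm package with the undici copy bundled inside Node.js.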

ollama / ollama-js

cause: HeadersTimeoutError: Headers Timeout Error #72