LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Horde worker not working under linux #1069

ByteBrigand closed this issue 4 weeks ago

ByteBrigand commented 4 weeks ago

The embedded Horde worker accepts jobs and generates tokens, but it fails to submit the generated tokens back to the AI Horde.

Environment:

Command Used: (Commands have been anonymized)

./koboldcpp-linux-x64-cuda1210 llama-2-7b.gguf --skiplauncher --usecublas lowvram --multiuser 2 --gpulayers -1 --hordeworkername "worker_name" --hordekey "your_horde_key" --hordemaxctx 2048 --hordegenlen 512 --hordemodelname "llama-2-7b" --remotetunnel

Console Output: (Console output has been anonymized)

Load Text Model OK: True
Embedded KoboldAI Lite loaded.
Embedded API docs loaded.
===
Embedded Horde Worker 'worker_name' Starting...
(To use your own Horde Bridge/Scribe worker instead, don't set your API key)
Downloading Cloudflare Tunnel for Linux...

Attempting to start tunnel thread...
Starting Cloudflare Tunnel for Linux, please wait...
[06:42:30] Embedded Horde Worker 'worker_name' is started.
Your remote Kobold API can be found at https://your-remote-url/api
Your remote OpenAI Compatible API can be found at https://your-remote-url/v1
======

Your remote tunnel is ready, please connect to https://your-remote-url

CtxLimit:102/576, Amt:100/100, Process:0.19s (95.5ms/T = 10.47T/s), Generate:9.86s (98.6ms/T = 10.14T/s), Total:10.06s (9.95T/s)
CtxLimit:102/576, Amt:100/100, Process:0.12s (115.0ms/T = 8.70T/s), Generate:9.91s (99.1ms/T = 10.09T/s), Total:10.02s (9.98T/s)
[06:47:01] Job received from https://aihorde.net for 512 tokens and 2048 max context. Starting generation...

CtxLimit:98/2048, Amt:57/512, Process:0.75s (18.7ms/T = 53.40T/s), Generate:5.74s (100.7ms/T = 9.93T/s), Total:6.49s (8.79T/s)
[06:51:29] Error: HTTP Error 404: NOT FOUND - {"message":"Processing Job with ID <job_id> does not exist. You have requested this URI [/api/v2/generate/text/submit] but did you mean /api/v2/generate/text/submit or /api/v2/generate/submit or /api/v2/generate/text/status/<string:id> ?","rc":"InvalidJobID"}

Make sure your Horde API key and worker name is valid!
[06:51:29] Error, Job submit failed.

LostRuins commented 4 weeks ago

You took 4 minutes to complete that generation. I think that's probably too long and the request expired. Was your terminal paused or something?

ByteBrigand commented 4 weeks ago

> You took 4 minutes to complete that generation. I think that's probably too long and the request expired. Was your terminal paused or something?

There were three generations. I ran the first two myself through the local koboldcpp instance, and both took around 10 seconds. The third was a job from aihorde.net and completed in 6.5 seconds. After it completed, there was no output in the terminal for several minutes (but the app neither froze nor crashed).
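
To make the timing concrete, here is a minimal sketch that compares the wall-clock gap between the "Job received" line and the next submit or error line against the `Total:` time reported for the generation. It is not part of koboldcpp; the `submit_delay` helper and the regular expressions are hypothetical and assume the console format shown in the log above.

```python
import re
from datetime import datetime

# Excerpt of the failing job from the console output above.
LOG = """\
[06:47:01] Job received from https://aihorde.net for 512 tokens and 2048 max context. Starting generation...

CtxLimit:98/2048, Amt:57/512, Process:0.75s (18.7ms/T = 53.40T/s), Generate:5.74s (100.7ms/T = 9.93T/s), Total:6.49s (8.79T/s)
[06:51:29] Error, Job submit failed.
"""

# Assumed line shapes: "[HH:MM:SS] message" and "... Total:<seconds>s ...".
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")
TOTAL = re.compile(r"Total:([\d.]+)s")


def submit_delay(log):
    """Return (gap_s, generation_s, unaccounted_s) for the first Horde job in the log."""
    received = finished = gen_seconds = None
    for line in log.splitlines():
        stamped = STAMP.match(line)
        if stamped:
            when = datetime.strptime(stamped.group(1), "%H:%M:%S")
            message = stamped.group(2)
            if received is None and message.startswith("Job received"):
                received = when
            elif received is not None and finished is None and (
                message.startswith("Submitted") or message.startswith("Error")
            ):
                finished = when
        elif received is not None and gen_seconds is None:
            total = TOTAL.search(line)
            if total:
                gen_seconds = float(total.group(1))
    if received is None or finished is None or gen_seconds is None:
        raise ValueError("log does not contain a complete job")
    gap = (finished - received).total_seconds()
    if gap < 0:
        gap += 24 * 60 * 60  # timestamps carry no date, so handle a midnight rollover
    return gap, gen_seconds, gap - gen_seconds


gap, gen, idle = submit_delay(LOG)
print(f"received->error: {gap:.0f}s, generation: {gen:.2f}s, unaccounted: {idle:.1f}s")
# prints "received->error: 268s, generation: 6.49s, unaccounted: 261.5s"
```

For this log the generation itself accounts for only about 6.5 of the 268 seconds between receiving the job and the failed submit, which is consistent with the process stalling after generation rather than generating slowly.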

LostRuins commented 4 weeks ago

Perhaps the terminal got paused. That sometimes happens when it receives focus.

Try again and see if the same thing happens?

ByteBrigand commented 4 weeks ago

> Perhaps the terminal got paused. That sometimes happens when it receives focus.
>
> Try again and see if the same thing happens?

I ran it again, this time in a "screen" session and without the remote tunnel.

Load Text Model OK: True
Embedded KoboldAI Lite loaded.
Embedded API docs loaded.
Starting Kobold API on port 5001 at http://localhost:5001/api/
Starting OpenAI Compatible API on port 5001 at http://localhost:5001/v1/
===
Embedded Horde Worker '[Worker Name Redacted]' Starting...
(To use your own Horde Bridge/Scribe worker instead, don't set your API key)======
Please connect to custom endpoint at http://localhost:5001

[16:15:35] Embedded Horde Worker '[Worker Name Redacted]' is started.
[16:19:18] No recent jobs, entering low power mode...

[16:21:29] Job received from https://aihorde.net for 100 tokens and 576 max context. Starting generation...

CtxLimit:103/576, Amt:100/100, Process:0.21s (71.7ms/T = 13.95T/s), Generate:9.94s (99.4ms/T = 10.06T/s), Total:10.16s (9.84T/s)
[16:21:45] Submitted [Job ID Redacted] and earned 1 kudos
[Total:1 kudos, Time:000h:06m:13s, Jobs:1, EarnRate:12 kudos/hr]

[16:22:38] Job received from https://aihorde.net for 256 tokens and 1024 max context. Starting generation...

CtxLimit:222/1024, Amt:181/256, Process:0.78s (19.4ms/T = 51.48T/s), Generate:19.73s (109.0ms/T = 9.17T/s), Total:20.51s (8.82T/s)
[16:23:05] Submitted [Job ID Redacted] and earned 1 kudos
[Total:2 kudos, Time:000h:07m:33s, Jobs:2, EarnRate:19 kudos/hr]

Well, that worked. Thanks for helping. Maybe this should be mentioned in the FAQ or something.