nklsbckmnn opened this issue 2 months ago
@nklsbckmnn thanks for the bug report, we'll investigate. Can you please check if there's anything that looks related in the app logs? You can find them by clicking on the bottom left of the screen and then selecting "open app logs".
Nothing out of the ordinary and nothing around the time in question.
I can reliably reproduce this. I noticed that the status indicator in the list of loaded models was stuck on "Processing". I also noticed that, at least in the cases where "Running chat completion" was not followed by "Accumulating tokens ... (stream = false)", the "Client disconnected. Stopping generation..." message was followed by entries like this in the log:
[LM STUDIO SERVER] [gemma-2-9b-it-q8_0-f16] Generated prediction: {
  "id": "chatcmpl-nc8n345tu5ohg4fz3lc5",
  "object": "chat.completion",
  "created": 1725422160,
  "model": "gemma-2-9b-it-q8_0-f16",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": ""
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  },
  "system_fingerprint": "gemma-2-9b-it-q8_0-f16"
}
When I unloaded the model, I got "[ERROR] Model unloaded.. Error Data: n/a, Additional Data: n/a" in the server log. I'm using "bartowski/gemma-2-9b-it-GGUF/gemma-2-9b-it-Q8_0-f16.gguf". Unlike the last time I encountered this, I was able to return to a working state just by reloading the model, no app restart needed. I will run with verbose logging next. Thanks for looking into this.
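For anyone who wants to script a check for this state, the symptom is visible in the response payload itself (empty "content" plus all-zero usage, as in the log excerpt above). A minimal sketch, assuming LM Studio's OpenAI-compatible /v1/chat/completions endpoint on the default port 1234 and the requests library; the prompt is a placeholder:

```python
import requests

# Sketch only. Assumptions not taken from the issue: the server listens on the
# default http://localhost:1234, and the prompt is a placeholder. The model
# name matches the log excerpt above.
URL = "http://localhost:1234/v1/chat/completions"

payload = {
    "model": "gemma-2-9b-it-q8_0-f16",
    "messages": [{"role": "user", "content": "Say hello."}],
    "stream": False,
}

resp = requests.post(URL, json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()

choice = data["choices"][0]
usage = data.get("usage", {})

# In the broken state the log shows an empty "content" and all-zero token
# counts, so treat that combination as the stuck-"Processing" symptom.
if choice["message"]["content"] == "" and usage.get("total_tokens", 0) == 0:
    print("Empty prediction with zero usage; model looks stuck, try reloading it.")
else:
    print(choice["message"]["content"])
```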
Using the local server in 0.3.2, I regularly run into the problem that the server stops working after a few hundred requests, resulting in timeouts. Restarting the server within the app does not help; only restarting the app does. In the server log, after "Received POST request [...]" I get "Running chat completion on conversation with 1 messages." and then, 20 minutes later, instead of the usual "Generated prediction", I get "Client disconnected. Stopping generation... if the model is busy processing the prompt, it will finish first" immediately followed by "Client disconnected. Stopping generation...".
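A request loop of roughly this shape should surface the problem (a sketch only; the endpoint URL, default port 1234, prompt, request count, and timeout are assumptions, not taken from the report):

```python
import requests

# Rough reproduction sketch, not the exact workload from the report. Assumed:
# default server address http://localhost:1234, non-streaming requests, and a
# client-side timeout so the client disconnects the way the server log describes.
URL = "http://localhost:1234/v1/chat/completions"
PAYLOAD = {
    "model": "gemma-2-9b-it-q8_0-f16",
    "messages": [{"role": "user", "content": "Write one sentence about the weather."}],
    "stream": False,
}

for i in range(500):
    try:
        resp = requests.post(URL, json=PAYLOAD, timeout=120)
        resp.raise_for_status()
        content = resp.json()["choices"][0]["message"]["content"]
        print(f"request {i}: ok, {len(content)} chars")
    except requests.exceptions.Timeout:
        # Once the server enters the bad state, requests hang until the client
        # gives up here, which matches the "Client disconnected. Stopping
        # generation..." entries in the server log.
        print(f"request {i}: timed out")
        break
```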