lmstudio-ai / lmstudio-bug-tracker

Bug tracking for the LM Studio desktop application

Embedding issue with 0.3.3 #142

Closed: usernotnull closed this issue 1 month ago

usernotnull commented 1 month ago

I'm using LM Studio to do embeddings with Obsidian. With version 0.3.3, the embeddings stopped working.

Here is a sample error I'm getting:

2024-10-05 15:17:50  [INFO] Received request to embed multiple:  ["A Folder > A Title\nBLOCK NOT FOUND (no line_start)"]
2024-10-05 15:17:50 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:17:50  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:17:50  [INFO] Received request to embed multiple:  ["Another Folder> Another Title:\n---\nup: [\"[[Somewhere]]\"]\nrelated: []\ntags: [o..."]
2024-10-05 15:17:50 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:17:50 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:17:50 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}

The embedding was fine up to 0.3.2.

You can find the issue here: https://github.com/brianpetro/obsidian-smart-connections/issues/645#issuecomment-2395250030
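
For context, the plugin reaches LM Studio through its OpenAI-compatible local server. A minimal sketch of the kind of request involved, assuming LM Studio's default port (1234) and a placeholder model identifier (illustrative only, not the plugin's exact code):

async function embedTexts(texts: string[]): Promise<number[][]> {
  // POST to LM Studio's OpenAI-compatible embeddings endpoint.
  const res = await fetch("http://localhost:1234/v1/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "nomic-embed-text-v1.5", // placeholder: whatever embedding model is loaded
      input: texts, // e.g. ["Another Folder> Another Title:\n---\nup: ..."]
    }),
  });
  if (!res.ok) {
    // On 0.3.3 this surfaces as the "Failed to embed string" server error above.
    throw new Error(`Embedding request failed: ${res.status} ${await res.text()}`);
  }
  const json = await res.json();
  // OpenAI-compatible response shape: { data: [{ embedding: number[] }, ...] }
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}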

yagil commented 1 month ago

Thanks @usernotnull, we will investigate.

mattjcly commented 1 month ago

Hi @usernotnull, would you be able to provide me with the following information:

  1. What operating system are you using? (Windows, macOS, Linux)
  2. What LM Runtime are you using? You can find this information on the "Developer" -> "LM Runtimes" page.
  3. If possible, would you be able to provide me with the entire exact input string from 2024-10-05 15:17:50 [INFO] Received request to embed multiple: ["Another Folder> Another Title:\n---\nup: [\"[[Somewhere]]\"]\nrelated: []\ntags: [o..."] that results in this error, so that we can try to reproduce the error ourselves?

Thanks for reporting this; we want to resolve it ASAP.
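
For question 3, one way to isolate the exact failing input is to embed each chunk in its own request and log whichever one errors. A small sketch, reusing the hypothetical embedTexts helper from the sketch above:

async function findFailingChunks(texts: string[]): Promise<string[]> {
  const failing: string[] = [];
  for (const t of texts) {
    try {
      await embedTexts([t]); // one chunk per request to pinpoint the culprit
    } catch (err) {
      // JSON.stringify keeps escapes (\n etc.) visible, matching the log format.
      console.error("Chunk failed to embed:", JSON.stringify(t), err);
      failing.push(t);
    }
  }
  return failing;
}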

jagai commented 1 month ago

@mattjcly This is occurring so far on macOS Sequoia (myself) and Win11 (@usernotnull).

As for myself, I'm using the Metal llama.cpp runtime v1.1.9. I tested on v1.1.8 too, and it also fails after upgrading to LM Studio 0.3.3.

2024-10-07 02:36:55  [INFO] Received request to embed multiple:  ["resources > arrayable > dataset-feedbacks:\n---\ntitle: Feedback Datasets\nslug: research-employee-enga..."]
2024-10-07 02:36:55 [DEBUG] llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-07 02:36:55 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-07 02:36:55 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}

This is the file it fails on: pd-research-employee_net_promoter_score.md.md.zip

Let me know if there's anything else that can be done to help 🙏🏻

usernotnull commented 1 month ago

Correct. Win11.

runtime: CUDA llama.cpp (Windows) v1.1.10 (gguf)
vramCapacity: 6.00 GB

CPU info:
{
  "result": {
    "code": "Success",
    "message": ""
  },
  "cpuInfo": {
    "architecture": "x86_64",
    "supportedInstructionSetExtensions": [
      "AVX",
      "AVX2"
    ]
  }
}

GPU info:
{
  "result": {
    "code": "Success",
    "message": ""
  },
  "gpuInfo": [
    {
      "name": "NVIDIA GeForce RTX 3060 Laptop GPU",
      "deviceId": 0,
      "totalMemoryCapacityBytes": 6441926656,
      "dedicatedMemoryCapacityBytes": 0,
      "integrationType": "Discrete",
      "detectionPlatform": "CUDA",
      "detectionPlatformVersion": "",
      "otherInfo": {}
    }
  ]
}
mattjcly commented 1 month ago

@jagai and @usernotnull, I much appreciate the quick response and info! Very useful.

Could I ask: how exactly are you calling into the LM Studio embeddings API through Obsidian with this file? Or, even simpler, how could I reproduce the exact same situation using Obsidian and this file myself?

usernotnull commented 1 month ago

@mattjcly we're actually unofficially modifying a plugin (Smart Connections) to route its embeddings through LM Studio, because doing the embeddings inside Obsidian doesn't take advantage of the full power of the GPU, so embedding takes ages. Officially enabling custom endpoints is on the plugin's roadmap, but an early fix for this issue that arose in 0.3.3 would be appreciated.

The linked thread describes the steps to set up Obsidian's Smart Connections plugin to do so. We can assist if you have any questions.
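
The general shape of such a modification, purely as an illustrative sketch (the interface and names here are hypothetical; the thread linked above has the actual steps):

// Hypothetical adapter boundary; Smart Connections' real internals differ.
interface EmbeddingAdapter {
  embed(texts: string[]): Promise<number[][]>;
}

// Route embedding requests to the local LM Studio server (GPU-accelerated)
// instead of the plugin's built-in embedder, reusing embedTexts from above.
const lmStudioAdapter: EmbeddingAdapter = {
  embed: (texts) => embedTexts(texts),
};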

mattjcly commented 1 month ago

We have a reproduction of the issue and a fix. Thanks for all your help. The fix will be released ASAP.

jagai commented 1 month ago

@mattjcly, thanks a ton for handling this so quickly! I noticed the update while I was working on providing more details 😄

For your information, and for future reference, here is a link to the modifications made in Smart Connections to offload embeddings to LM Studio:

https://github.com/brianpetro/obsidian-smart-connections/issues/645#issuecomment-2371889927

yagil commented 1 month ago

Should be fixed in 0.3.4, available at https://lmstudio.ai/download

usernotnull commented 1 month ago

Closing the issue. Thanks for the prompt support, gents.