ArtificialAmateur opened 5 months ago
I'd love this; the embedded WASM models don't seem to saturate the CPU/GPU, so it takes ages...
Makes sense. Thanks for the feature request 🌴
@daaain @ArtificialAmateur While this isn't an ideal solution, I did manage to set up gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf on LM Studio to work in @brianpetro's incredible smart-connections plugin.
What I've done is essentially eliminate the checks on api.openai.com and instead refactor the requests to point at my local LM Studio server. Just beware that by doing so you're giving up the ability to use the OpenAI embeddings, since this refactors the components that connect to them rather than adding a new option alongside.
This is a quick and dirty fix for those who'd rather handle the embeddings locally, and it's far from ideal, but it works really well for my use case.
Enjoy!
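Before editing anything, it's worth sanity-checking that the LM Studio server actually answers OpenAI-style embedding requests. Here's a minimal smoke test you can paste into a DevTools console (this assumes the server is on its default port 1234 and the GGUF model above is loaded; adjust both if yours differ):

// Smoke test: POST one string to the local /v1/embeddings endpoint
const res = await fetch("http://127.0.0.1:1234/v1/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
    input: "hello world"
  })
});
const json = await res.json();
// Expect an OpenAI-shaped response: { data: [ { embedding: [...] } ], ... }
console.log(json.data?.[0]?.embedding?.length);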
In main.js, I've refactored as follows:
Before:
var SmartEmbedOpenAIAdapter = class extends SmartEmbedAdapter {
  constructor(smart_embed) {
    super(smart_embed);
    this.model_key = smart_embed.opts.model_key || "text-embedding-ada-002";
    this.endpoint = "https://api.openai.com/v1/embeddings";
    this.max_tokens = 8191;
    this.dims = smart_embed.opts.dims || 1536;
    this.enc = null;
    this.request_adapter = smart_embed.env.opts.request_adapter;
  }
After:
var SmartEmbedOpenAIAdapter = class extends SmartEmbedAdapter {
  constructor(smart_embed) {
    super(smart_embed);
    this.model_key = smart_embed.opts.model_key;
    this.endpoint = "http://127.0.0.1:1234/v1/embeddings";
    this.max_tokens = 2048;
    this.enc = null;
    this.request_adapter = smart_embed.env.opts.request_adapter;
  }
In var models_default, I've added the following entry after Xenova/jina-embeddings-v2-base-zh, so the model is selectable in the Smart Connections plugin:
"gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {
id: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
batch_size: 1,
dims: 512,
max_tokens: 2048,
name: "LLM Studio Nomic",
description: "API, 2,048 tokens, 512 dim",
endpoint: "http://127.0.0.1:1234/v1/embeddings",
adapter: "openai"
},
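One thing worth double-checking: every other entry in models_default keys its config with model_key, while this one uses id. Since the adapter constructor above reads smart_embed.opts.model_key, including model_key as well seems safer; this is an untested assumption on my part:

"gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {
  // model_key added for consistency with the other entries; whether `id`
  // is actually read anywhere is unverified
  model_key: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
  batch_size: 1,
  dims: 512,
  max_tokens: 2048,
  name: "LM Studio Nomic",
  description: "API, 2,048 tokens, 512 dim",
  endpoint: "http://127.0.0.1:1234/v1/embeddings",
  adapter: "openai"
},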
In var transformers_connector, I've added the same entry to the embedded models list. I'll save you the trouble and provide the entire string to replace:
var transformers_connector = '// models.json\nvar models_default = {\n "TaylorAI/bge-micro-v2": {\n model_key: "TaylorAI/bge-micro-v2",\n batch_size: 1,\n dims: 384,\n max_tokens: 512,\n name: "BGE-micro-v2",\n description: "Local, 512 tokens, 384 dim",\n adapter: "transformers"\n },\n "andersonbcdefg/bge-small-4096": {\n model_key: "andersonbcdefg/bge-small-4096",\n batch_size: 1,\n dims: 384,\n max_tokens: 4096,\n name: "BGE-small-4K",\n description: "Local, 4,096 tokens, 384 dim",\n adapter: "transformers"\n },\n "Xenova/jina-embeddings-v2-base-zh": {\n model_key: "Xenova/jina-embeddings-v2-base-zh",\n batch_size: 1,\n dims: 512,\n max_tokens: 8192,\n name: "Jina-v2-base-zh-8K",\n description: "Local, 8,192 tokens, 512 dim, Chinese/English bilingual",\n adapter: "transformers"\n },\n "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {\n id: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",\n batch_size: 1,\n dims: 512,\n max_tokens: 2048,\n name: "LM Studio Nomic",\n description: "API, 2,048 tokens, 512 dim",\n endpoint: "http://127.0.0.1:1234/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-small": {\n model_key: "text-embedding-3-small",\n batch_size: 50,\n dims: 1536,\n max_tokens: 8191,\n name: "OpenAI Text-3 Small",\n description: "API, 8,191 tokens, 1,536 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-large": {\n model_key: "text-embedding-3-large",\n batch_size: 50,\n dims: 3072,\n max_tokens: 8191,\n name: "OpenAI Text-3 Large",\n description: "API, 8,191 tokens, 3,072 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-small-512": {\n model_key: "text-embedding-3-small",\n batch_size: 50,\n dims: 512,\n max_tokens: 8191,\n name: "OpenAI Text-3 Small - 512",\n description: "API, 8,191 tokens, 512 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-large-256": {\n model_key: "text-embedding-3-large",\n batch_size: 50,\n dims: 256,\n max_tokens: 8191,\n name: "OpenAI Text-3 Large - 256",\n description: "API, 8,191 tokens, 256 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-ada-002": {\n model_key: "text-embedding-ada-002",\n batch_size: 50,\n dims: 1536,\n max_tokens: 8191,\n name: "OpenAI Ada",\n description: "API, 8,191 tokens, 1,536 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "Xenova/jina-embeddings-v2-small-en": {\n model_key: "Xenova/jina-embeddings-v2-small-en",\n batch_size: 1,\n dims: 512,\n max_tokens: 8192,\n name: "Jina-v2-small-en",\n description: "Local, 8,192 tokens, 512 dim",\n adapter: "transformers"\n },\n "nomic-ai/nomic-embed-text-v1.5": {\n model_key: "nomic-ai/nomic-embed-text-v1.5",\n batch_size: 1,\n dims: 256,\n max_tokens: 8192,\n name: "Nomic-embed-text-v1.5",\n description: "Local, 8,192 tokens, 256 dim",\n adapter: "transformers"\n },\n "Xenova/bge-small-en-v1.5": {\n model_key: "Xenova/bge-small-en-v1.5",\n batch_size: 1,\n dims: 384,\n max_tokens: 512,\n name: "BGE-small",\n description: "Local, 512 tokens, 384 dim",\n adapter: "transformers"\n },\n "nomic-ai/nomic-embed-text-v1": {\n model_key: "nomic-ai/nomic-embed-text-v1",\n batch_size: 1,\n dims: 768,\n max_tokens: 2048,\n name: "Nomic-embed-text",\n description: "Local, 2,048 tokens, 768 dim",\n adapter: "transformers"\n }\n};\n\n// smart_embed_model.js\nvar SmartEmbedModel = class _SmartEmbedModel {\n /**\n * Create a SmartEmbed instance.\n * @param {string} env - The environment to use.\n * @param {object} opts - Full model configuration object or at least a model_key and adapter\n */\n constructor(env, opts = {}) {\n this.env = env;\n this.opts = {\n ...models_default[opts.embed_model_key],\n ...opts\n };\n console.log(this.opts);\n if (!this.opts.adapter)\n return console.warn("SmartEmbedModel adapter not set");\n if (!this.env.opts.smart_embed_adapters[this.opts.adapter])\n return console.warn(`SmartEmbedModel adapter ${this.opts.adapter} not found`);\n this.opts.use_gpu = !!navigator.gpu && this.opts.gpu_batch_size !== 0;\n if (this.opts.adapter === "transformers" && this.opts.use_gpu)\n this.opts.batch_size = this.opts.gpu_batch_size || 10;\n this.adapter = new this.env.opts.smart_embed_adapters[this.opts.adapter](this);\n }\n /**\n * Used to load a model with a given configuration.\n * @param {*} env\n * @param {*} opts\n */\n static async load(env, opts = {}) {\n try {\n const model2 = new _SmartEmbedModel(env, opts);\n await model2.adapter.load();\n env.smart_embed_active_models[opts.embed_model_key] = model2;\n return model2;\n } catch (error) {\n console.error(`Error loading model ${opts.model_key}:`, error);\n return null;\n }\n }\n /**\n * Count the number of tokens in the input string.\n * @param {string} input - The input string to process.\n * @returns {Promise<number>} A promise that resolves with the number of tokens.\n */\n async count_tokens(input) {\n return this.adapter.count_tokens(input);\n }\n /**\n * Embed the input into a numerical array.\n * @param {string|Object} input - The input to embed. Can be a string or an object with an "embed_input" property.\n * @returns {Promise<Object>} A promise that resolves with an object containing the embedding vector at `vec` and the number of tokens at `tokens`.\n */\n async embed(input) {\n if (typeof input === "string")\n input = { embed_input: input };\n return (await this.embed_batch([input]))[0];\n }\n /**\n * Embed a batch of inputs into arrays of numerical arrays.\n * @param {Array<string|Object>} inputs - The array of inputs to embed. Each input can be a string or an object with an "embed_input" property.\n * @returns {Promise<Array<Object>>} A promise that resolves with an array of objects containing `vec` and `tokens` properties.\n */\n async embed_batch(inputs) {\n return await this.adapter.embed_batch(inputs);\n }\n get batch_size() {\n return this.opts.batch_size || 1;\n }\n get max_tokens() {\n return this.opts.max_tokens || 512;\n }\n};\n\n// adapters/_adapter.js\nvar SmartEmbedAdapter = class {\n constructor(smart_embed) {\n this.smart_embed = smart_embed;\n }\n async load() {\n throw new Error("Not implemented");\n }\n async count_tokens(input) {\n throw new Error("Not implemented");\n }\n async embed(input) {\n throw new Error("Not implemented");\n }\n async embed_batch(input) {\n throw new Error("Not implemented");\n }\n};\n\n// adapters/transformers.js\nvar SmartEmbedTransformersAdapter = class extends SmartEmbedAdapter {\n constructor(smart_embed) {\n super(smart_embed);\n this.model = null;\n this.tokenizer = null;\n }\n get batch_size() {\n if (this.use_gpu && this.smart_embed.opts.gpu_batch_size)\n return this.smart_embed.opts.gpu_batch_size;\n return this.smart_embed.opts.batch_size || 1;\n }\n get max_tokens() {\n return this.smart_embed.opts.max_tokens || 512;\n }\n get use_gpu() {\n return this.smart_embed.opts.use_gpu || false;\n }\n async load() {\n const { pipeline, env, AutoTokenizer } = await import("@xenova/transformers");\n env.allowLocalModels = false;\n const pipeline_opts = {\n quantized: true\n };\n if (this.use_gpu) {\n console.log("[Transformers] Using GPU");\n pipeline_opts.device = "webgpu";\n pipeline_opts.dtype = "fp32";\n } else {\n console.log("[Transformers] Using CPU");\n env.backends.onnx.wasm.numThreads = 8;\n }\n this.model = await pipeline("feature-extraction", this.smart_embed.opts.model_key, pipeline_opts);\n this.tokenizer = await AutoTokenizer.from_pretrained(this.smart_embed.opts.model_key);\n }\n async count_tokens(input) {\n if (!this.tokenizer)\n await this.load();\n const { input_ids } = await this.tokenizer(input);\n return { tokens: input_ids.data.length };\n }\n async embed_batch(inputs) {\n if (!this.model)\n await this.load();\n const filtered_inputs = inputs.filter((item) => item.embed_input?.length > 0);\n if (!filtered_inputs.length)\n return [];\n if (filtered_inputs.length > this.batch_size) {\n throw new Error(`Input size (${filtered_inputs.length}) exceeds maximum batch size (${this.batch_size})`);\n }\n const tokens = await Promise.all(filtered_inputs.map((item) => this.count_tokens(item.embed_input)));\n const embed_inputs = await Promise.all(filtered_inputs.map(async (item, i) => {\n if (tokens[i].tokens < this.max_tokens)\n return item.embed_input;\n let token_ct = tokens[i].tokens;\n let truncated_input = item.embed_input;\n while (token_ct > this.max_tokens) {\n const pct = this.max_tokens / token_ct;\n const max_chars = Math.floor(truncated_input.length * pct * 0.9);\n truncated_input = truncated_input.substring(0, max_chars) + "...";\n token_ct = (await this.count_tokens(truncated_input)).tokens;\n }\n tokens[i].tokens = token_ct;\n return truncated_input;\n }));\n try {\n const resp = await this.model(embed_inputs, { pooling: "mean", normalize: true });\n return filtered_inputs.map((item, i) => {\n item.vec = Array.from(resp[i].data).map((val) => Math.round(val * 1e8) / 1e8);\n item.tokens = tokens[i].tokens;\n return item;\n });\n } catch (err) {\n console.error("error_embedding_batch", err);\n return Promise.all(filtered_inputs.map((item) => this.embed(item.embed_input)));\n }\n }\n};\n\n// build/transformers_iframe_script.js\nvar model = null;\nvar smart_env = {\n smart_embed_active_models: {},\n opts: {\n smart_embed_adapters: {\n transformers: SmartEmbedTransformersAdapter\n }\n }\n};\nasync function processMessage(data) {\n const { method, params, id, iframe_id } = data;\n try {\n let result;\n switch (method) {\n case "init":\n console.log("init");\n break;\n case "load":\n console.log("load", params);\n model = await SmartEmbedModel.load(smart_env, { adapter: "transformers", model_key: params.model_key, ...params });\n result = { model_loaded: true };\n break;\n case "embed_batch":\n if (!model)\n throw new Error("Model not loaded");\n result = await model.embed_batch(params.inputs);\n break;\n case "count_tokens":\n if (!model)\n throw new Error("Model not loaded");\n result = await model.count_tokens(params);\n break;\n default:\n throw new Error(`Unknown method: ${method}`);\n }\n return { id, result, iframe_id };\n } catch (error) {\n console.error("Error processing message:", error);\n return { id, error: error.message, iframe_id };\n }\n}\nprocessMessage({ method: "init" });\n';
@jagai thanks for sharing this 🌴
PS: It will be easier to configure something like this without code in the future.
@daaain @ArtificialAmateur While this isn't an ideal solution, I did manage to set up nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.f32.gguf on LM Studio to work in @brianpetro's incredible smart-connections plugin. […]
I have tried your code; however, during the embedding process, LM Studio shows the error below:
2024-10-04 10:11:02 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-04 10:11:02 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-04 10:11:02 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
The .smart-env\multi folder shows incomplete embeddings, as many files are only 1 KB.
I have tried your code; however, during the embedding process, LM Studio shows the error below: […]
I'll need a little bit more info on this if possible.
Could you share which embedding model you tried, along with the version of Smart Connections? I'll do my best to help.
@jagai I am using smart-connections version 2.2.79 (2.2.80 literally just got pushed, but it doesn't affect our discussion).
This is my model loaded in LM Studio:
These are the Obsidian settings:
And main.js was edited exactly as you documented. I changed the tokens to 2048 both in the object and in the JSON string, thinking it might help, but it didn't.
Here's a txt of the js: main.txt
The embedding error happens on certain files, but it's hard to pin down the cause, since I have lots of files and haven't yet managed a run with 0 errors.
EDIT: I have renamed the files, removed metadata, and cleaned the texts of everything that breaks JSON (,.\/*? etc.), and I still get the same issue. So the problem is not the content of the notes.
@usernotnull that's cool, thanks for sharing 🌴
@usernotnull Could you try switching LM Studio to gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf, give it another go, and let me know how it goes?
@jagai unfortunately same issue:
2024-10-05 15:17:50 [INFO] Received request to embed multiple: ["A Folder > A Title\nBLOCK NOT FOUND (no line_start)"]
2024-10-05 15:17:50 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:17:50 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:17:50 [INFO] Received request to embed multiple: ["Another Folder> Another Title:\n---\nup: [\"[[Somewhere]]\"]\nrelated: []\ntags: [o..."]
2024-10-05 15:17:50 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:17:50 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:17:50 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
I also notice this BLOCK NOT FOUND (no line_start) issue with any local embedding model.
I went ahead and tested it in a sandbox vault; same issue:
2024-10-05 15:36:32 [INFO] Received request to embed multiple: ["Plugins make Obsidian special for you:\nWe started making Obsidian with plugins in mind because every..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Plugins make Obsidian special for you\nWe started making Obsidian with plugins in mind because everyo..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Plugins make Obsidian special for you\n## Wild community plugins\r\n\r\nPlugins not just give Obsidian mo..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Vault is just a local folder:\nDifferent than most note-taking apps out there, an Obsidian vault is n..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Vault is just a local folder\nDifferent than most note-taking apps out there, an Obsidian vault is no..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here:\nHi, welcome to Obsidian!\n\n---\n\n## I'm interested in Obsidian\n\nFirst of all, tell me a li..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here\nHi, welcome to Obsidian!\n\n---\n\n## I'm interested in Obsidian\n\nFirst of all, tell me a lit..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here\n---\n\n## I'm interested in Obsidian\n\nFirst of all, tell me a little bit about what's your ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here\n## What is this place?\n\nThis is a sandbox vault in which you can test various functionali..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From plain-text note-taking:\nObsidian is similar to plain-text based note-taking apps i..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From plain-text note-taking\nObsidian is similar to plain-text based note-taking apps in..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From standard note-taking:\nGreat, that means you should already be familiar with taking..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From standard note-taking\nGreat, that means you should already be familiar with taking ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > No prior experience:\nThere are plenty of note-taking apps out there, so congratulations..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > No prior experience\nThere are plenty of note-taking apps out there, so congratulations ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Formatting > Callout:\nAs of v0.14.0, Obsidian supports callout blocks, sometimes called \"admonitions..."]
2024-10-05 15:36:33 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:36:33 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:36:33 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
@usernotnull I couldn't reproduce the errors on my end. I'm not entirely sure it's related to Obsidian or Smart Connections. Could be something to do with LM Studio, but I'm really not sure.
Which OS are you on? Mine is Win11.
I'm on macOS Sequoia, using Obsidian on my MacBook Air M1... It would be even more difficult for me to help, as I've never tried running Obsidian or LM Studio on Windows, to be honest ☹️
@usernotnull A long shot, but since you're on Windows, perhaps giving the mixedbread-ai/mxbai-embed-large-v1 model a shot might yield better results?
@usernotnull I've managed to narrow this down to LM Studio 0.3.3; for some reason it causes embedding to fail. Tested working with LM Studio 0.3.2 and Smart Connections 2.2.81.
You can find LM Studio 0.3.2 at the bottom of the download page (https://lmstudio.ai/download).
Let me know if this works!
You did it! Thanks :)
Any idea if LM Studio is aware of this issue?
Glad it works! Woohoo 🥳 I'm not sure LM Studio is aware of the issue, though. It would probably be a good idea to let them know.
The LM Studio issue has been resolved. @jagai's temporary workaround now works well for local embeddings.
Jumping off of #302
Like the local server options for Smart Chat, similar work can be done for embeddings.
The OpenAI-format API endpoint (which LM Studio and Ollama both support) is /v1/embeddings.
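For reference, the request and response shapes are the standard OpenAI ones, so a minimal client works unchanged against any of these servers; only the base URL and model name differ. A sketch (the parameters in the usage example are placeholders, not values from this thread):

// Minimal OpenAI-format embeddings call; works against api.openai.com,
// LM Studio, or Ollama's OpenAI-compatible server. For api.openai.com you
// would also need an "Authorization: Bearer <key>" header.
async function embed(base_url, model, input) {
  const res = await fetch(`${base_url}/v1/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input })
  });
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const { data } = await res.json();
  return data.map((d) => d.embedding); // one vector per input string
}

// e.g. embed("http://127.0.0.1:1234", "nomic-embed-text-v1.5", ["hello", "world"])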