ArtificialAmateur opened 5 months ago
I'd love this; the embedded WASM models don't seem to saturate the CPU/GPU, so it takes ages...
Makes sense. Thanks for the feature request 🌴
@daaain @ArtificialAmateur While this isn't an ideal solution, I did manage to set up gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf on LM Studio to work in @brianpetro's incredible smart-connections plugin.
What I've done is essentially eliminate the checks on api.openai.com and instead refactor the requests to point at my local LM Studio server. Just beware that by doing so you're giving up the ability to use the OpenAI embeddings, since this refactors the components that connect to them rather than adding a new option alongside.
This is a quick and dirty fix for those who'd rather handle the embeddings locally, and it's far from ideal, but it works really well for my use case.
Enjoy!
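Before editing anything, it's worth sanity-checking that the LM Studio server actually answers OpenAI-style embedding requests. Here's a minimal smoke test you can paste into a DevTools console (this assumes the server is on its default port 1234 and the GGUF model above is loaded; adjust both if yours differ):

// Smoke test: POST one string to the local /v1/embeddings endpoint
const res = await fetch("http://127.0.0.1:1234/v1/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
    input: "hello world"
  })
});
const json = await res.json();
// Expect an OpenAI-shaped response: { data: [ { embedding: [...] } ], ... }
console.log(json.data?.[0]?.embedding?.length);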
In main.js, I've refactored as follows:
Before:
var SmartEmbedOpenAIAdapter = class extends SmartEmbedAdapter {
  constructor(smart_embed) {
    super(smart_embed);
    this.model_key = smart_embed.opts.model_key || "text-embedding-ada-002";
    this.endpoint = "https://api.openai.com/v1/embeddings";
    this.max_tokens = 8191;
    this.dims = smart_embed.opts.dims || 1536;
    this.enc = null;
    this.request_adapter = smart_embed.env.opts.request_adapter;
  }
After:
var SmartEmbedOpenAIAdapter = class extends SmartEmbedAdapter {
  constructor(smart_embed) {
    super(smart_embed);
    this.model_key = smart_embed.opts.model_key;
    this.endpoint = "http://127.0.0.1:1234/v1/embeddings";
    this.max_tokens = 2048;
    this.enc = null;
    this.request_adapter = smart_embed.env.opts.request_adapter;
  }
In var models_default, I've added the following entry after Xenova/jina-embeddings-v2-base-zh, so the model is selectable in the Smart Connections plugin:
"gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {
id: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
batch_size: 1,
dims: 512,
max_tokens: 2048,
name: "LLM Studio Nomic",
description: "API, 2,048 tokens, 512 dim",
endpoint: "http://127.0.0.1:1234/v1/embeddings",
adapter: "openai"
},
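One thing worth double-checking: every other entry in models_default keys its config with model_key, while this one uses id. Since the adapter constructor above reads smart_embed.opts.model_key, including model_key as well seems safer; this is an untested assumption on my part:

"gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {
  // model_key added for consistency with the other entries; whether `id`
  // is actually read anywhere is unverified
  model_key: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
  batch_size: 1,
  dims: 512,
  max_tokens: 2048,
  name: "LM Studio Nomic",
  description: "API, 2,048 tokens, 512 dim",
  endpoint: "http://127.0.0.1:1234/v1/embeddings",
  adapter: "openai"
},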
In var transformers_connector, I've added the same entry to the embedded models list. I'll save you the trouble and provide the entire string to replace:
var transformers_connector = '// models.json\nvar models_default = {\n "TaylorAI/bge-micro-v2": {\n model_key: "TaylorAI/bge-micro-v2",\n batch_size: 1,\n dims: 384,\n max_tokens: 512,\n name: "BGE-micro-v2",\n description: "Local, 512 tokens, 384 dim",\n adapter: "transformers"\n },\n "andersonbcdefg/bge-small-4096": {\n model_key: "andersonbcdefg/bge-small-4096",\n batch_size: 1,\n dims: 384,\n max_tokens: 4096,\n name: "BGE-small-4K",\n description: "Local, 4,096 tokens, 384 dim",\n adapter: "transformers"\n },\n "Xenova/jina-embeddings-v2-base-zh": {\n model_key: "Xenova/jina-embeddings-v2-base-zh",\n batch_size: 1,\n dims: 512,\n max_tokens: 8192,\n name: "Jina-v2-base-zh-8K",\n description: "Local, 8,192 tokens, 512 dim, Chinese/English bilingual",\n adapter: "transformers"\n },\n "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {\n id: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",\n batch_size: 1,\n dims: 512,\n max_tokens: 2048,\n name: "LM Studio Nomic",\n description: "API, 2,048 tokens, 512 dim",\n endpoint: "http://127.0.0.1:1234/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-small": {\n model_key: "text-embedding-3-small",\n batch_size: 50,\n dims: 1536,\n max_tokens: 8191,\n name: "OpenAI Text-3 Small",\n description: "API, 8,191 tokens, 1,536 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-large": {\n model_key: "text-embedding-3-large",\n batch_size: 50,\n dims: 3072,\n max_tokens: 8191,\n name: "OpenAI Text-3 Large",\n description: "API, 8,191 tokens, 3,072 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-small-512": {\n model_key: "text-embedding-3-small",\n batch_size: 50,\n dims: 512,\n max_tokens: 8191,\n name: "OpenAI Text-3 Small - 512",\n description: "API, 8,191 tokens, 512 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-large-256": {\n model_key: "text-embedding-3-large",\n batch_size: 50,\n dims: 256,\n max_tokens: 8191,\n name: "OpenAI Text-3 Large - 256",\n description: "API, 8,191 tokens, 256 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-ada-002": {\n model_key: "text-embedding-ada-002",\n batch_size: 50,\n dims: 1536,\n max_tokens: 8191,\n name: "OpenAI Ada",\n description: "API, 8,191 tokens, 1,536 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "Xenova/jina-embeddings-v2-small-en": {\n model_key: "Xenova/jina-embeddings-v2-small-en",\n batch_size: 1,\n dims: 512,\n max_tokens: 8192,\n name: "Jina-v2-small-en",\n description: "Local, 8,192 tokens, 512 dim",\n adapter: "transformers"\n },\n "nomic-ai/nomic-embed-text-v1.5": {\n model_key: "nomic-ai/nomic-embed-text-v1.5",\n batch_size: 1,\n dims: 256,\n max_tokens: 8192,\n name: "Nomic-embed-text-v1.5",\n description: "Local, 8,192 tokens, 256 dim",\n adapter: "transformers"\n },\n "Xenova/bge-small-en-v1.5": {\n model_key: "Xenova/bge-small-en-v1.5",\n batch_size: 1,\n dims: 384,\n max_tokens: 512,\n name: "BGE-small",\n description: "Local, 512 tokens, 384 dim",\n adapter: "transformers"\n },\n "nomic-ai/nomic-embed-text-v1": {\n model_key: "nomic-ai/nomic-embed-text-v1",\n batch_size: 1,\n dims: 768,\n max_tokens: 2048,\n name: "Nomic-embed-text",\n description: "Local, 2,048 tokens, 768 dim",\n adapter: "transformers"\n }\n};\n\n// smart_embed_model.js\nvar SmartEmbedModel = class _SmartEmbedModel {\n /**\n * Create a SmartEmbed instance.\n * @param {string} env - The environment to use.\n * @param {object} opts - Full model configuration object or at least a model_key and adapter\n */\n constructor(env, opts = {}) {\n this.env = env;\n this.opts = {\n ...models_default[opts.embed_model_key],\n ...opts\n };\n console.log(this.opts);\n if (!this.opts.adapter)\n return console.warn("SmartEmbedModel adapter not set");\n if (!this.env.opts.smart_embed_adapters[this.opts.adapter])\n return console.warn(`SmartEmbedModel adapter ${this.opts.adapter} not found`);\n this.opts.use_gpu = !!navigator.gpu && this.opts.gpu_batch_size !== 0;\n if (this.opts.adapter === "transformers" && this.opts.use_gpu)\n this.opts.batch_size = this.opts.gpu_batch_size || 10;\n this.adapter = new this.env.opts.smart_embed_adapters[this.opts.adapter](this);\n }\n /**\n * Used to load a model with a given configuration.\n * @param {*} env\n * @param {*} opts\n */\n static async load(env, opts = {}) {\n try {\n const model2 = new _SmartEmbedModel(env, opts);\n await model2.adapter.load();\n env.smart_embed_active_models[opts.embed_model_key] = model2;\n return model2;\n } catch (error) {\n console.error(`Error loading model ${opts.model_key}:`, error);\n return null;\n }\n }\n /**\n * Count the number of tokens in the input string.\n * @param {string} input - The input string to process.\n * @returns {Promise<number>} A promise that resolves with the number of tokens.\n */\n async count_tokens(input) {\n return this.adapter.count_tokens(input);\n }\n /**\n * Embed the input into a numerical array.\n * @param {string|Object} input - The input to embed. Can be a string or an object with an "embed_input" property.\n * @returns {Promise<Object>} A promise that resolves with an object containing the embedding vector at `vec` and the number of tokens at `tokens`.\n */\n async embed(input) {\n if (typeof input === "string")\n input = { embed_input: input };\n return (await this.embed_batch([input]))[0];\n }\n /**\n * Embed a batch of inputs into arrays of numerical arrays.\n * @param {Array<string|Object>} inputs - The array of inputs to embed. Each input can be a string or an object with an "embed_input" property.\n * @returns {Promise<Array<Object>>} A promise that resolves with an array of objects containing `vec` and `tokens` properties.\n */\n async embed_batch(inputs) {\n return await this.adapter.embed_batch(inputs);\n }\n get batch_size() {\n return this.opts.batch_size || 1;\n }\n get max_tokens() {\n return this.opts.max_tokens || 512;\n }\n};\n\n// adapters/_adapter.js\nvar SmartEmbedAdapter = class {\n constructor(smart_embed) {\n this.smart_embed = smart_embed;\n }\n async load() {\n throw new Error("Not implemented");\n }\n async count_tokens(input) {\n throw new Error("Not implemented");\n }\n async embed(input) {\n throw new Error("Not implemented");\n }\n async embed_batch(input) {\n throw new Error("Not implemented");\n }\n};\n\n// adapters/transformers.js\nvar SmartEmbedTransformersAdapter = class extends SmartEmbedAdapter {\n constructor(smart_embed) {\n super(smart_embed);\n this.model = null;\n this.tokenizer = null;\n }\n get batch_size() {\n if (this.use_gpu && this.smart_embed.opts.gpu_batch_size)\n return this.smart_embed.opts.gpu_batch_size;\n return this.smart_embed.opts.batch_size || 1;\n }\n get max_tokens() {\n return this.smart_embed.opts.max_tokens || 512;\n }\n get use_gpu() {\n return this.smart_embed.opts.use_gpu || false;\n }\n async load() {\n const { pipeline, env, AutoTokenizer } = await import("@xenova/transformers");\n env.allowLocalModels = false;\n const pipeline_opts = {\n quantized: true\n };\n if (this.use_gpu) {\n console.log("[Transformers] Using GPU");\n pipeline_opts.device = "webgpu";\n pipeline_opts.dtype = "fp32";\n } else {\n console.log("[Transformers] Using CPU");\n env.backends.onnx.wasm.numThreads = 8;\n }\n this.model = await pipeline("feature-extraction", this.smart_embed.opts.model_key, pipeline_opts);\n this.tokenizer = await AutoTokenizer.from_pretrained(this.smart_embed.opts.model_key);\n }\n async count_tokens(input) {\n if (!this.tokenizer)\n await this.load();\n const { input_ids } = await this.tokenizer(input);\n return { tokens: input_ids.data.length };\n }\n async embed_batch(inputs) {\n if (!this.model)\n await this.load();\n const filtered_inputs = inputs.filter((item) => item.embed_input?.length > 0);\n if (!filtered_inputs.length)\n return [];\n if (filtered_inputs.length > this.batch_size) {\n throw new Error(`Input size (${filtered_inputs.length}) exceeds maximum batch size (${this.batch_size})`);\n }\n const tokens = await Promise.all(filtered_inputs.map((item) => this.count_tokens(item.embed_input)));\n const embed_inputs = await Promise.all(filtered_inputs.map(async (item, i) => {\n if (tokens[i].tokens < this.max_tokens)\n return item.embed_input;\n let token_ct = tokens[i].tokens;\n let truncated_input = item.embed_input;\n while (token_ct > this.max_tokens) {\n const pct = this.max_tokens / token_ct;\n const max_chars = Math.floor(truncated_input.length * pct * 0.9);\n truncated_input = truncated_input.substring(0, max_chars) + "...";\n token_ct = (await this.count_tokens(truncated_input)).tokens;\n }\n tokens[i].tokens = token_ct;\n return truncated_input;\n }));\n try {\n const resp = await this.model(embed_inputs, { pooling: "mean", normalize: true });\n return filtered_inputs.map((item, i) => {\n item.vec = Array.from(resp[i].data).map((val) => Math.round(val * 1e8) / 1e8);\n item.tokens = tokens[i].tokens;\n return item;\n });\n } catch (err) {\n console.error("error_embedding_batch", err);\n return Promise.all(filtered_inputs.map((item) => this.embed(item.embed_input)));\n }\n }\n};\n\n// build/transformers_iframe_script.js\nvar model = null;\nvar smart_env = {\n smart_embed_active_models: {},\n opts: {\n smart_embed_adapters: {\n transformers: SmartEmbedTransformersAdapter\n }\n }\n};\nasync function processMessage(data) {\n const { method, params, id, iframe_id } = data;\n try {\n let result;\n switch (method) {\n case "init":\n console.log("init");\n break;\n case "load":\n console.log("load", params);\n model = await SmartEmbedModel.load(smart_env, { adapter: "transformers", model_key: params.model_key, ...params });\n result = { model_loaded: true };\n break;\n case "embed_batch":\n if (!model)\n throw new Error("Model not loaded");\n result = await model.embed_batch(params.inputs);\n break;\n case "count_tokens":\n if (!model)\n throw new Error("Model not loaded");\n result = await model.count_tokens(params);\n break;\n default:\n throw new Error(`Unknown method: ${method}`);\n }\n return { id, result, iframe_id };\n } catch (error) {\n console.error("Error processing message:", error);\n return { id, error: error.message, iframe_id };\n }\n}\nprocessMessage({ method: "init" });\n';
@jagai thanks for sharing this 🌴
PS: It will be easier to configure something like this without code in the future.
@daaain @ArtificialAmateur While this isn't an ideal solution, I did manage to set up nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.f32.gguf on LM Studio to work in @brianpetro's incredible smart-connections plugin. […]
I have tried your code; however, during the embedding process, LM Studio shows the error below:
2024-10-04 10:11:02 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-04 10:11:02 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-04 10:11:02 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
The .smart-env\multi folder shows incomplete embeddings, as many files are only 1 KB.
I have tried your code; however, during the embedding process, LM Studio shows the error below: […]
I'll need a little bit more info on this if possible.
Could you share which embedding model you tried, along with the version of Smart Connections? I'll do my best to help.
@jagai I am using smart-connections version 2.2.79 (2.2.80 literally just got pushed, but it doesn't affect our discussion).
This is my model loaded in LM Studio:
These are the Obsidian settings:
And main.js was edited exactly as you documented. I changed the tokens to 2048 both in the object and in the JSON string, thinking it might help, but it didn't.
Here's a txt of the js: main.txt
The embedding error happens on certain files, but it's hard to pin down the cause, since I have lots of files and haven't yet managed a run with 0 errors.
EDIT: I have renamed the files, removed metadata, and cleaned the texts of everything that breaks JSON (,.\/*? etc.), and I still get the same issue. So the problem is not the content of the notes.
@usernotnull that's cool, thanks for sharing 🌴
@usernotnull Could you try switching LM Studio to gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf, give it another go, and let me know how it goes?
@jagai unfortunately same issue:
2024-10-05 15:17:50 [INFO] Received request to embed multiple: ["A Folder > A Title\nBLOCK NOT FOUND (no line_start)"]
2024-10-05 15:17:50 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:17:50 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:17:50 [INFO] Received request to embed multiple: ["Another Folder> Another Title:\n---\nup: [\"[[Somewhere]]\"]\nrelated: []\ntags: [o..."]
2024-10-05 15:17:50 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:17:50 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:17:50 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
I also notice this BLOCK NOT FOUND (no line_start) issue with any local embedding model.
I went ahead and tested it in a sandbox vault; same issue:
2024-10-05 15:36:32 [INFO] Received request to embed multiple: ["Plugins make Obsidian special for you:\nWe started making Obsidian with plugins in mind because every..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Plugins make Obsidian special for you\nWe started making Obsidian with plugins in mind because everyo..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Plugins make Obsidian special for you\n## Wild community plugins\r\n\r\nPlugins not just give Obsidian mo..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Vault is just a local folder:\nDifferent than most note-taking apps out there, an Obsidian vault is n..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Vault is just a local folder\nDifferent than most note-taking apps out there, an Obsidian vault is no..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here:\nHi, welcome to Obsidian!\n\n---\n\n## I'm interested in Obsidian\n\nFirst of all, tell me a li..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here\nHi, welcome to Obsidian!\n\n---\n\n## I'm interested in Obsidian\n\nFirst of all, tell me a lit..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here\n---\n\n## I'm interested in Obsidian\n\nFirst of all, tell me a little bit about what's your ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Start Here\n## What is this place?\n\nThis is a sandbox vault in which you can test various functionali..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From plain-text note-taking:\nObsidian is similar to plain-text based note-taking apps i..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From plain-text note-taking\nObsidian is similar to plain-text based note-taking apps in..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From standard note-taking:\nGreat, that means you should already be familiar with taking..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > From standard note-taking\nGreat, that means you should already be familiar with taking ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > No prior experience:\nThere are plenty of note-taking apps out there, so congratulations..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Adventurer > No prior experience\nThere are plenty of note-taking apps out there, so congratulations ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33 [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33 [INFO] Received request to embed multiple: ["Formatting > Callout:\nAs of v0.14.0, Obsidian supports callout blocks, sometimes called \"admonitions..."]
2024-10-05 15:36:33 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:36:33 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:36:33 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
@usernotnull I couldn't reproduce the errors on my end. I'm not entirely sure it's related to Obsidian or Smart Connections. Could be something to do with LM Studio, but I'm really not sure.
Which OS are you on? Mine is Win11.
I'm on macOS Sequoia, using Obsidian on my MacBook Air M1... It would be even more difficult for me to help, as I've never tried running Obsidian or LM Studio on Windows, to be honest ☹️
@usernotnull A long shot, but since you're on Windows, perhaps giving the mixedbread-ai/mxbai-embed-large-v1 model a shot might yield better results?
@usernotnull I've managed to narrow this down to LM Studio 0.3.3; for some reason it causes embedding to fail. Tested working with LM Studio 0.3.2 and Smart Connections 2.2.81.
You can find LM Studio 0.3.2 at the bottom of the download page (https://lmstudio.ai/download).
Let me know if this works!
You did it! Thanks :)
Any idea if LM Studio is aware of this issue?
Glad it works! Woohoo 🥳 I'm not sure LM Studio is aware of the issue, though. It would probably be a good idea to let them know.
The LM Studio issue has been resolved. @jagai's temporary workaround now works well for local embeddings.
Jumping off of #302
Like the local server options for Smart Chat, similar work can be done for embeddings.
The OpenAI-format API endpoint (which LM Studio and Ollama both support) is /v1/embeddings.
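For reference, the request and response shapes are the standard OpenAI ones, so a minimal client works unchanged against any of these servers; only the base URL and model name differ. A sketch (the parameters in the usage example are placeholders, not values from this thread):

// Minimal OpenAI-format embeddings call; works against api.openai.com,
// LM Studio, or Ollama's OpenAI-compatible server. For api.openai.com you
// would also need an "Authorization: Bearer <key>" header.
async function embed(base_url, model, input) {
  const res = await fetch(`${base_url}/v1/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input })
  });
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const { data } = await res.json();
  return data.map((d) => d.embedding); // one vector per input string
}

// e.g. embed("http://127.0.0.1:1234", "nomic-embed-text-v1.5", ["hello", "world"])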