Closed spaasis closed 2 months ago
To clarify: the question is more about "what is the actual embedding API call being sent" than about debugging the error message from the API itself. Since the manual curl call works, I believe there's just a slight difference in the data being sent.
Good catch on the trailing slash! That's an Ollama specific issue since we're doing some additional URL construction. Pushed a fix to resolve that.
Thanks for verifying the curl works on your end. We don't have any debug logs for embeddings at the moment unfortunately. Your best bet would probably be to run Continue locally using our https://github.com/continuedev/continue/blob/main/CONTRIBUTING.md guidelines and set some debug breakpoints.
From the error message you have, does it seem like there is anything unusual in the chunk that is getting embedded?
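Until embeddings get debug logging, one alternative to a full local debug build is to point `apiBase` at a small local server that prints whatever it receives. The following is a minimal sketch using only the Python standard library; the names `EchoHandler`, `start_echo_server`, and `captured` are made up for illustration, and the dummy `{"embedding": []}` response is only there so the client gets a well-formed reply.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = []  # one dict per POST request received


class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        record = {
            "path": self.path,
            "headers": dict(self.headers.items()),
            "body": body,
        }
        captured.append(record)
        print(json.dumps(record, indent=2))

        # Reply with a placeholder so the caller sees a valid JSON response.
        payload = json.dumps({"embedding": []}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # silence the default per-request access log


def start_echo_server(port=0):
    """Start the echo server on 127.0.0.1 in a background thread."""
    server = HTTPServer(("127.0.0.1", port), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

For example, calling `start_echo_server(8089)` in an interactive session and temporarily setting `apiBase` to `http://127.0.0.1:8089/ollama/` (the port is arbitrary) should print the path, headers, and body of each embedding request, which can then be compared against the working curl call. The embeddings returned are empty, so this is only useful for inspecting requests.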
I'll try the local run on Monday, but here are the raw logs (file name changed, but the slashes are as they were) and the curl call for one file embedding. Every single file seems to fail, so I doubt it's related to the chunk contents:
Log:
[2024-07-27T08:08:13] Failed to generate embedding for D:/Code/Project\.editorconfig with provider: OllamaEmbeddingsProvider::nomic-embed-text:latest:
Error: Failed to embed chunk: {"detail":[{"type":"model_attributes_type","loc":["body"],"msg":"Input should be a valid dictionary or object to extract fields from",
"input":"{\"model\":\"nomic-embed-text:latest\",\"prompt\":\"# Remove the line below if you want to inherit .editorconfig settings from higher directories\\r\\nroot = true\\r\\n\\r\\n# All files\\r\\n[*]\\r\\ncharset = utf-8\\r\\n# indent_size intentionally not specified in this section.\\r\\nindent_style = space # Use soft tabs (spaces) for indentation.\\r\\ninsert_final_newline = false\\r\\ntrim_trailing_whitespace = true\\r\\n\\r\\n# ReSharper properties\\r\\nresharper_wrap_array_initializer_style = chop_if_long\\r\\nresharper_wrap_object_and_collection_initializer_style = chop_if_long\\r\\n\\r\\n# JSON files\\r\\n[*.json]\\r\\nindent_size = 2\\r\\n\\r\\n# Markdown files\\r\\n[*.md]\\r\\nindent_size = 2\\r\\ntrim_trailing_whitespace = false\\r\\n\\r\\n# PowerShell scripts\\r\\n[*.ps1]\\r\\nindent_size = 4\\r\\n\\r\\n[*.{xml,xsd}]\\r\\nmax_line_length = off\\r\\nend_of_line = lf\\r\\nindent_size = 2\\r\\n\\r\\n# Visual Studio XML project files\\r\\n[*.{csproj,vcxproj,vcxproj.filters,proj,projitems,shproj}]\\r\\nindent_size = 2\\r\\nmax_line_length = off\\r\\nend_of_line = lf\\r\\n\\r\\n# Visual Studio and .NET related XML config files\\r\\n[*.{props,targets,ruleset,config,nuspec,resx,vsixmanifest,vsct}]\\r\\nindent_size = 2\\r\\nmax_line_length = off\\r\\nend_of_line = lf\\r\\n\\r\\n# YAML files\\r\\n[*.{yml,yaml}]\\r\\nindent_size = 2\\r\\n\\r\\n# C# files\\r\\n[*.{cs,cshtml}]\\r\\n\"}"}]}
Curl:
curl -X 'POST' \
'http://sykeai:8080/ollama/api/embeddings' \
-H 'accept: application/json' \
-H 'Authorization: Bearer eyJ...' \
-H 'Content-Type: application/json' \
-d '{"model":"nomic-embed-text:latest","prompt":"# Remove the line below if you want to inherit .editorconfig settings from higher directories\\r\\nroot = true\\r\\n\\r\\n# All files\\r\\n[*]\\r\\ncharset = utf-8\\r\\n# indent_size intentionally not specified in this section.\\r\\nindent_style = space # Use soft tabs (spaces) for indentation.\\r\\ninsert_final_newline = false\\r\\ntrim_trailing_whitespace = true\\r\\n\\r\\n# ReSharper properties\\r\\nresharper_wrap_array_initializer_style = chop_if_long\\r\\nresharper_wrap_object_and_collection_initializer_style = chop_if_long\\r\\n\\r\\n# JSON files\\r\\n[*.json]\\r\\nindent_size = 2\\r\\n\\r\\n# Markdown files\\r\\n[*.md]\\r\\nindent_size = 2\\r\\ntrim_trailing_whitespace = false\\r\\n\\r\\n# PowerShell scripts\\r\\n[*.ps1]\\r\\nindent_size = 4\\r\\n\\r\\n[*.{xml,xsd}]\\r\\nmax_line_length = off\\r\\nend_of_line = lf\\r\\nindent_size = 2\\r\\n\\r\\n# Visual Studio XML project files\\r\\n[*.{csproj,vcxproj,vcxproj.filters,proj,projitems,shproj}]\\r\\nindent_size = 2\\r\\nmax_line_length = off\\r\\nend_of_line = lf\\r\\n\\r\\n# Visual Studio and .NET related XML config files\\r\\n[*.{props,targets,ruleset,config,nuspec,resx,vsixmanifest,vsct}]\\r\\nindent_size = 2\\r\\nmax_line_length = off\\r\\nend_of_line = lf\\r\\n\\r\\n# YAML files\\r\\n[*.{yml,yaml}]\\r\\nindent_size = 2\\r\\n\\r\\n# C# files\\r\\n[*.{cs,cshtml}]\\r\\n"}'
Curl response:
{
"embedding": [
1.3733816146850586,
1.6329410076141357,
-1.99997079372406,
-0.805694580078125,
-0.4809744954109192,
----
]
}
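One detail in the log above is worth noting: the `input` field of the validation error is a JSON-encoded string, not an object. That pattern usually means the server received the body but did not parse it as JSON, which would be consistent with a missing or unexpected `Content-Type` header. This is an inference from the log, not confirmed behavior of Open WebUI. A small sketch of that check, with a hypothetical helper name and an abbreviated prompt:

```python
import json


def body_was_received_as_text(error_payload: str) -> bool:
    """Return True if a validation error's "input" is a string that itself
    parses as a JSON object, i.e. the body arrived as unparsed text."""
    errors = json.loads(error_payload).get("detail", [])
    for err in errors:
        raw = err.get("input")
        if isinstance(raw, str):
            try:
                if isinstance(json.loads(raw), dict):
                    return True
            except json.JSONDecodeError:
                continue
    return False


# Shortened version of the logged error; the prompt text is abbreviated.
sample = json.dumps({
    "detail": [{
        "type": "model_attributes_type",
        "loc": ["body"],
        "msg": "Input should be a valid dictionary or object to extract fields from",
        "input": json.dumps({"model": "nomic-embed-text:latest", "prompt": "root = true\r\n"}),
    }]
})

print(body_was_received_as_text(sample))  # True
```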
I had the same issue. I thought Content-Type: application/json is the default and set automatically, but it seems this isn't the case. So I added it, and now it works.
I can see that you're also missing the Content-Type header in the embeddingsProvider part of your config.json, but you included it in your curl request. Maybe that's the reason why your curl request works but your config doesn't.
That's it, thanks @simonoscr! I also assumed it was set automatically since the other configs didn't need it, but adding Content-Type fixed it.
Working config:
"embeddingsProvider": {
  "provider": "ollama",
  "model": "nomic-embed-text:latest",
  "apiBase": "http://XX.XX.XX.XX:8080/ollama/",
  "requestOptions": {
    "headers": {
      "Authorization": "Bearer eyJ...",
      "Content-Type": "application/json"
    }
  }
},
I added #1855 to hopefully fix this by default
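To illustrate the general idea of a default header (this is not the code from #1855, just a sketch with a hypothetical helper name): start from `Content-Type: application/json` and layer the user's `requestOptions.headers` on top, matching names case-insensitively so an explicit user value still wins.

```python
def with_default_headers(user_headers=None):
    """Merge user-supplied headers over a JSON Content-Type default."""
    merged = {"Content-Type": "application/json"}
    for name, value in (user_headers or {}).items():
        # Header names are case-insensitive: drop any existing entry
        # that differs from the user's name only by case.
        for existing in [k for k in merged if k.lower() == name.lower()]:
            del merged[existing]
        merged[name] = value
    return merged


headers = with_default_headers({"Authorization": "Bearer eyJ..."})
print(headers)
# {'Content-Type': 'application/json', 'Authorization': 'Bearer eyJ...'}
```

With a merge like this, a config that only sets `Authorization` would still send a JSON content type, while a config that sets its own `content-type` would keep its value.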
Thank you @spaasis and @simonoscr for seeing this through to completion! Appreciate the PR to fix the behavior for everyone.
Before submitting your bug report
Relevant environment info
Description
Hi! I'm testing Continue's integration with our Open WebUI instance: https://github.com/open-webui/open-webui
I got all the other pieces working, but the API calls to /embeddings fail. See the log entry.
However, if I do the API call manually, it passes and returns embeddings:
I tried to eye the source code to figure out what is different in the manual API call vs the one that Continue makes, but couldn't find a difference. Is there a debug log I could turn on to see the actual API calls?
Btw, the apiBase configuration for embeddingsProvider requires a trailing / but the other model configurations don't ;) Before I spotted this I got a bunch of "method not allowed" messages.
Let me know if I can help you debug further. Loving Continue so far!
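The trailing-slash behavior is what standard relative URL resolution produces: without the slash, the last path segment of the base is replaced rather than extended. The sketch below assumes the embeddings path is resolved against `apiBase` as a relative reference, which the thread does not confirm; the host is the one from the curl call and `embeddings_url` is a hypothetical helper.

```python
from urllib.parse import urljoin


def embeddings_url(api_base: str) -> str:
    """Resolve the relative embeddings path against the configured base."""
    return urljoin(api_base, "api/embeddings")


without_slash = embeddings_url("http://sykeai:8080/ollama")
with_slash = embeddings_url("http://sykeai:8080/ollama/")

print(without_slash)  # http://sykeai:8080/api/embeddings
print(with_slash)     # http://sykeai:8080/ollama/api/embeddings
```

Under that assumption, the request without the trailing slash would go to a different route than the intended `/ollama/api/embeddings`, which could explain the "method not allowed" responses.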
To reproduce
Open any project and start indexing
Log output