tomersagi opened 3 days ago
OK, I understand the problem now. The embedding model I am using returns a k x F tensor, where k is the number of tokens in the query phrase and F is the number of features. The Chroma Hugging Face embedding function expects a 1 x F tensor only. To solve it I had to subclass the embedding function and add a mean pooling step.
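For reference, a minimal sketch of that workaround (the class name and the ndim check are mine; it assumes the parent __call__ hands back k x F token embeddings per document for models that are not sentence-transformers):

```python
import numpy as np
from chromadb.api.types import Documents, Embeddings
from chromadb.utils.embedding_functions import HuggingFaceEmbeddingFunction


class MeanPooledHuggingFaceEF(HuggingFaceEmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        raw = super().__call__(input)
        pooled = []
        for emb in raw:
            arr = np.asarray(emb)
            # Token-level output (k x F): average over the token axis
            # to collapse it to a single 1 x F vector per document.
            if arr.ndim == 2:
                arr = arr.mean(axis=0)
            pooled.append(arr.tolist())
        return pooled
```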
Perhaps the documentation and error message can be improved here to describe the types of models this embedding function supports.
@tomersagi, you are right that the naming is a bit misleading. Under the hood we use sentence-transformers. Technically it also works with plain transformer models, in which case it defaults to mean pooling without normalization.
We can do better by letting the user know that the model they are loading is not a sentence-transformers model and may therefore produce unsupported output.
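A hypothetical sketch of such a check (not Chroma's actual code): after calling the model, verify each document yields a single vector and fail with a descriptive message otherwise.

```python
import numpy as np

def validate_embeddings(embeddings):
    for i, emb in enumerate(embeddings):
        arr = np.asarray(emb)
        if arr.ndim != 1:
            raise ValueError(
                f"Document {i} produced a tensor of shape {arr.shape} instead of a "
                "single vector. The model does not appear to be a sentence-transformers "
                "model; pool its token embeddings (e.g. mean pooling) first."
            )
```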
What happened?
Hi, I am trying to use a custom embedding model via the Hugging Face API. I am following the instructions from here.
However, when I try to use the embedding function, I get the following error:
Minimal example:
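(The original snippet was not captured here; a representative setup following the linked instructions might look like the following, with a placeholder API key and an illustrative non-sentence-transformers model name.)

```python
import chromadb
from chromadb.utils.embedding_functions import HuggingFaceEmbeddingFunction

ef = HuggingFaceEmbeddingFunction(
    api_key="hf_...",                # placeholder token
    model_name="bert-base-uncased",  # a plain transformer, not a sentence-transformers model
)

client = chromadb.Client()
collection = client.create_collection(name="demo", embedding_function=ef)
# Fails if the model returns k x F token embeddings instead of one vector per document.
collection.add(ids=["1"], documents=["hello world"])
```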
Versions
Chroma 0.5.3, Python 3.11
Relevant log output