xuzeyu91 opened this issue 8 months ago
I'm not familiar with nomic, but if it's based on the BERT architecture it's not supported in LLamaSharp yet. BERT support was only added to llama.cpp a couple of weeks ago (https://github.com/ggerganov/llama.cpp/pull/5423), and we haven't updated our binaries yet.
However, I feel that the returned float array is not correct: when I use the same text for vector matching, the similarity is only 0.42.
Do you mean you literally fed the same text in twice and the result wasn't identical? If so, that's definitely a bug!
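A quick way to check this (a minimal sketch, not LLamaSharp-specific — it just assumes you can dump the raw float arrays from the embedder) is to embed the same string twice and compare the vectors with cosine similarity. A deterministic embedding model should return the same vector both times, giving a similarity of ~1.0; 0.42 for identical inputs would indicate a bug.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity = dot(a, b) / (|a| * |b|); 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in vectors: in practice these would be the two float arrays
# returned by embedding the same text twice.
v1 = [0.1, -0.2, 0.3, 0.05]
v2 = [0.1, -0.2, 0.3, 0.05]

print(cosine_similarity(v1, v2))  # ~1.0 for identical embeddings
```

If this prints something well below 1.0 for the same input text, the embeddings themselves are wrong, not the matching code.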
I have the same issue using the phi-2 and llama models through the Semantic Kernel integration. The values returned from the 'memory' seem to be completely independent of the search value, and I see the same issue even when I put in an exact match for the search.
I experienced the same issue with the poor similarity matching with Semantic Kernel. Once LlamaSharp updates the binaries to support the Bert models, this issue should go away.
Why will updating for the Bert models help? @AshD could you expand on why the issue should go away?
What kind of model should be used for embedding? When I use nomic-embed-text-v1.5.f32.gguf, it throws a protected-memory access error, while tinyllama-1.1b-chat.gguf runs normally. However, I feel that the returned float array is not correct: when I use the same text for vector matching, the similarity is only 0.42.