pradeepdev-1995 opened this issue 5 days ago
Great feedback @pradeepdev-1995 -- @justin-cechmanek will be able to chime in to answer your questions on how this works today. However, we are rapidly working on an update to the semantic cache class and feedback like this is very helpful.
As a quick starting point, check out the newly included `CustomTextVectorizer` class as a way to manage this better.
Hi @pradeepdev-1995 I'll get straight to your questions:
1 - You can pass the vector of your choice directly to `llmcache.store()` as a parameter: `llmcache.store(prompt='<text>', vector=[1.0,2.0,3.0], response='<text>')`. This ensures your specified vector is used for vector comparisons. The text prompt is still stored.
2 - We don't impose any vector size limits. Vectors should be the same size for a given search index.
3 - Similar to `llmcache.store()`, you can pass your own vector directly to `llmcache.check()`. If you do so, you don't need to pass a text prompt; `llmcache.check(vector=[1.0,2.0,3.0])` works.
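To make points 1 and 3 concrete, here is a toy in-memory mock of the store/check contract (illustration only; the real `SemanticCache` persists entries to Redis and searches them there):

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

class MockSemanticCache:
    """Toy stand-in for SemanticCache illustrating the vector contract."""
    def __init__(self, distance_threshold=0.1):
        self.entries = []  # each entry keeps the prompt text AND the vector
        self.distance_threshold = distance_threshold

    def store(self, prompt, response, vector):
        # the supplied vector is what gets compared; the prompt text is still stored
        self.entries.append({"prompt": prompt, "response": response, "vector": vector})

    def check(self, vector):
        # no text prompt is needed when a vector is passed directly
        hits = [e for e in self.entries
                if cosine_distance(vector, e["vector"]) <= self.distance_threshold]
        return [{"prompt": h["prompt"], "response": h["response"]} for h in hits]

cache = MockSemanticCache()
cache.store(prompt="what is redis?", response="an in-memory data store",
            vector=[1.0, 0.0, 0.0])
print(cache.check(vector=[0.99, 0.05, 0.0]))  # near-identical vector -> cache hit
print(cache.check(vector=[0.0, 1.0, 0.0]))    # orthogonal vector -> miss
```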
As @tylerhutcherson mentioned, if you want to use your own vectorizer instead of one of the several we support, you can use the `CustomTextVectorizer` class to wrap your embedding function and then pass this vectorizer to the `SemanticCache` constructor; it will handle embedding of prompts in `store()` and `check()`.
Is configuring the distance metric to something other than cosine similarity a feature you would like to have implemented?
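For context on what a different metric would change: cosine similarity compares only the direction of two vectors and ignores magnitude, while L2 (Euclidean) distance does not. A small self-contained sketch in plain Python:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 for same direction, up to 2.0 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def l2_distance(a, b):
    """Euclidean distance: sensitive to magnitude as well as direction."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude
print(cosine_distance(a, b))  # 0.0 -- direction identical, magnitude ignored
print(l2_distance(a, b))      # 5.0 -- magnitude difference counts
```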
@justin-cechmanek Thanks for the detailed answer.
One clarification regarding the `CustomTextVectorizer` class.
I am already using Hugging Face embedding models to create embeddings outside of the Redis semantic cache, via the corresponding native library and models. So I can directly store those generated embeddings (`llmcache.store()`) and search with them (`llmcache.check()`) as you mentioned above, right? Then why should I use the `CustomTextVectorizer` class here?
`CustomTextVectorizer` is required because the cache builds an index schema around the vectorizer. Without it, it's possible to pass your own vector to `check()` and `store()`, but it must be the same dimension as the embeddings generated by the default vectorizer, which is currently 768 dimensions.
```python
# without CustomTextVectorizer
cache = SemanticCache()  # default vectorizer is created internally

prompt_1 = 'your prompt 1'
vector_1 = my_embed_function(prompt_1)
response = llm_call(prompt_1)
cache.store(prompt=prompt_1, response=response, vector=vector_1)  # fails if vector_1 has wrong dimensions

prompt_2 = 'your prompt 2'
vector_2 = my_embed_function(prompt_2)
res = cache.check(vector=vector_2)  # fails if vector_2 has wrong dimensions
```
The recommended approach is:
```python
# with CustomTextVectorizer
custom_vectorizer = CustomTextVectorizer(embed=my_embed_function)
cache = SemanticCache(vectorizer=custom_vectorizer)

prompt_1 = 'your prompt 1'
response = llm_call(prompt_1)
cache.store(prompt=prompt_1, response=response)

prompt_2 = 'your prompt 2'
res = cache.check(prompt_2)
```
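To illustrate why the wrapper matters, here is a hypothetical pure-Python sketch (the `Toy*` classes and `my_embed_function` are stand-ins, not redisvl internals) of how a cache can size its index schema from the vectorizer it is given:

```python
class ToyCustomTextVectorizer:
    """Stand-in for CustomTextVectorizer: wraps a user-supplied embed function."""
    def __init__(self, embed):
        self._embed = embed
        # probe once so the embedding dimension is known up front
        self.dims = len(embed("dimension probe"))

    def embed(self, text):
        return self._embed(text)


class ToySemanticCache:
    """Stand-in showing how a cache can derive its index schema from the vectorizer."""
    def __init__(self, vectorizer):
        self.vectorizer = vectorizer
        self.index_dims = vectorizer.dims  # index schema is sized to the vectorizer
        self._entries = []

    def store(self, prompt, response):
        vector = self.vectorizer.embed(prompt)  # embedding handled internally
        assert len(vector) == self.index_dims, "vector dimension mismatch"
        self._entries.append((prompt, response, vector))


def my_embed_function(text):
    # toy 4-dimensional embedding; a real one would call your Hugging Face model
    return [float(len(text)), 1.0, 0.0, 0.0]


cache = ToySemanticCache(ToyCustomTextVectorizer(embed=my_embed_function))
print(cache.index_dims)  # 4 -- not the default 768
cache.store("your prompt 1", "some response")
```

Because the dimension comes from your own embed function, every `store()` and `check()` call is guaranteed to match the index, which is what the real `CustomTextVectorizer` + `SemanticCache` pairing buys you.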
I am following the official documentation scripts for the semantic cache. In the following code I have these questions:

1 - In `llmcache.store()`, I hope I can store a custom vector for the prompt directly, rather than the prompt text. That custom vector can be generated by any embedding model: Sentence Transformers, MiniLM-L12-v2, OpenAI embeddings, Hugging Face embeddings, etc.
2 - Is there any embedding-length limitation when storing with `llmcache.store()`? Can I use a vector of any length?
3 - In `llmcache.check()`, I hope we can pass the vector (from Sentence Transformers, MiniLM-L12-v2, OpenAI embeddings, Hugging Face embeddings, etc.) directly for the semantic matching, rather than the query text.
4 - Inside `llmcache.check()`, which distance measure is used to find semantic similarity? Is it cosine similarity or something else? Do we have the option to configure it?