Closed axiomofjoy closed 10 months ago
If I'm not mistaken, there's already an open PR for embedding callbacks: #7920
Hi, @axiomofjoy! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you requested the addition of callback support for embeddings in the LangChain library. You proposed a specific approach for implementation and mentioned that you are willing to contribute to the process. Another user, @ppramesi, mentioned that there is already an open pull request (#7920) for embedding callback. It seems like you reacted positively to this comment.
Before we close this issue, we wanted to check if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!
Feature request
Add embedding support to the callback system. Here is one approach I have in mind.
- Add an on_embedding_start method on CallbackManagerMixin in libs/langchain/langchain/callbacks/base.py.
- Add an EmbeddingManagerMixin with on_embedding_end and on_embedding_error methods in libs/langchain/langchain/callbacks/base.py.
- Add the embedding callback hook to the Embeddings abstract base class in libs/langchain/langchain/embeddings/base.py.
- Update the concrete embeddings implementations in libs/langchain/langchain/embeddings as necessary.
One minimally invasive approach would be:
- Implement concrete embed_documents, embed_query, aembed_documents, and aembed_query methods on the abstract Embeddings base class that contain the embeddings callback hook. Add abstract _embed_documents and _embed_query methods and unimplemented _aembed_documents and _aembed_query methods to the base class.
- Rename the existing concrete implementations of embed_documents, embed_query, aembed_documents, and aembed_query to _embed_documents, _embed_query, _aembed_documents, and _aembed_query.
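A rough sketch of how the minimally invasive approach could look, covering the synchronous methods only. Everything here is illustrative rather than LangChain's actual code: the Embeddings class is a self-contained stand-in, and EmbeddingCallbackHandler, _run_with_hooks, CharCountEmbeddings, RecordingHandler, and the callbacks constructor argument are hypothetical names. Only the hook names and the underscore-prefixed method names come from the proposal.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Optional, Sequence


class EmbeddingCallbackHandler:
    """Hypothetical handler; only the three hook names come from the proposal."""

    def on_embedding_start(self, texts: List[str]) -> None:
        pass

    def on_embedding_end(self, vectors: List[List[float]]) -> None:
        pass

    def on_embedding_error(self, error: BaseException) -> None:
        pass


class Embeddings(ABC):
    """Stand-in for the abstract base class (not LangChain's actual code)."""

    def __init__(
        self, callbacks: Optional[Sequence[EmbeddingCallbackHandler]] = None
    ) -> None:
        self.callbacks = list(callbacks or [])

    def _run_with_hooks(
        self, texts: List[str], embed: Callable[[], List[List[float]]]
    ) -> List[List[float]]:
        # Fire the start hook, then the end hook on success
        # or the error hook on failure (the exception is re-raised).
        for handler in self.callbacks:
            handler.on_embedding_start(texts)
        try:
            vectors = embed()
        except Exception as error:
            for handler in self.callbacks:
                handler.on_embedding_error(error)
            raise
        for handler in self.callbacks:
            handler.on_embedding_end(vectors)
        return vectors

    # Concrete public methods that contain the callback hook.
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return self._run_with_hooks(texts, lambda: self._embed_documents(texts))

    def embed_query(self, text: str) -> List[float]:
        return self._run_with_hooks([text], lambda: [self._embed_query(text)])[0]

    # Provider-specific logic moves behind the underscore-prefixed methods.
    @abstractmethod
    def _embed_documents(self, texts: List[str]) -> List[List[float]]:
        ...

    @abstractmethod
    def _embed_query(self, text: str) -> List[float]:
        ...


class CharCountEmbeddings(Embeddings):
    """Toy provider: its former public methods, renamed with an underscore."""

    def _embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed_query(text) for text in texts]

    def _embed_query(self, text: str) -> List[float]:
        return [float(len(text))]


class RecordingHandler(EmbeddingCallbackHandler):
    """Example handler that records every hook invocation in order."""

    def __init__(self) -> None:
        self.events: list = []

    def on_embedding_start(self, texts: List[str]) -> None:
        self.events.append(("start", texts))

    def on_embedding_end(self, vectors: List[List[float]]) -> None:
        self.events.append(("end", vectors))

    def on_embedding_error(self, error: BaseException) -> None:
        self.events.append(("error", error))
```

With this shape, callers keep using embed_documents and embed_query unchanged, and each provider only has to rename its methods. The async variants would presumably follow the same pattern with awaitable hooks.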
Motivation
Embeddings are useful for LLM application monitoring and debugging. I want to build a callback handler that enables LangChain users to visualize their data in Phoenix, an open-source tool that provides debugging workflows for retrieval-augmented generation. At the moment, it is not possible to get the query embeddings out of LangChain's callback system, for example, when using the RetrievalQA chain. Here is an example notebook where I subclass OpenAIEmbeddings to get the embedding data out.
I would like the LangChain callback system to support this use case.
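For illustration, the subclassing workaround might have roughly the following shape. This is a guess at the general idea, not the notebook's code: BaseQueryEmbeddings is a hypothetical stand-in for a provider class such as OpenAIEmbeddings, and RecordingEmbeddings, query_texts, and query_embeddings are made-up names. The real provider class may require its extra attributes to be declared differently.

```python
from typing import List


class BaseQueryEmbeddings:
    """Hypothetical stand-in for a provider class such as OpenAIEmbeddings."""

    def embed_query(self, text: str) -> List[float]:
        # Toy embedding: character count and number of spaces.
        return [float(len(text)), float(text.count(" "))]


class RecordingEmbeddings(BaseQueryEmbeddings):
    """Subclass that keeps a copy of every query and its embedding."""

    def __init__(self) -> None:
        super().__init__()
        self.query_texts: List[str] = []
        self.query_embeddings: List[List[float]] = []

    def embed_query(self, text: str) -> List[float]:
        # Delegate to the provider, then stash the result for later inspection.
        vector = super().embed_query(text)
        self.query_texts.append(text)
        self.query_embeddings.append(vector)
        return vector
```

The drawback is that every provider needs its own subclass, and the captured data lives outside the callback system, which is what a dedicated embeddings hook would avoid.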
This feature has been requested for TypeScript and has an open PR. An additional motivation is to maintain parity with the TypeScript library.
Your contribution
I am willing to implement, test, and document this feature with guidance from the LangChain team. I am also happy to provide feedback on an implementation by the LangChain team by building an example callback handler using the embeddings hook.