Closed morenosoldev closed 5 months ago
🤖
Hey @andreasjoensson! Great to see you back here. Hope you're doing awesome!
Yes, it is possible to pass source documents when using an agent with a createRetrieverTool. You can use the getRelevantDocuments method from the BaseRetrieverInterface to retrieve the relevant documents based on the provided query. Here is the relevant code:
const func = async (
  { query }: { query: string },
  runManager?: CallbackManagerForToolRun
) => {
  const docs = await retriever.getRelevantDocuments(
    query,
    runManager?.getChild("retriever")
  );
  return formatDocumentsAsString(docs);
};
In this code, getRelevantDocuments is called with the query and a child of the runManager (if it exists). The documents returned by this method are then formatted as a string using formatDocumentsAsString(docs).
You can find more details about this in the langchain/src/tools/retriever.ts file.
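Because formatDocumentsAsString only returns text, any metadata on the documents does not reach the tool output. One possible workaround is a custom tool function that serializes the metadata along with the content. The sketch below is not part of langchainjs: `DocLike` is a local stand-in for the document shape, and `formatDocumentsWithMetadata` is a hypothetical helper name.

```typescript
// Local stand-in for a LangChain document (pageContent + metadata).
// This is an assumption for illustration, not an import from the library.
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: serialize each document together with its metadata,
// so the string returned by the tool still carries source information.
function formatDocumentsWithMetadata(docs: DocLike[]): string {
  return JSON.stringify(
    docs.map((doc) => ({
      pageContent: doc.pageContent,
      metadata: doc.metadata,
    }))
  );
}

// Example with made-up data.
const exampleDocs: DocLike[] = [
  { pageContent: "Refunds take 5 days.", metadata: { source: "faq.md" } },
];
const toolOutput = formatDocumentsWithMetadata(exampleDocs);
```

A function like this could replace the formatDocumentsAsString call in a custom tool. The trade-off is that the model then sees JSON instead of plain text.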
Also, there was a similar issue discussed in the langchainjs repository: How to retrieve sourceDocuments with LCEL and RunnableSequence chains, is this the correct approach?. The solution suggested there was to implement custom logic in the _getRelevantDocuments method to ensure that the documents returned are indeed relevant to the query.
I hope this helps! If you have any other questions, feel free to ask.
@dosu-bot
Okay, I have done that, but I am still not seeing the source documents in the output. What would you modify so that the source documents show up here?
`import { Response } from 'express'
import { AIMessage, HumanMessage } from '@langchain/core/messages'
async function processChatbotInput(
  executor: any,
  input: string,
  res: Response,
  previousMessages: (HumanMessage | AIMessage)[]
): PromiseStarting agent: ${event.name} with input: ${JSON.stringify( event.data.input )}
)
}
} else if (eventType === 'on_chain_end') {
  if (event.name === 'Agent') {
    console.log('\n-----')
    console.log(`Finished agent: ${event.name}\n`)
    console.log(`Agent output was: ${event.data.output}`)
    console.log('event.data', event.data)
    console.log('\n-----')
    return event.data.output
  }
} else if (eventType === 'on_llm_stream') {
  const content = event.data?.chunk?.message?.content
  if (content !== undefined && content !== '') {
    console.log(`| ${content}`)
    res.write(content)
  }
} else if (eventType === 'on_tool_start') {
  console.log('\n-----')
  console.log(
    `Starting tool: ${event.name} with inputs: ${event.data.input}`
  )
} else if (eventType === 'on_tool_end') {
  console.log('\n-----')
  console.log(`Finished tool: ${event.name}\n`)
  console.log(`Tool output was: ${event.data.output}`)
  console.log('\n-----')
}
}
return ''
} catch (error) {
  console.error('Error occurred:', error)
  throw error
}
}
export default processChatbotInput `
🤖
Based on the context provided, it seems like you want to display the source documents in the output of your chatbot application. To achieve this, you need to modify the _getRelevantDocuments method in your subclass of BaseRetriever to include the source documents in the returned DocumentInterface objects.
Here's an example of how you might modify the _getRelevantDocuments method:
class MyRetriever extends BaseRetriever {
  // ...
  async _getRelevantDocuments(
    query: string,
    _callbacks?: CallbackManagerForRetrieverRun
  ): Promise<DocumentInterface[]> {
    // Your code to retrieve documents goes here. This is just an example.
    const documents = await myDocumentRetrievalFunction(query);
    // Modify the DocumentInterface objects to include the source document.
    const documentsWithSource = documents.map(doc => ({
      ...doc,
      sourceDocument: myFunctionToGetSourceDocument(doc),
    }));
    return documentsWithSource;
  }
  // ...
}
In this example, myDocumentRetrievalFunction is a placeholder for your function that retrieves the documents, and myFunctionToGetSourceDocument is a placeholder for your function that retrieves the source document for a given DocumentInterface object. Please replace these placeholders with your actual functions.
The DocumentInterface object in the langchainjs codebase has two properties:
pageContent: This is a string that represents the content of the page.
metadata: This is a generic type that extends a Record of string keys and any value. It represents additional data related to the document.
Here is the structure:
export interface DocumentInterface<
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  Metadata extends Record<string, any> = Record<string, any>
> {
  pageContent: string;
  metadata: Metadata;
}
This structure is implemented in the Document class, where pageContent is initialized with the pageContent from the fields parameter if it exists, otherwise it uses the existing pageContent. The metadata is initialized with the metadata from the fields parameter if it exists, otherwise it uses an empty object.
You can find more information about this in the langchain-core/src/retrievers.ts and langchain-core/src/documents/document.ts files in the langchainjs repository.
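Since the interface above only defines pageContent and metadata, source information probably fits better inside metadata than as a new top-level property. Here is a minimal sketch of that idea, assuming a local `DocLike` type that mirrors the interface and a hypothetical `withSource` helper (neither comes from langchainjs):

```typescript
// Local mirror of the two-field document shape described above (an assumption,
// declared here so the sketch is self-contained).
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: return a copy of the document with `source`
// merged into its metadata, leaving the original untouched.
function withSource(doc: DocLike, source: string): DocLike {
  return {
    pageContent: doc.pageContent,
    metadata: { ...doc.metadata, source },
  };
}

// Example with made-up data.
const originalDoc: DocLike = {
  pageContent: "Shipping takes 3 days.",
  metadata: { page: 2 },
};
const taggedDoc = withSource(originalDoc, "shipping-policy.pdf");
```

Keeping the extra field under metadata means the object still matches the documented shape, so code that only expects pageContent and metadata continues to work.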
I hope this helps! If you have any other questions, feel free to ask.
Hey, I can't seem to return the docs' metadata. Is this not possible in the new versions, or am I missing something? Any help would be appreciated, thanks.
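One option that may be worth trying is to collect the documents from the event stream instead of the agent's final output. This is a sketch under assumptions: it presumes the stream also emits `on_retriever_end` events, and that `data.output` holds the documents either as an array or under a `documents` key. Both shapes should be checked against the installed version. `StreamEventLike`, `DocLike`, `extractDocs`, and `collectSourceDocuments` are illustrative names, not library APIs.

```typescript
// Local stand-ins, declared here so the sketch is self-contained.
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Assumed event shape, modeled on the fields used by the handler in this thread.
interface StreamEventLike {
  event: string;
  name: string;
  data: { output?: unknown };
}

function isDocLike(value: unknown): value is DocLike {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as DocLike).pageContent === "string"
  );
}

// Accept either a bare array of documents or an object with a `documents`
// array. Both payload shapes are assumptions to verify.
function extractDocs(output: unknown): DocLike[] {
  const candidate = Array.isArray(output)
    ? output
    : (output as { documents?: unknown } | null | undefined)?.documents;
  return Array.isArray(candidate) ? candidate.filter(isDocLike) : [];
}

// Walk the stream and gather every document reported by a retriever run.
async function collectSourceDocuments(
  events: AsyncIterable<StreamEventLike>
): Promise<DocLike[]> {
  const sources: DocLike[] = [];
  for await (const event of events) {
    if (event.event === "on_retriever_end") {
      sources.push(...extractDocs(event.data.output));
    }
  }
  return sources;
}
```

In the handler posted above, the same check could be added as another `else if` branch on the event type, with the collected documents sent back alongside the streamed answer.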
Hi, @andreasjoensson,
I'm helping the langchainjs team manage their backlog and am marking this issue as stale. From what I understand, you were inquiring about adding source documents to an agent when using a createRetrieverTool. There was a response providing guidance on modifying the getRelevantDocuments method to include source documents in the returned DocumentInterface objects, and you followed up with a code snippet and a question about returning the docs metadata.
Could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, kindly let the langchainjs team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and cooperation.
Hey, I was wondering if there is a way to pass source documents when using an agent with a createRetrieverTool like this?
`import { ChatOpenAI } from '@langchain/openai'
import { Chroma } from '@langchain/community/vectorstores/chroma'
import { AgentExecutor } from 'langchain/agents'
import { createRetrieverTool } from 'langchain/tools/retriever'
import orderCancellationTool from './tools_functions/OrderCancellation'
import refundTool from './tools_functions/RefundOrder'
import shippingTool from './tools_functions/Shipping'
import missingItemsTool from './tools_functions/MissingItems'
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts'
import { convertToOpenAIFunction } from '@langchain/core/utils/function_calling'
import { RunnableSequence } from '@langchain/core/runnables'
import { formatToOpenAIFunctionMessages } from 'langchain/agents/format_scratchpad'
import { OpenAIFunctionsAgentOutputParser } from 'langchain/agents/openai/output_parser'
async function createChatbotExecutor(
  company: string,
  model: ChatOpenAI,
  chatbotId: string,
  conversationId: string,
  prompt: string,
  vectorStore: Chroma
) {
  try {
    const retriever = vectorStore.asRetriever()
  } catch (error) {
    console.error('Error occurred:', error)
    throw error
  }
}
export default createChatbotExecutor `