AWS Bedrock Knowledge Bases support custom user metadata per source document. This change adds that metadata to AmazonKnowledgeBasesRetriever Documents metadata as source_metadata.
I'm open to suggestions on the source_metadata key name, but I was avoiding metadata["metadata"]
There are a few different ways to handle this. In this PR, source_metadata is set to None if the Bedrock KB result does not contain a metadata key. I prefer this approach as the presence of the source_metadata key is guaranteed for the caller - i.e., the caller does not have to check for it's presence.
If it's preferred to not set source_metadata if it's empty, then here are a couple of options.
for result in results:
metadata={
"location": result["location"],
"score": result["score"] if "score" in result else 0,
}
if "metadata" in result:
metadata["source_metadata"] = result["metadata"]
documents.append(
Document(
page_content=result["content"]["text"],
metadata=metadata,
)
)
Requires Python 3.9+
for result in results:
documents.append(
Document(
page_content=result["content"]["text"],
metadata={
"location": result["location"],
"score": result["score"] if "score" in result else 0} |
({'source_metadata': result["metadata"] } if "metadata" in result else {}),
)
)
I updated the integration tests to reflect these changes and added a response that includes KB metadata.
I changed the integration test get_relevant_documents to invoke based on
/home/knewell/.cache/pypoetry/virtualenvs/langchain-aws-eO7Ij7mw-py3.9/lib/python3.9/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The method BaseRetriever.get_relevant_documents was deprecated in langchain-core 0.1.46 and will be removed in 0.3.0. Use invoke instead.
warn_deprecated(
AWS Bedrock Knowledge Bases support custom user metadata per source document. This change adds that metadata to AmazonKnowledgeBasesRetriever Documents metadata as
source_metadata
.This addresses the feature request in https://github.com/langchain-ai/langchain/discussions/20649
I'm open to suggestions on the
source_metadata
key name, but I was avoidingmetadata["metadata"]
There are a few different ways to handle this. In this PR,
source_metadata
is set toNone
if the Bedrock KB result does not contain ametadata
key. I prefer this approach as the presence of thesource_metadata
key is guaranteed for the caller - i.e., the caller does not have to check for it's presence.If it's preferred to not set
source_metadata
if it's empty, then here are a couple of options.Requires Python 3.9+
I updated the integration tests to reflect these changes and added a response that includes KB metadata.
I changed the integration test
get_relevant_documents
toinvoke
based onTwo simple typos in README.md were fixed.