Description: Bedrock retrieve API returns extra metadata in the response which is currently not returned in the retriever response
Issue: The change adds the metadata from bedrock retrieve API response to the bedrock retriever in a backward compatible way. Renamed metadata to sourceMetadata as metadata term is being used in the Document already. This is in sync with what we are doing in llama-index as well.
Dependencies: No
[X] Add tests and docs:
Added unit tests
Notebook already exists and does not need any change
Updated and ran integration tests
Response from end to end testing, just to ensure backward compatibility: [Document(page_content='Exoplanets.', metadata={'location': {'s3Location': {'uri': 's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647, 'source_metadata': {'x-amz-bedrock-kb-source-uri': 's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year': 1946.0}})]
[X] Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/
Corresponding PR in langchain: https://github.com/langchain-ai/langchain/pull/21349
[X] PR title: "community: Add source metadata to bedrock retriever response"
[X] PR message:
[X] Add tests and docs:
[Document(page_content='Exoplanets.', metadata={'location': {'s3Location': {'uri': 's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647, 'source_metadata': {'x-amz-bedrock-kb-source-uri': 's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year': 1946.0}})]
[X] Lint and test: Run
make format
,make lint
andmake test
from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/