deepset-ai / haystack-core-integrations

Additional packages (components, document stores and the likes) to extend the capabilities of Haystack version 2.0 and onwards
https://haystack.deepset.ai
Apache License 2.0

Cannot connect to aws opensearch serverless #1159

Open adhikari23 opened 3 months ago

adhikari23 commented 3 months ago

Describe the bug: Cannot connect to AWS OpenSearch Serverless. Here is the code snippet.

from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore
from haystack_integrations.components.embedders.amazon_bedrock import AmazonBedrockTextEmbedder
from requests_aws4auth import AWS4Auth
from opensearchpy import OpenSearch, RequestsHttpConnection
from boto3 import Session
from haystack import Pipeline

service = 'aoss'
credentials = Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                   "us-east-1", service, session_token=credentials.token)

embedder = AmazonBedrockTextEmbedder(model="amazon.titan-embed-text-v1")
docstore = OpenSearchDocumentStore(
    hosts = <opensearch serverless endpoint>,
    index = "jps-test-index",
    http_auth= awsauth,
    timeout=300,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    engine = "faiss"
    # return_embedding=True

)
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", embedder)
query_pipeline.add_component("retriever", OpenSearchEmbeddingRetriever(document_store=docstore))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "How many languages are there?"

result = query_pipeline.run({"text_embedder": {"text": query}})

print(result['retriever']['documents'][0])

Error message

  File "/root/bosai/genai/eric-bosaiapps-genai-poc/haystack-demo/venv/lib/python3.10/site-packages/opensearchpy/transport.py", line 416, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/root/bosai/genai/eric-bosaiapps-genai-poc/haystack-demo/venv/lib/python3.10/site-packages/opensearchpy/connection/http_requests.py", line 241, in perform_request
    self._raise_error(
  File "/root/bosai/genai/eric-bosaiapps-genai-poc/haystack-demo/venv/lib/python3.10/site-packages/opensearchpy/connection/base.py", line 315, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.NotFoundError: NotFoundError(404, '')

Expected behavior: Retrieved documents

davidsbatista commented 3 months ago

Hi @adhikari23, I formatted the code in your post to make it easier to read.

First, it seems that you are missing an import:

from haystack_integrations.components.retrievers.opensearch import OpenSearchEmbeddingRetriever

Looking at your error message, it seems that there's a problem connecting to your OpenSearch server. Can you connect to it in isolation, i.e., outside of Haystack?
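
For reference, an isolation check could look roughly like the sketch below. It reuses the same SigV4 settings as the snippet above (service "aoss", region "us-east-1", RequestsHttpConnection) and asks opensearch-py directly whether the index exists. The helper names host_entry and index_exists are made up for illustration, and passing hosts as a host/port dict on port 443 is an assumption about how the serverless endpoint is exposed, not something confirmed in this thread.

```python
from urllib.parse import urlparse


def host_entry(endpoint: str) -> dict:
    """Turn an endpoint URL or bare hostname into an opensearch-py host dict."""
    # Accept both "https://host" and "host[:port]"; default to HTTPS on 443.
    parsed = urlparse(endpoint if "://" in endpoint else f"https://{endpoint}")
    return {"host": parsed.hostname, "port": parsed.port or 443}


def index_exists(endpoint: str, index: str, region: str = "us-east-1") -> bool:
    """Check that `index` is reachable with SigV4 auth, without Haystack."""
    # Third-party imports are kept local so host_entry() works without them.
    from boto3 import Session
    from opensearchpy import OpenSearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    credentials = Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       region, "aoss", session_token=credentials.token)
    client = OpenSearch(
        hosts=[host_entry(endpoint)],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
        timeout=300,
    )
    # Returns True if the index exists, False on a 404 for that index.
    return client.indices.exists(index=index)
```

Calling index_exists with your endpoint and "jps-test-index" would show whether the index itself is visible with these credentials, independent of the Haystack pipeline.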

adhikari23 commented 3 months ago

Yes, I am able to connect to the OpenSearch server outside Haystack.