opensearch-project / opensearch-py

Python Client for OpenSearch
https://opensearch.org/docs/latest/clients/python/
Apache License 2.0

[BUG] http gzip compression cannot be used with the snapshot endpoints #507

Closed ofirt-orca closed 1 year ago

ofirt-orca commented 1 year ago

What is the bug?

http gzip compression cannot be used with the snapshot endpoints

How can one reproduce the bug?

Set http_compress to True and try to restore an encrypted snapshot from S3. The request fails with the error: Your request: '/_snapshot/cs-automated-enc/<snapshot_id>/_restore' is not allowed due to invalid input parameters.

What is the expected behavior?

The request should succeed and the snapshot should be restored.

What is your host/environment?

ubuntu 20.04

Do you have any screenshots?

No

Do you have any additional context?

The restore works only when gzip compression is turned off. Other requests to the cluster seem to work fine (compressed or not). For reference: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/gzip.html
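For context, here is a rough sketch of what a gzip-compressed request looks like on the wire. It assumes the client serializes the body to JSON, gzip-compresses it, and labels it with a Content-Encoding: gzip header. The function name build_compressed_request is hypothetical and is not part of opensearch-py; it uses only the Python standard library.

```python
import gzip
import json


def build_compressed_request(body):
    """Return (headers, payload) for a gzip-compressed JSON body.

    Hypothetical helper for illustration only; not an opensearch-py API.
    """
    raw = json.dumps(body).encode("utf-8")
    headers = {
        "content-type": "application/json",
        # Tells the server that the request body is gzip-compressed.
        "content-encoding": "gzip",
        # Tells the server that a compressed response is acceptable.
        "accept-encoding": "gzip,deflate",
    }
    return headers, gzip.compress(raw)


headers, payload = build_compressed_request({"indices": "index_123"})

# The receiving side has to decompress the payload to recover the JSON body.
restored = json.loads(gzip.decompress(payload))
```

If the receiving endpoint does not decompress the payload before validating it, the body would presumably look like invalid input, which would be consistent with the error above. That explanation is an assumption; the thread does not state the root cause.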

saimedhi commented 1 year ago

Hello @ofirt-orca, I tried to reproduce the issue, but I'm not observing any errors. Please let me know if I've overlooked any details. Thanks :)

[Screenshots: 2023-09-27 at 3:48:20 PM, 3:50:28 PM, 3:52:15 PM]

saimedhi commented 1 year ago

I'm closing this issue. If you believe it's still unresolved, please feel free to reopen it.

ofirt-orca commented 1 year ago

Hi @saimedhi, it is failing only when working with AWS OpenSearch service. Added code snippet. Can you reopen the issue?

    import boto3
    from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

    # Sign requests with the credentials of the current AWS session.
    creds = boto3.Session().get_credentials()
    awsauth = AWSV4SignerAuth(creds, "us-east-1", "es")

    host = "xxxx.us-east-1.es.amazonaws.com"
    repo_name = "cs-automated-enc"
    snapshot_id = "snapshot-id"
    body = {
        "indices": "index_123",
    }

    # Named `client` rather than `os` to avoid shadowing the standard library module.
    client = OpenSearch(
        hosts=[{"host": host, "port": 80}],
        connection_class=RequestsHttpConnection,
        use_ssl=False,
        verify_certs=False,
        timeout=30,
        retry_on_timeout=True,
        http_compress=True,  # the restore fails only when this is enabled
        http_auth=awsauth,
    )

    client.snapshot.restore(repository=repo_name, snapshot=snapshot_id, body=body)

saimedhi commented 1 year ago

@ofirt-orca, could you please provide more details about the error you're encountering? Could you also share the complete code, including the snapshot creation part?

ofirt-orca commented 1 year ago

Sure, I'll try to provide code that includes snapshot creation soon; I'll need to create another OpenSearch cluster for that. I use the Python client to work with an AWS OpenSearch cluster, and everything works well with or without HTTP compression except for this endpoint. Snapshot creation might fail with the same error too. This code works perfectly when http_compress is not set or is False.

ofirt-orca commented 1 year ago

I've chosen another approach for reproduction, since creating a snapshot requires setting up a manual repository, an S3 bucket, roles, and more. Instead, the issue can be reproduced by restoring an existing automated snapshot, which should exist by default in either the cs-automated or the cs-automated-enc repository. Try this code and let me know if it works.

    import boto3
    from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

    host = "xxxx.us-east-1.es.amazonaws.com"
    # Sign requests with the credentials of the current AWS session.
    creds = boto3.Session().get_credentials()
    awsauth = AWSV4SignerAuth(creds, "us-east-1", "es")

    # Named `client` rather than `os` to avoid shadowing the standard library module.
    client = OpenSearch(
        hosts=[{"host": host, "port": 80}],
        connection_class=RequestsHttpConnection,
        use_ssl=False,
        verify_certs=False,
        timeout=30,
        retry_on_timeout=True,
        http_compress=True,  # the restore fails only when this is enabled
        http_auth=awsauth,
    )

    # Pick the first automated snapshot and the first index it contains.
    repo_name = "cs-automated-enc"
    snapshots = client.snapshot.get(repository=repo_name, snapshot=["*"])
    snapshot = snapshots["snapshots"][0]
    snapshot_id = snapshot["snapshot"]
    index = snapshot["indices"][0]

    body = {
        "indices": index,
        "include_global_state": False,
    }

    print(f"Restoring index {index} from snapshot {snapshot_id}")
    client.snapshot.restore(repository=repo_name, snapshot=snapshot_id, body=body)

ofirt-orca commented 1 year ago

Hi @saimedhi, were you able to reproduce the issue using the code snippet?

ofirt-orca commented 1 year ago

@saimedhi is there any plan to address this?

dblock commented 1 year ago

@ofirt-orca I don't think anyone is working on this. Am I understanding correctly that the issue is with Amazon OpenSearch Service, or is it a client-side problem? If it's the former (AOS), then we/you should open a ticket with Amazon. If it's a client problem, please help narrow it down and we can help you fix it here.

ofirt-orca commented 1 year ago

@dblock It seems like a server-side bug in the OpenSearch Service. Would you mind opening an issue with them?

dblock commented 1 year ago

I sent it to the right people, but it would be most effective if you could open a support ticket that includes your domain details.

Since this is AOS and not a client problem I'm going to close it here.

jed326 commented 1 year ago

@ofirt-orca I can confirm this is an issue with Amazon OpenSearch Service and a fix will be rolled out in a future release. Please follow up with AWS customer support for any additional information. Thanks!
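Until the service-side fix is available, one possible workaround that follows from the thread is to keep compression enabled for regular traffic and use a second, uncompressed client for snapshot calls, since the restore was reported to work when http_compress is not set or is False. The sketch below only builds the keyword arguments for the two clients; client_kwargs is a hypothetical helper, and the connection settings are copied from the reproduction snippet above.

```python
def client_kwargs(host, auth, compress):
    """Build keyword arguments for an OpenSearch client.

    Hypothetical helper for illustration; settings mirror the thread's snippet.
    """
    return {
        "hosts": [{"host": host, "port": 80}],
        "use_ssl": False,
        "verify_certs": False,
        "timeout": 30,
        "retry_on_timeout": True,
        "http_compress": compress,
        "http_auth": auth,
    }


host = "xxxx.us-east-1.es.amazonaws.com"
auth = None  # placeholder; the thread uses AWSV4SignerAuth(creds, "us-east-1", "es")

# Compressed settings for ordinary requests, uncompressed for snapshot endpoints.
search_kwargs = client_kwargs(host, auth, compress=True)
snapshot_kwargs = client_kwargs(host, auth, compress=False)

# With opensearch-py installed, the two clients could be created like this:
# from opensearchpy import OpenSearch, RequestsHttpConnection
# search_client = OpenSearch(connection_class=RequestsHttpConnection, **search_kwargs)
# snapshot_client = OpenSearch(connection_class=RequestsHttpConnection, **snapshot_kwargs)
# snapshot_client.snapshot.restore(repository=repo_name, snapshot=snapshot_id, body=body)
```

Keeping two clients means a little extra configuration, but it limits the uncompressed traffic to the snapshot endpoints that are affected.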