opensearch-project / opensearch-java

Java Client for OpenSearch
Apache License 2.0
110 stars 171 forks source link

[BUG] code works with local opensearch but not AOSS #1068

Closed al-niessner closed 5 days ago

al-niessner commented 2 weeks ago

What is the bug?

Using OpenSearchClient.get() to retrieve document. Works with a local opensearch (docker opensearch:2.13.0) but same code fails AOSS (amazon serverless) with 404.

How can one reproduce the bug?

Have example class that can be provided if this is the correct forum.

What is the expected behavior?

I except the same OpenSearchClient.get() to pass or fail independent of local opensearch or AOSS.

What is your host/environment?

Local: docker opensearch 2.13.0 AOSS: unknown

Do you have any screenshots?

No. Would not make sense without seeing the code. Here is the output from the test: local: Index already exists so delete it first local: Created the testing index local: Found expected document local: Found expected document local: Test complete aoss: Index already exists so delete it first aoss: Created the testing index aoss: Found expected document Exception in thread "main" org.opensearch.client.opensearch._types.OpenSearchException: Request failed: [http_exception] server returned 404 at org.opensearch.client.transport.aws.AwsSdk2Transport.parseResponse(AwsSdk2Transport.java:498) at org.opensearch.client.transport.aws.AwsSdk2Transport.executeSync(AwsSdk2Transport.java:396) at org.opensearch.client.transport.aws.AwsSdk2Transport.performRequest(AwsSdk2Transport.java:193) at org.opensearch.client.opensearch.OpenSearchClient.get(OpenSearchClient.java:737) at example.ComparisonLocalVsServerless.doTest(ComparisonLocalVsServerless.java:110) at example.ComparisonLocalVsServerless.main(ComparisonLocalVsServerless.java:79)

Do you have any additional context?

We use cognito to exchange general username/password for access keys. We then use a lambda to convert those keys to AWS credentials because Java SDK does (did not) support cognito at the time of our testing. This lookup is included in the test code.

It seems that the failure seems to be related to the local opensearch using httpclient5 while the AOSS uses httpclient. The document ID that cannot be found contains many : in its name, like a:b:c:d:e::1.0.

@jordanpadams @tloubrieu-jpl

dblock commented 2 weeks ago

OpenSearch has a concept of refresh, but AOSS doesn't support that. You have to wait for a refresh cycle to complete to make the document searchable.

To test, add a sleep of a few seconds or a retry between the document creation and document fetch?

I am going to close this as I'm 99% sure this is not a client problem and the document ID is a red herring, but if you can show that it works with curl and doesn't work with opensearch-java we can of course reopen.

tloubrieu-jpl commented 2 weeks ago

Hi @dblock ,

I am a colleague of @al-niessner who is working part-time for a couple of weeks.

The sleep accounts for the issue that we are experiencing, but once being set aside, by adding sleep or by pushing the documents and getting them in different commands, we are still having the issue.

What sounds to be the cause of the issue is the special characters that we are having in our OpenSearch document identifiers, for example a:b:c:d:e::1.0 which might be encoded at some point by the opensearch-java library before being sent to OpenSearch, then Opensearch is not able to retrieve it. Everything works as expected on our side with a document id like a_b_c_d_e__1.2.

I am guessing the reason why it does not work with the OpenSearch serverless AOSS but works with a local OpenSearch might be the underlying HTTP transport (AwsSdk2Transport) objects that are only used for AOSS connection.

Could you re-open this bug since the issue related to the refresh is not the one we want to resolve here.

dblock commented 2 weeks ago

Ok, I'll reopen. Help narrow it down? Show the underlying HTTP request/responses being made by the client?

tloubrieu-jpl commented 1 week ago

Thanks @dblock. Unfortunately we don t have the logs activated on our opensearch serverless. I will see how that can be done and I ´ll get back to you.

dblock commented 1 week ago

I mean client-side logs of which actual HTTPs requests / responses are made.

In https://github.com/dblock/opensearch-java-client-demo/ I have an example where you can turn on DEBUG wire logging and you'll see all the transport messages.

-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog \
-Dorg.apache.commons.logging.simplelog.log.org.apache.http.wire=DEBUG
workmanw commented 1 week ago

Hello, we also have the same issue. The bug appears to be in AwsSdk2Transport.parseResponse. The errorDeserializer for the endpoint is using an incorrect format and the real exception, which is being suppressed, is:

Missing required property 'ErrorResponse.error'
org.opensearch.client.util.MissingRequiredPropertyException: Missing required property 'ErrorResponse.error'
    at org.opensearch.client.util.ApiTypeHelper.requireNonNull(ApiTypeHelper.java:89)
    at org.opensearch.client.opensearch._types.ErrorResponse.<init>(ErrorResponse.java:71)
    at org.opensearch.client.opensearch._types.ErrorResponse.<init>(ErrorResponse.java:56)
    at org.opensearch.client.opensearch._types.ErrorResponse$Builder.build(ErrorResponse.java:158)
    at org.opensearch.client.opensearch._types.ErrorResponse$Builder.build(ErrorResponse.java:121)
    at org.opensearch.client.json.ObjectBuilderDeserializer.deserialize(ObjectBuilderDeserializer.java:92)
    at org.opensearch.client.json.DelegatingDeserializer$SameType.deserialize(DelegatingDeserializer.java:55)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.parseResponse(AwsSdk2Transport.java:539)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.executeSync(AwsSdk2Transport.java:401)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.performRequest(AwsSdk2Transport.java:201)
    at org.opensearch.client.opensearch.OpenSearchClient.get(OpenSearchClient.java:737)
    at com.stplus.core.search.service.UserIndexerService.getUserEntry(UserIndexerService.kt:43)
    at com.stplus.core.search.handler.PubSubHandler.upsertUserInSearchIndex(PubSubHandler.kt:47)
    ...
dblock commented 1 week ago

@workmanw A (failing) unit test and maybe a fix would be great.

workmanw commented 1 week ago

@dblock I didn't see any existing test support for using AwsSdk2Transport in a unit test, so I wrote the reproduction as a failing integration test. This does fail for me locally when using an AOSS instance:

./gradlew integrationTest --tests "*AwsSdk2*" -Dtests.awsSdk2support.domainHost=....
dblock commented 1 week ago

Yep, we have https://github.com/opensearch-project/opensearch-java/issues/503. Could use any help you can spare!

tloubrieu-jpl commented 1 week ago

Hi @dblock ,

Sorry for the delay, I was able to add the logs you asked for. Our test script is getting that:

[DEBUG] wire - http-outgoing-0 >> "GET /tloubrieutest/_doc/a:b:c:d:e::1.2?_source_includes=product_class HTTP/1.1[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Host: p5qmxrldysl1gy759hqf.us-west-2.aoss.amazonaws.com[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Accept-Encoding: gzip[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Authorization: AWS4-HMAC-SHA256 Credential=ASIAWNMVG7HD2RHFXCMS/20240711/us-west-2/aoss/aws4_request, SignedHeaders=accept-encoding;host;x-amz-date;x-amz-security-token, Signature=548a816b84df8284ecaa279445a3a9b8d0a804f2253f85ff5d391da5bc3a7b86[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "X-Amz-Date: 20240711T022015Z[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "X-Amz-Security-Token: IQoJb3JpZ2luX2VjEIv//////////wEaCXVzLXdlc3QtMiJHMEUCIQCYFsR1G6tyQyft0W5iyaMsj6/A7B4gHmKGtLpgvWogzQIgaK6fUhEMguuE9xrqsilgzrSwblet2F8dncyMwKZZYzYqxAQIUxABGgw0NDEwODM5NTE1NTkiDAAsfm/ADK+3S6mtuyqhBGNu3D6NYxjHjkQ3KycQRi02TA4vEmo/nAHM5bqJtrtSGXKWpS6yerJ7lLM0CzpBNhQqKBQAUahmc2Bf2x0sks1xw/ZRVr8LJECOrln3d+NniO2ysq7GTmZz5Nd6FKnJYwtR7H1OGuhO9upsfxDA0Oie94FNFp38LIeY1+xPcXb1mA2a2KZj/IGQFXoK9OMZqg3L9OisMHGVffoXwL67+4INuFXE0GEO4Lz49ZWWLV5YKW+M12vV00r20UY3n89BjCQCEE/wLTL/tvVocL1qSlUEUmEa3wrfIxgij7QPIKDD1M4BQXwSx5r3obdzo5eEYcFsVG3HaN9gPF9KHRAmlJLE4fd1wOY8mZ0lzo/LjGB87otcYMQy3ppDdGEgu1SH7/iaoqECg/mzgTQwWVLhLovZZUTwqy2wwoejLHC92329+gPMEpwehJsoFokHqYKckiUmLoA4AyU+NXi1JoTiu/kS9d3II3S60+e5ExpUqsHs1CKEVBm7SedjyXz1Kiy6YXhzDxInlVL+a0rNw0tqmCPTu+U4ZCpDYYGFIae8/Q6iWReFP9Y49JtkXh2IFvdICy6jFsVGqYGgZXf9GM3FqSZ2TPktOJTVC7LdxjP/rLxMFyZ06iD3oXa1EBhAXrhVqYn7wxfAUWnY9A3bcoEqI5244ykQWhW9KCZYgtBaAi6ipZaqNC1jT0AXg4sU/8aSpbF3OnPDBpjzmphsAuyMGfidMN6CvbQGOoUCqRTvYiCDftSoPf9gGugdxpqpoMp85ulRj621pziHdNSLj4Xzkah1WoznMMSPE9xUil/p6bsRDd0f3bIcDmw2ffJrM5dbjOOb+f+DUsfehZPQGthKbunuduhlaFpEm9DePIOa62l6GtyZXlFpZTxyUoA1gkwZKU+Ou0ceu53LnvcW1WpE9GC2s6+1gSh3NcIqZa/ybwkeCmCttIm0ZvUzfFOfVAh9euRQ5XMntM7MRuGH1h4SuL3GZ+7nH94u29rQhys5j/70lc6OsrzoGfRwcbBX6klQPngZkBVCoOzXTI9lE8NMLx1EOqSiVtKzBji1KTTQHkAavCkQXLKjlxLyvBxJeoLE[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "User-Agent: [\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "HTTP/1.1 404 Not Found[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "x-request-id: 73403f0b-c95a-93ba-8fbe-04240f2d10d0[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "date: Thu, 11 Jul 2024 02:20:15 GMT[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "server: aoss-amazon[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "content-length: 0[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "[\r][\n]"
Exception in thread "main" org.opensearch.client.opensearch._types.OpenSearchException: Request failed: [http_exception] server returned 404
    at org.opensearch.client.transport.aws.AwsSdk2Transport.parseResponse(AwsSdk2Transport.java:498)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.executeSync(AwsSdk2Transport.java:396)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.performRequest(AwsSdk2Transport.java:193)
    at org.opensearch.client.opensearch.OpenSearchClient.get(OpenSearchClient.java:737)
    at gov.nasa.pds.registry.common.ComparisonLocalVsServerless.doTest(ComparisonLocalVsServerless.java:160)
    at gov.nasa.pds.registry.common.ComparisonLocalVsServerless.main(ComparisonLocalVsServerless.java:87)

Whereas a document with this id is available in the index:

image
tloubrieu-jpl commented 6 days ago

Hello, any updates on that issue ?

On my side I made a test with the Delete request and we are having the same 404 error.

I tested this piece of code:

    // Delete document with id with special characters
    DeleteRequest deleteRequest =
        new DeleteRequest.Builder().index(this.index).id(doc_a.get("lidvid")).build();
    DeleteResponse dr = client.delete(deleteRequest);
    System.out.println("Delete status,  " + dr.result());

And I am getting similar errors as the one given for the get method, see:

[DEBUG] wire - http-outgoing-0 >> "DELETE /tloubrieutest/_doc/a:b:c:d:e::1.2 HTTP/1.1[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Host: p5qmxrldysl1gy759hqf.us-west-2.aoss.amazonaws.com[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Accept-Encoding: gzip[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Authorization: AWS4-HMAC-SHA256 Credential=ASIAWNMVG7HDSSUEDTPH/20240715/us-west-2/aoss/aws4_request, SignedHeaders=accept-encoding;host;x-amz-date;x-amz-security-token, Signature=9a8e5d9c78f62430ea19d5867c2c144ab4652c3278018673a5cde3ab6964ca48[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "X-Amz-Date: 20240715T225633Z[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "X-Amz-Security-Token: IQoJb3JpZ2luX2VjEP///////////wEaCXVzLXdlc3QtMiJHMEUCIDrH4ocP4UVmtekjI5kf6Ec1qUM3WSjFA4g+qgU7YnfSAiEAxZxfcj3uEAF7Z70a/HkKrUt2MDrm+GTcHNjrFCMViJIqzQQIyP//////////ARABGgw0NDEwODM5NTE1NTkiDLymPqU/VKNuwNU5ZSqhBHZxFXJ3XD5UDq7+7ICSbawAXbGPg4Ga1pK3ttIVAf7OsxsYPE9q9A1zBulKL9mwxeogo8pG99ggKAkLlkTrIuvhWGBXYs/xtHUmB/nQ9cNCVWKHld5XYugRgqwdQVulY+E8UsASC1Cb1kY4FPiY1uTiQtNcoaB/8isFT4DMmaPt4I/H6gvYskUXMCjPTDRg59bHp3n0HOjMSNQ7vVDXZe61l9Qyrn6/n9/4+kE2XYXp8RqJXJPnLOJCaYDBEja+ZVEUmrgq02RQucptWqGp5joCh3BvwVtD1x01PWW0uDroKRF3qP3JVwNwPyR1FBpRlG4k5YJINMngHMNIfxH7otip7dbMNAaRhs8FRK4WfE+3sAJIdR7hmxzMabDC/tHOGXQUDZKkjgRhReZqhsK6OPv6lpsppxN97cSXmkZflvtmaEY9zO6j1syKRjiGEqrCyv6tYGquFBR+Y5mCRmQYzpLA9nGtCt+DLNAmAysOUTx+F3bZ0lW7T7M2GmikS8BjUFGJqMIasjg/8qzBLI8mHiUFM4mn2+khtMsIwFs3zZ0QcL8f2f58DaQDGvzO9UVbwahbf5e7a6gZvvA0ItlsTy30Ywk3ESazXp9ROj/77meUS69BVy5df+1qugL4rXE3n5xdGVoU0cqSUtwOzXc7NstgdtEkTumTOhc1ZoEMVffLbr8qZpFshwdKDuJZJ9931Zy5t/oIyHawf7Mq/9bG9hYOMJ7S1rQGOoUCtDtnIl6+SkMqHYDG3s0AOKUY/kMQKvYj+hrVMePw8+cDBFPtof/yHetRRAVL8o0a0CW5UntfikOiwgZ0QiYJvtQbe+ib7g0YacyZ3GEPCu2s57AIvkaZE/Dwzz6NG7Du2XdH+mBUhs9fBSzf/B8bp3+XTuy4GA7BvmWcyTrdr6YpC5syz6Ibke30v6slWuU/IlboQgpJgkI5bfrwCbLLB8RRZnBPKnIdJ4xBnIZUULAXCLDA2Plx6wXMzjWHF8QKeqpAsoq134zHQwHArPxu1etPJaSyv7hlPvE70pj/0TjgDNjAzM/HEV3dmzVfsijOjbxYte+EIQaz2g6jAt4jqgXKZKMi[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "User-Agent: [\r][\n]"
[DEBUG] wire - http-outgoing-0 >> "[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "HTTP/1.1 404 Not Found[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "x-request-id: ef0b1ed5-dd1b-985e-8379-18eb837414b5[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "date: Mon, 15 Jul 2024 22:56:31 GMT[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "server: aoss-amazon[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "content-length: 0[\r][\n]"
[DEBUG] wire - http-outgoing-0 << "[\r][\n]"
Exception in thread "main" org.opensearch.client.opensearch._types.OpenSearchException: Request failed: [http_exception] server returned 404
    at org.opensearch.client.transport.aws.AwsSdk2Transport.parseResponse(AwsSdk2Transport.java:498)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.executeSync(AwsSdk2Transport.java:396)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.performRequest(AwsSdk2Transport.java:193)
    at org.opensearch.client.opensearch.OpenSearchClient.delete(OpenSearchClient.java:418)
    at gov.nasa.pds.registry.common.ComparisonLocalVsServerless.doTest(ComparisonLocalVsServerless.java:177)
    at gov.nasa.pds.registry.common.ComparisonLocalVsServerless.main(ComparisonLocalVsServerless.java:90)

We have a workaround for the get() query by using search() instead, but I don't see a workaround for the delete(). This makes this bug much more critical to us.

Thanks

dblock commented 6 days ago

This could be related, https://github.com/opensearch-project/opensearch-java/issues/827.

Someone still needs to debug, narrow this down to an incorrect request being made, and fix the issue. I can't promise to get to it quickly.

workmanw commented 6 days ago

FWIW I have debugged this already. Apologies if I wasn't clear above, the problem is that AwsSdk2Transport has different response handling than ApacheHttpClient5Transport.

If you look here: ApacheHttpClient5Transport.java you'll see an exception handler for MissingRequiredPropertyException, this is what ensures that the GetResponse object is deserilaized correctly.

Notice that that is missing from AwsSdk2Transport.java.

So I would be happy to fix this issue if you would accept it as having an integration test instead of a unit test. I would be happy to try to write a unit test, but to be honest I don't have to expertise to take on a large test refactor like you mentioned with #503.

dblock commented 6 days ago

@workmanw Yes, would definitely merge with just the integration test.

workmanw commented 5 days ago

I think this issue can be closed now. Happy searching!

jordanpadams commented 5 days ago

@workmanw is there a specific version or SNAPSHOT we should be using to get these latest updates?

al-niessner commented 5 days ago

What maven artifact will contain these changes?

On Wed, Jul 17, 2024, 08:38 Daniel (dB.) Doubrovkine < @.***> wrote:

Closed #1068 https://github.com/opensearch-project/opensearch-java/issues/1068 as completed.

— Reply to this email directly, view it on GitHub https://github.com/opensearch-project/opensearch-java/issues/1068#event-13536971977, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIUBIU6UEDZZCQWMTE5KIDZMZXWTAVCNFSM6AAAAABKNSH3JCVHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJTGUZTMOJXGE4TONY . You are receiving this because you were mentioned.Message ID: <opensearch-project/opensearch-java/issue/1068/issue_event/13536971977@ github.com>

workmanw commented 5 days ago

👋 @jordanpadams @al-niessner I'm not sure, I'm just a consumer of of OpenSearch like you all. I can poke around and see if there are snapshot artifacts that get published anywhere. At Zillow we have our own artifactory instance and I just published a local build there for the time being.

dblock commented 5 days ago

Released builds go to https://mvnrepository.com/artifact/org.opensearch.client/opensearch-java and daily SNAPSHOTS go to https://aws.oss.sonatype.org/content/repositories/snapshots/org/opensearch/client/opensearch-java/, so 2.11-SNAPSHOT should be good. I include it in a pom here as an example.

However, looks like the snapshot publication has been intermittently failing, so I am not sure this particular change made it yet, would grok through these builds and see if there's a successful one after the backport into the 2.x branch. https://github.com/opensearch-project/opensearch-java/actions/workflows/publish-snapshots.yml. I opened https://github.com/opensearch-project/opensearch-java/issues/1090, not sure what's going on.

I'd appreciate if someone trying this out could add a section to DEVELOPER_GUIDE or expand https://github.com/opensearch-project/opensearch-java/blob/main/README.md#snapshot-builds on how to use SNAPSHOT builds and where they are per above.

jordanpadams commented 5 days ago

thanks @workmanw and @dblock we will have a look. As @dblock mentioned, looks like the SNAPSHOT builds are failing right now, but we will look to build the code locally and go from there.

tloubrieu-jpl commented 5 days ago

Thanks very much @workmanw I was able to test the latest code with our failing cases on get and delete methods and it now works.

@dblock, for integrating the latest code in our maven project, I manually build the opensearch-java jar with gradlew and integrated it in my local maven repository using your maven pom file for the opensearch-java dependencies.

For the code to work though, I had to add 2 additional dependencies of opensearch-java which were missing in your proposed pom:

<!-- https://mvnrepository.com/artifact/jakarta.json/jakarta.json-api -->
<dependency>
    <groupId>jakarta.json</groupId>
    <artifactId>jakarta.json-api</artifactId>
    <version>2.1.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.eclipse.parsson/parsson -->
<dependency>
    <groupId>org.eclipse.parsson</groupId>
    <artifactId>parsson</artifactId>
    <version>1.1.5</version>
</dependency>

I guess you should add them somewhere on your side.

At last, Ideally, we would like to have a stable version of opensearch-java with the fix to integrate in our own tool stable version. But I guess you will start solving the issue with the SNAPSHOT publication.

Thanks,

Thomas

workmanw commented 5 days ago

@tloubrieu-jpl Awesome. Thanks for confirming it's fixed 🎉

dblock commented 4 days ago

@tloubrieu-jpl open an issue to get a release for v. next, we’ll get on it