olivere / elastic

Deprecated: Use the official Elasticsearch client for Go at https://github.com/elastic/go-elasticsearch
https://olivere.github.io/elastic/
MIT License

gzip compressed json response from elasticsearch truncated by olivere/elastic? #1544

Open gc06131114 opened 3 years ago

gc06131114 commented 3 years ago

Which version of Elastic are you using?

[ ] elastic.v6 (for Elasticsearch 6.8)

Please describe the expected behavior

I was stress testing the search API and got the error "unexpected end of JSON input" here: https://github.com/olivere/elastic/blob/release-branch.v6/search.go#L623. It occurred while the client was decoding the ES result, which I expected to be well-formed JSON.

I captured the packets with tcpdump while the stress test was running and analyzed them with gopacket (https://github.com/google/gopacket, specifically this example: https://github.com/asmcos/goexample/blob/master/examples/httpassembly.go). The response bodies on the wire were all well-formed JSON.

Some details that might help:

  1. the truncated message was 299524 bytes
  2. the Content-Length header of the ES response was 38936 bytes, and the response body was gzip compressed
  3. the decompressed body is 320187 bytes
  4. maxResponseSize was set to 1024 * 1024: https://github.com/olivere/elastic/blob/release-branch.v6/search.go#L42
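To illustrate how numbers like these can arise, here is a minimal sketch, assuming the compressed stream is cut off before its end. It uses only the Go standard library; `gzipBytes`, `gunzipPrefix`, and `isUnexpectedEOF` are hypothetical helper names, not part of olivere/elastic, and `io.ReadAll` needs Go 1.16 or later. A truncated gzip stream still yields part of the decompressed data, together with `io.ErrUnexpectedEOF`.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"strings"
)

// gzipBytes compresses s and returns the complete gzip stream.
func gzipBytes(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s)) // writes to a bytes.Buffer do not fail
	zw.Close()
	return buf.Bytes()
}

// gunzipPrefix decompresses only the first n bytes of a gzip stream,
// simulating a compressed body that was cut off before its end.
// It returns whatever could be decompressed plus the read error.
func gunzipPrefix(compressed []byte, n int) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed[:n]))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// isUnexpectedEOF reports whether err signals a truncated stream.
func isUnexpectedEOF(err error) bool {
	return errors.Is(err, io.ErrUnexpectedEOF)
}

func main() {
	body := `{"hits":[` + strings.Repeat(`{"_id":"x"},`, 5000) + `{"_id":"x"}]}`
	compressed := gzipBytes(body)

	partial, err := gunzipPrefix(compressed, len(compressed)/2)
	fmt.Printf("original=%d compressed=%d recovered=%d err=%v truncated=%v\n",
		len(body), len(compressed), len(partial), err, isUnexpectedEOF(err))
}
```

If the error returned alongside the partial bytes is dropped, the caller is left with a shorter, syntactically incomplete JSON document, which is consistent with a truncated message that is smaller than the full decompressed body.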

I see that the code reads the response with ioutil.ReadAll: https://github.com/olivere/elastic/blob/release-branch.v6/response.go#L49

Is there any situation in which ioutil.ReadAll might read only part of the response body and return? Or is there any limit on the size that can be read other than maxResponseSize?

thank you so much!
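On the ReadAll question, a minimal sketch using only the standard library: when the underlying reader fails partway through, `io.ReadAll` (the Go 1.16+ replacement for `ioutil.ReadAll`) returns the bytes read so far together with the error. The `failingReader` type, `errSimulated`, and the helper functions are hypothetical names made up for this illustration; the read failure is simulated and not taken from the library.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errSimulated stands in for whatever interrupts the body read.
var errSimulated = errors.New("simulated read failure")

// failingReader always fails with err and never returns data.
type failingReader struct{ err error }

func (f failingReader) Read([]byte) (int, error) { return 0, f.err }

// readTruncated reads a body that delivers the first n bytes of full
// and then fails, returning the partial data and the read error.
func readTruncated(full string, n int) ([]byte, error) {
	r := io.MultiReader(strings.NewReader(full[:n]), failingReader{errSimulated})
	return io.ReadAll(r)
}

// decodeErrText decodes data as a JSON object and returns the error
// text, or the empty string if decoding succeeded.
func decodeErrText(data []byte) string {
	var v map[string]interface{}
	if err := json.Unmarshal(data, &v); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	full := `{"hits":{"total":1,"hits":[{"_id":"1"}]}}`

	data, err := readTruncated(full, 20)
	fmt.Printf("read %d of %d bytes, err=%v\n", len(data), len(full), err)

	// Decoding the partial bytes while ignoring err gives the error
	// reported in this issue: unexpected end of JSON input
	fmt.Println("decode:", decodeErrText(data))
}
```

So a partial read is possible, but it comes with a non-nil error; the JSON error only surfaces if that error is not checked before decoding.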

olivere commented 3 years ago

I've never seen a problem like this in the last 8 years.

Do you connect to ES directly, or is there anything in between that could modify the request/response as well?

gc06131114 commented 3 years ago

The client is deployed as a container and connects to ES directly.

I suspect this might have something to do with the timeout.

I run the search with a 500 ms timeout: Do(ctx), where ctx comes from context.WithTimeout(context.Background(), 500*time.Millisecond).

I also logged the request/response time for the requests that failed with the JSON decode error. Most of them are above 500 ms (one is 410 ms). So when the timeout is reached, is the read stopped? If this is caused by the timeout, I would have expected a timeout error rather than a JSON decode error.

P.S. Durations I collected for the failing requests (milliseconds): 410, 502, 509, 519, 520, 589, 593, 600, 600, 603, 689, 699, 799, 909, 5897, 6003, 6004, 6009, 6107, 6198, 6210, 6213, 9621, 9906, 14555, 14605, 14698, 14704, 14783, 14787, 14790, 14795, 14996, 15339, 15486, 15996, 20797, 20800, 20806

olivere commented 3 years ago

Interesting. If you give the context a timeout with context.WithTimeout, the context should be canceled once the timeout expires. You can check with elastic.IsContextErr(err). elastic.IsTimeout(...) only applies if ES sends a timeout (server-side, so to say).

If the error is a context error, we mustn't rely on the response body to be valid. But there might be dragons. It's an interesting case.
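The scenario discussed here can be reproduced with the standard library alone. The following is a sketch under stated assumptions: a local `httptest` server stands in for ES, and `slowServer`, `fetchWithTimeout`, `isDeadlineErr`, and `runDemo` are hypothetical helpers, not olivere/elastic APIs. The server sends the first part of a JSON body and then stalls, so the context deadline expires while the client is still reading the body.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

const (
	firstPart = `{"hits":{"total":1,`
	lastPart  = `"hits":[]}}`
)

// slowServer flushes the first part of the body, then waits for delay
// (or until the client goes away) before sending the rest.
func slowServer(delay time.Duration) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, firstPart)
		w.(http.Flusher).Flush()
		select {
		case <-time.After(delay):
			io.WriteString(w, lastPart)
		case <-r.Context().Done():
		}
	}))
}

// fetchWithTimeout issues a GET bound to a deadline and returns the
// part of the body that arrived together with the read error.
func fetchWithTimeout(url string, timeout time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	return io.ReadAll(res.Body)
}

// isDeadlineErr reports whether err stems from an expired context deadline.
func isDeadlineErr(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// runDemo wires the pieces together: the deadline is far shorter than
// the server's stall, so the body read is interrupted midway.
func runDemo() ([]byte, error) {
	ts := slowServer(3 * time.Second)
	defer ts.Close()
	return fetchWithTimeout(ts.URL, 200*time.Millisecond)
}

func main() {
	data, err := runDemo()
	fmt.Printf("got %d of %d bytes, deadline exceeded: %v\n",
		len(data), len(firstPart)+len(lastPart), isDeadlineErr(err))
	if err != nil {
		// The partial body must not be decoded; report the timeout instead.
		fmt.Println("read error:", err)
	}
}
```

The point of the sketch is the order of checks: the read error is inspected first, and the body is only handed to the JSON decoder when the read completed without error.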