olivere / elastic

Deprecated: Use the official Elasticsearch client for Go at https://github.com/elastic/go-elasticsearch
https://olivere.github.io/elastic/
MIT License

Getting a document returns 400 Bad Request if the ID contains both a comma and a slash #1186

Open iceking2001 opened 5 years ago

iceking2001 commented 5 years ago

Which version of Elastic are you using?

[x] elastic.v6 (for Elasticsearch 6.x)

Please describe the expected behavior

Getting and updating a document should succeed if its ID contains both a comma and a slash.

Please describe the actual behavior

Elasticsearch returns a 400 Bad Request error.

Any steps to reproduce the behavior?

Insert a document through the Bulk API with the ID "c7ccd3c6f4a7: <Michael.Lee@chi.com/O=, @gg.hop.com", then retrieve it with GetService. The request it sends looks like this:

GET /index/_doc/c7ccd3c6f4a7:%20%20%3CMichael.Lee@chi.com/O=,%20@gg.hop.com?routing=102345 

The response is:

{"error":"no handler found for uri [/index/_doc/c7ccd3c6f4a7:%20%20%3CMichael.Lee@chi.com/O=,%20@gg.hop.com?routing=102345] and method [GET]"}

I noticed that versions up to and including v6.2.13 do not have this issue. Comparing v6.2.13 with v6.2.14, I found that a change in aws_v4.go causes the problem:

req.URL.Scheme = "https"
if strings.Contains(req.URL.RawPath, "%2C") {
    // Escaping path
    req.URL.RawPath = url.PathEscape(req.URL.RawPath)
}

If I comment out this code, it works. The request is then sent with the correct URL encoding:

GET /index/_doc/c7ccd3c6f4a7%3A%20%20%3CMichael.Lee%40chi.com%2FO%3D%2C%20%40gg.hop.com?routing=102345 

So I am curious what this code is for, and what will happen if it is commented out?
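The two requests shown above can be reproduced with the standard library alone. The sketch below is not the library's code: `escapedPath` is a hypothetical helper that applies the quoted condition to a parsed URL and returns `URL.EscapedPath()`, which is the path `net/url` uses when it builds the request URI. It assumes the client originally produced the fully percent-encoded path from the working request. A likely explanation for the result is that `url.PathEscape` escapes the existing `%` signs again, so `RawPath` no longer decodes to `Path`, and `EscapedPath` falls back to its default encoding of `Path`, which leaves `/`, `,`, `:`, `=` and `@` unescaped.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// escapedPath parses rawURL and returns the path as net/url would emit it.
// When applyHack is true it first mimics the snippet quoted from aws_v4.go.
// This is a hypothetical helper for illustration, not part of olivere/elastic.
func escapedPath(rawURL string, applyHack bool) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if applyHack && strings.Contains(u.RawPath, "%2C") {
		// Re-escaping turns every "%" into "%25" and every "/" into "%2F",
		// so RawPath stops being a valid encoding of u.Path.
		u.RawPath = url.PathEscape(u.RawPath)
	}
	// EscapedPath only honors RawPath if it decodes back to u.Path;
	// otherwise it re-encodes u.Path with the default rules.
	return u.EscapedPath(), nil
}

func main() {
	// Path and query taken from the working request in this issue.
	raw := "/index/_doc/c7ccd3c6f4a7%3A%20%20%3CMichael.Lee%40chi.com%2FO%3D%2C%20%40gg.hop.com?routing=102345"

	for _, applyHack := range []bool{true, false} {
		p, err := escapedPath(raw, applyHack)
		if err != nil {
			fmt.Println("parse error:", err)
			return
		}
		fmt.Printf("hack=%v: %s\n", applyHack, p)
	}
}
```

With the hack applied, the slash inside the ID is sent unescaped, so the server presumably sees an extra path segment, which would be consistent with the "no handler found" response.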

olivere commented 5 years ago

I will have to review why that is. Maybe it was a temporary problem we were trying to fix. Seems wrong to me as well: fiddling with the URL encoding.

olivere commented 5 years ago

It came in with #962, and was carried over from this client. Can you check if that hack is still required?

iceking2001 commented 4 years ago

Sorry for the late reply. I checked three cases without this hack:

  1. the ID contains both a comma and a slash
  2. the ID contains only a comma
  3. the ID contains only a slash

None of these cases gets a 403 Forbidden error from AWS.