Open ToshihikoMakita opened 3 years ago
Wow, what an issue report -- 5⭐.
unicode and python seems to be a unique kind of pandora box.
to the effect the the query comes back empty can you audit what is the query received by the ES and does it match to the one received when curl (when using open access)?
OK. I will investigate.
Hi, when Elasticsearch domain allows open-access, I could capture the JSON data by using curl
on Ubuntu and Wireshark
. The main point is to set SSLKEYLOGFILE
environment variables before launching curl
.
https://everything.curl.dev/usingcurl/tls/sslkeylogfile
See attached query-json.txt.
Does awscurl
support SSLKEYLOGFILE
environment variable? If it is supported, I can send you the JSON dump file.
I think verbose flag "-v" will print the request before it's sent.
On Wed, Mar 17, 2021 at 8:01 PM Toshihiko Makita @.***> wrote:
Hi, when Elasticsearch domain allows open-access, I could capture the JSON data by using curl on Ubuntu and Wireshark. The main point is to set SSLKEYLOGFILE environment variables before launching curl.
See attached query-json.txt.
query-json.txt https://github.com/okigan/awscurl/files/6161008/query-json.txt
Does awscurl support SSLKEYLOGFILE environment variable? If it is supported, I can send you the JSON dump file.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/okigan/awscurl/issues/106#issuecomment-801580155, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADUYXWHEBBLZ4PKEVRK573TEFUG3ANCNFSM4ZFTAWGQ .
Unfortunately "-v" option does not display the contents of JSON specified "-d" option. I got the suggestion from AWS technical support to use Cloud Watch to debug the query that sent to Elasticsearch Service. Here I attach several pattern of the test results.
curl
when the ES domain is openCommand-line: search-ngram-and-kuromoji-open-access-curl-cmd.txt CloudWatch log: search-ngram-and-kuromoji-open-access-curl-cloud-watch.txt
awscurl
when the ES domain access needs IAM role (without specifying -data-binary
)Command-line: search-ngram-and-kuromoji-iam-role-access-awscurl-no-data-binary-cmd.txt CloudWatch log: search-ngram-and-kuromoji-iam-role-access-awscurl-no-data-binary-cloud-watch.txt
awscurl
when the ES domain access needs IAM role (specifying -data-binary
⇒on above Last challenge: fixing "UnicodeEncodeError")Command-line: search-ngram-and-kuromoji-iam-role-access-awscurl-data-binary-cmd-fail.txt CloudWatch log: search-ngram-and-kuromoji-iam-role-access-awscurl-data-binary-cloud-watch-fail.txt
From this result, I found that the JSON file specified -d
parameter should be treated as UTF-8 encoded when -data-binary
is specified. I have added several changes to awscurl.py
and now it works fine.
Command-line: search-ngram-and-kuromoji-iam-role-access-awscurl-data-binary-cmd-success.txt
CloudWatch log: search-ngram-and-kuromoji-iam-role-access-awscurl-data-binary-cloud-watch-success.txt
I will submit a pull-request with this fix. But this pull-request will not compatible with #90 because this one seems to use true binary data that should be uploaded to S3.
Please take a look at my pull-request and consider how to handle both UTF-8 encoded JSON and real binary data with awscurl
-d
parameter.
Regards,
I had a related issue (trying to send a request that contains 0xff, which is an invalid byte in any UTF-8 sequence). I think that I fixed it by making this change to awscurl.py (line 490 in the current head):
Original:
with open(filename, "r") as post_data_file:
New:
with open(filename, "rb" if args.data_binary else "r") as post_data_file
It works fine if domain allows open-access
I'm confident that my query works fine if Elasticsearch domain allows open-access and I use
curl
command to issue query.I attached the search query: search-search-ngram-and-kuromoji-2.json zipped. search-search-ngram-and-kuromoji-2.zip
If I launch
awscurl
without specifying--data-binary
, no search result returned.I have changed domain access policy to exhibit open access and allow one IAM role named ESFullAccess. Also ESFullAccess has trusted IAM user called ESProgram.
The command-line Windows Power shell
awsescurl.ps1
zipped.awsescurl.zip
If I launch
awscurl
with specifying--data-binary
, following error occurs.The command-line Windows Power shell
awsescurl-db.ps1
zipped.awsescurl-db.zip
Fixing TypeError
I rarely know Python, but from error message "TypeError: Unicode-objects must be encoded before hashing", I've modified following code from:
https://github.com/okigan/awscurl/blob/master/awscurl/awscurl.py line 200
to
because hashing payload should be done with encoding UTF-8 if
--data-binary
is specified or not.However modified awscurl still reports following error:
Last challenge: fixing "UnicodeEncodeError"
According to the error message, I modified the following code:
https://github.com/okigan/awscurl/blob/master/awscurl/awscurl.py line 135
to:
The error message has been vanished, but the query returns nothing.
Do you have any ideas to fix this problem?
Regards,