brendanhay / amazonka

A comprehensive Amazon Web Services SDK for Haskell.
https://amazonka.brendanhay.nz
Other
599 stars 227 forks source link

Use `decodeUtf8'` to check whether bytestrings are printable before logging #952

Closed korayal closed 1 year ago

korayal commented 1 year ago

closes #947

on my local tests targeting localstack, below are the results when I hit the PutMetricData endpoint via a gzipped wrapper

[Client Request] {
  host      = localhost:4566
  secure    = False
  method    = POST
  target    = Nothing
  timeout   = ResponseTimeoutMicro 300000000
  redirects = 0
  path      = /
  query     =
  headers   = x-amz-content-sha256: e769f604d04a6c9f5f05218ea59e10ae8e728e64cff3dc4a2ca47b0b41c453e8; x-amz-date: 20230731T124639Z; host: localhost:4566; content-encoding: gzip; content-type: application/x-www-form-urlencoded; charset=utf-8; authorization: AWS4-HMAC-SHA256 Credential=localstack_key_id/20230731/us-east-1/monitoring/aws4_request, SignedHeaders=content-encoding;content-type;host;x-amz-content-sha256;x-amz-date, Signature=22efa655b91759f5deba281860a38ae9560065cfc2f16be198e51a4a95b93098
  body      = non-printable 365 strict bytes
}

or, when the data is printable (no gzipped payload):

[Client Request] {
  host      = localhost:4566
  secure    = False
  method    = POST
  target    = Nothing
  timeout   = ResponseTimeoutMicro 300000000
  redirects = 0
  path      = /
  query     =
  headers   = x-amz-content-sha256: efbf2bd7e9a1e0cdf2ee1b29225678f489d9d1d9baed20dde625d0faf56472fd; x-amz-date: 20230731T125748Z; host: localhost:4566; content-type: application/x-www-form-urlencoded; charset=utf-8; authorization: AWS4-HMAC-SHA256 Credential=localstack_key_id/20230731/us-east-1/monitoring/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=47dfedcd0d705427f27f78b8640026ee254a622bf6e973eb01d973af2b2caf81
  body      = Action=GetMetricStatistics&Dimensions.member.1.Name=test_id&Dimensions.member.1.Value=test10&EndTime=2023-07-31T12%3A57%3A48Z&MetricName=instrument-test&Namespace=test-namespace&Period=60&StartTime=2023-07-31T12%3A57%3A46Z&Statistics.member.1=SampleCount&Version=2010-08-01
}
korayal commented 1 year ago

Can you please add an entry to lib/amazonka/CHANGELOG.md, which is the master changelog for the project?

I've updated the changelog and added a section Unreleased at the top. would that work?

korayal commented 1 year ago

Also, are we sure that this is the correct check to make? What if the user's terminal is set to some other encoding scheme? A TextEncoding (as returned by System.IO.hGetEncoding) isn't even in Eq, so I'm not sure what the best way is to fix this. I suspect what we have here might be a bit too naive; do you have any thoughts about other ways we could check this? It may also be the case that we successfully decode UTF-8 but it's got non-printable characters in it; do we want to do that check here too?

I've added another check with isPrint, where it should ensure that whatever we're trying to print contains printable characters. At this point I think assuming "the request body (if not compressed) is a valid utf-8 text" should be fine since it's what we're sending to AWS. The part that prints it out is still left to Build.lazyByteString / Build.byteString, so at worst case we're keeping the existing behavior.

In the medium term, I probably want to extract logging from the main amazonka library so that people can use whatever logging framework suits them. Then handling non-printable stuff becomes the log library's concern.

that'd be great.

korayal commented 1 year ago

@endgame are we OK with the latest state of this PR?