softwaremill / elasticmq

In-memory message queue with an Amazon SQS-compatible interface. Runs stand-alone or embedded.
https://softwaremill.com/open-source/
Apache License 2.0

IllegalArgumentException: Couldn't find header X-Amz-Target #898

Closed btalbot closed 11 months ago

btalbot commented 11 months ago

The latest version of the aws-sdk-sqs ruby client (1.66.0, as of this writing) seems to no longer supply the X-Amz-Target header, or at least elasticmq cannot find it.

I've tested the elasticmq and elasticmq-native docker images and they behave the same. Reverting the aws-sdk-sqs ruby client to 1.65.0 makes it work again. The problematic version 1.66.0 works fine with real SQS on AWS, so I doubt that AWS will consider this a bug.

The bug is trivial to trigger in ruby with anything that makes a request to sqs (elasticmq):

require 'aws-sdk-sqs'
Aws::SQS::Client.new(endpoint:'http://activemq.local').list_queues

The elasticmq server stack trace:

02:53:37.167 [elasticmq-pekko.actor.default-dispatcher-5] ERROR o.e.r.s.TheSQSRestServerBuilder$$anon$1 - Exception when running routes
java.lang.IllegalArgumentException: Couldn't find header X-Amz-Target
    at org.elasticmq.rest.sqs.directives.AnyParamDirectives.$anonfun$extractActionFromHeader$4(AnyParamDirectives.scala:32)
    at scala.Option.getOrElse(Option.scala:201)
    at org.elasticmq.rest.sqs.directives.AnyParamDirectives.$anonfun$extractActionFromHeader$1(AnyParamDirectives.scala:32)
    at org.apache.pekko.http.scaladsl.server.Directive$SingleValueTransformers$.$anonfun$map$1(Directive.scala:203)
    at org.apache.pekko.http.scaladsl.server.Directive.$anonfun$tmap$2(Directive.scala:96)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$mapRouteResultWith$2(BasicDirectives.scala:86)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
    at org.apache.pekko.http.scaladsl.server.directives.ExecutionDirectives.$anonfun$handleExceptions$2(ExecutionDirectives.scala:42)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
    at org.apache.pekko.http.scaladsl.server.directives.FutureDirectives.$anonfun$onComplete$3(FutureDirectives.scala:47)
    at org.apache.pekko.http.scaladsl.util.FastFuture$.$anonfun$transformWith$1(FastFuture.scala:45)
    at org.apache.pekko.http.scaladsl.util.FastFuture$.strictTransform$1(FastFuture.scala:49)
    at org.apache.pekko.http.scaladsl.util.FastFuture$.transformWith$extension(FastFuture.scala:53)
    at org.apache.pekko.http.scaladsl.util.FastFuture$.transformWith$extension(FastFuture.scala:45)
    at org.apache.pekko.http.scaladsl.server.directives.FutureDirectives.$anonfun$onComplete$2(FutureDirectives.scala:47)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$mapRouteResultWith$2(BasicDirectives.scala:86)
    at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
    at org.apache.pekko.http.scaladsl.server.directives.ExecutionDirectives.$anonfun$handleExceptions$2(ExecutionDirectives.scala:42)
    at org.apache.pekko.http.scaladsl.server.Route$.$anonfun$createAsyncHandler$1(Route.scala:127)
    at org.apache.pekko.stream.impl.fusing.MapAsyncUnordered$$anon$31.onPush(Ops.scala:1443)
    at org.apache.pekko.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:555)
    at org.apache.pekko.stream.impl.fusing.GraphInterpreter.processEvent(GraphInterpreter.scala:506)
    at org.apache.pekko.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:400)
    at org.apache.pekko.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:662)
    at org.apache.pekko.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:532)
    at org.apache.pekko.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:637)
    at org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter.org$apache$pekko$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:813)
    at org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:831)
    at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547)
    at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545)
    at org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:729)
    at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590)
    at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557)
    at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280)
    at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241)
    at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)

btalbot commented 11 months ago

This appears to be related to https://github.com/softwaremill/elasticmq/issues/812 which is marked as closed and fixed in elasticmq 1.4.

The ruby aws-sdk-sqs client does indeed switch to the application/x-amz-json-1.0 content-type, while the previous (working) version uses application/x-www-form-urlencoded.

Below are two wire captures of the requests.

Working request using aws-sdk-sqs 1.65.0

POST / HTTP/1.1
Accept-Encoding: 
Content-Type: application/x-www-form-urlencoded; charset=utf-8
User-Agent: aws-sdk-ruby3/3.186.0 ua/2.0 api/sqs#1.65.0 os/macos#22 md/x86_64 lang/ruby#3.2.2 md/3.2.2 cfg/retry-mode#legacy
Host: sqs.nemo
X-Amz-Date: 20231109T033528Z
X-Amz-Content-Sha256: 48a38266faf90970d6c7fea9b15e6ba366e5f6397c2970fc893f8a7b5e207bd0
Authorization: AWS4-HMAC-SHA256 Credential=FakeKey/20231109/us-nemo-1/sqs/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=7c43dceff9053ab8f766ad8ea0cc2e7ca871bb32f1bfad1f72ca6f21b2ff774c
Content-Length: 36
Accept: */*

Action=ListQueues&Version=2012-11-05

Broken request using aws-sdk-sqs 1.66.0

POST / HTTP/1.1
Accept-Encoding: 
Content-Type: application/x-amz-json-1.0
X-Amz-Target: AmazonSQS.ListQueues
User-Agent: aws-sdk-ruby3/3.186.0 ua/2.0 api/sqs#1.66.0 os/macos#22 md/x86_64 lang/ruby#3.2.2 md/3.2.2 cfg/retry-mode#legacy
Host: sqs.nemo
X-Amz-Date: 20231109T034347Z
X-Amz-Content-Sha256: 44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
Authorization: AWS4-HMAC-SHA256 Credential=FakeKey/20231109/us-nemo-1/sqs/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date;x-amz-target, Signature=072de1de606d8081c64e67bef6eab9cff99866bd1c693b91b1f3794d8d9258cc
Content-Length: 2
Accept: */*

{}
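
The difference between the two captures can be sketched in a few lines of Ruby. This is only an illustration of the two wire formats, not elasticmq code: `sqs_action` is a hypothetical helper, and it assumes the header names have already been lowercased into a Hash.

```ruby
require 'uri'

# Hypothetical helper, not elasticmq code: picks the SQS action out of a
# request in either wire format shown in the captures above.
# `headers` is assumed to be a Hash whose keys are already lowercased.
def sqs_action(headers, body)
  content_type = headers.fetch('content-type', '')

  if content_type.start_with?('application/x-amz-json-1.0')
    # JSON protocol: the action travels in the X-Amz-Target header,
    # formatted as "AmazonSQS.<Action>".
    target = headers['x-amz-target']
    raise ArgumentError, "Couldn't find header X-Amz-Target" if target.nil?

    target.split('.', 2).last
  else
    # Query protocol: the action is a form field in the request body.
    params = URI.decode_www_form(body).to_h
    params.fetch('Action')
  end
end

query_headers = { 'content-type' => 'application/x-www-form-urlencoded; charset=utf-8' }
json_headers  = { 'content-type' => 'application/x-amz-json-1.0',
                  'x-amz-target' => 'AmazonSQS.ListQueues' }

puts sqs_action(query_headers, 'Action=ListQueues&Version=2012-11-05')
puts sqs_action(json_headers, '{}')
```

Both calls resolve to the same action, which is why a server has to support both formats to work with clients on either side of the 1.66.0 release.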

lbroman commented 11 months ago

Had the same issue using the AWS SDK for JavaScript v3. Reverting to the soon-to-be-deprecated v2 works fine. I have not yet figured out in which v3 release Amazon dropped the X-Amz-Target header.

btalbot commented 11 months ago

I think I've found the root of the problem. The issue is that elasticmq, when looking for the X-Amz-Target header, is making a case-sensitive comparison here: https://github.com/softwaremill/elasticmq/blob/b31c8a6dd1f7097650b7294ce77044849290c622/rest/rest-sqs/src/main/scala/org/elasticmq/rest/sqs/directives/AnyParamDirectives.scala#L25C11-L25C11

The HTTP/1.1 spec defines header NAMES as case-insensitive, so many proxies may alter their casing during filtering and serialization, which is what the envoy proxies running in our kubernetes systems are doing. https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/header_casing

I believe that the proper fix is to simply make the search for the x-amz-target header case-insensitive, as the check for the x-amzn-trace-id header already is here: https://github.com/softwaremill/elasticmq/blob/b31c8a6dd1f7097650b7294ce77044849290c622/rest/rest-sqs/src/main/scala/org/elasticmq/rest/sqs/directives/AnyParamDirectives.scala#L38C32-L38C32

I've verified experimentally (via curl) that Amazon SQS is case-insensitive in accepting the header names.

[edit] included link to x-amzn-trace-id header which uses equalsIgnoreCase
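
To illustrate the idea, here is a rough Ruby analogue of a case-insensitive header lookup. The actual fix belongs in elasticmq's Scala directive; `find_header` and the sample header list are hypothetical and only show why an exact-match lookup misses a header whose name has been re-cased by a proxy.

```ruby
# Hypothetical helper illustrating the proposed fix; this is a Ruby analogue,
# not the elasticmq implementation.
# `headers` is assumed to be an Array of [name, value] pairs as received.
def find_header(headers, name)
  # String#casecmp? compares while ignoring case differences.
  pair = headers.find { |key, _value| key.casecmp?(name) }
  pair&.last
end

# Lowercased header names, such as a proxy might emit.
proxied = [['content-type', 'application/x-amz-json-1.0'],
           ['x-amz-target', 'AmazonSQS.ListQueues']]

# The case-insensitive lookup finds the header despite the different casing.
puts find_header(proxied, 'X-Amz-Target')

# An exact-match lookup on the same data does not.
puts proxied.to_h.key?('X-Amz-Target')
```

The second lookup fails purely because of casing, which matches the symptom in the stack trace: the header is on the wire, but the server reports that it could not find it.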

micossow commented 11 months ago

Thanks for reporting the issue. I'll try to look into this within a few days

rubnogueira commented 11 months ago

@btalbot thanks. I tested the fix and it works perfectly. I opened PR #900 with the fix and also built docker images.

micossow commented 11 months ago

It should be fixed in v1.5.1