openzipkin / zipkin-aws

Reporters and collectors for use in Amazon's cloud
Apache License 2.0

SQS Collector fails to delete corrupt messages #79

Closed: llinder closed this issue 6 years ago

llinder commented 6 years ago

When the SQS collector encounters a message that fails to deserialize, it should log the error and delete the offending message. If it doesn't, the bad message becomes visible in the queue again, gets redelivered, and keeps failing.

Stack trace for reference:

java.lang.RuntimeException: Cannot decode spans
    at zipkin.internal.Collector.doError(Collector.java:144) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.Collector.errorReading(Collector.java:119) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.Collector.errorReading(Collector.java:114) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.Collector.acceptSpans(Collector.java:59) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.V2Collector.acceptSpans(V2Collector.java:43) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.collector.Collector.acceptSpans(Collector.java:112) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.collector.sqs.SQSSpanProcessor.process(SQSSpanProcessor.java:109) [zipkin-collector-sqs-0.8.7.jar!/:na]
    at zipkin.collector.sqs.SQSSpanProcessor.run(SQSSpanProcessor.java:75) [zipkin-collector-sqs-0.8.7.jar!/:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_152]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_152]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_152]
Caused by: java.lang.IllegalArgumentException: Empty endpoint at $[3].remoteEndpoint reading List from json
    at zipkin2.internal.JsonCodec.exceptionReading(JsonCodec.java:229) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin2.internal.JsonCodec.readList(JsonCodec.java:142) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin2.codec.SpanBytesDecoder$1.decodeList(SpanBytesDecoder.java:38) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.V2Collector.decodeList(V2Collector.java:48) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.V2Collector.decodeList(V2Collector.java:29) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    at zipkin.internal.Collector.acceptSpans(Collector.java:57) [io.zipkin.java-zipkin-2.4.5.jar!/:na]
    ... 9 common frames omitted
Caused by: java.lang.IllegalArgumentException: Empty endpoint at $[3].remoteEndpoint
    at zipkin2.internal.V2SpanReader$1.fromJson(V2SpanReader.java:134) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin2.internal.V2SpanReader$1.fromJson(V2SpanReader.java:109) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin2.internal.V2SpanReader.fromJson(V2SpanReader.java:59) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin2.internal.V2SpanReader.fromJson(V2SpanReader.java:22) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    at zipkin2.internal.JsonCodec.readList(JsonCodec.java:138) ~[io.zipkin.zipkin2-zipkin-2.4.5.jar!/:na]
    ... 13 common frames omitted
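
To make the requested behavior concrete, here is a minimal sketch of "log and delete on decode failure". It is not the zipkin-aws implementation and does not use the AWS SDK: `CorruptMessageSketch`, `Message`, `InMemoryQueue`, `decode`, and `process` are all hypothetical names, and the in-memory queue only imitates the property that an undeleted message comes back on the next receive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: every name below is hypothetical, not the zipkin-aws or AWS SDK API.
class CorruptMessageSketch {
  static final Logger LOG = Logger.getLogger(CorruptMessageSketch.class.getName());

  /** Stand-in for a queue message: an id plus its raw body. */
  static final class Message {
    final String id;
    final String body;
    Message(String id, String body) { this.id = id; this.body = body; }
  }

  /** Stand-in for a queue: anything not deleted is returned again by receive(). */
  static final class InMemoryQueue {
    private final Deque<Message> messages = new ArrayDeque<>();
    void send(Message m) { messages.addLast(m); }
    List<Message> receive() { return new ArrayList<>(messages); }
    void delete(Message m) { messages.remove(m); }
    int size() { return messages.size(); }
  }

  /** Toy decoder (assumption): accepts only bodies that look like a JSON list. */
  static String decode(String body) {
    String trimmed = body.trim();
    if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
      throw new IllegalArgumentException("Cannot decode spans");
    }
    return trimmed;
  }

  /** Processes one batch and returns how many messages decoded successfully. */
  static int process(InMemoryQueue queue) {
    int accepted = 0;
    for (Message message : queue.receive()) {
      try {
        decode(message.body);
        accepted++;
      } catch (RuntimeException e) {
        // A body that cannot be decoded will never succeed on retry,
        // so log it and fall through to the delete below.
        LOG.log(Level.WARNING, "Dropping undecodable message " + message.id, e);
      }
      // Delete in both cases. A real collector would likely delete a good
      // message only after storage succeeds; this sketch stops at decoding.
      queue.delete(message);
    }
    return accepted;
  }

  public static void main(String[] args) {
    InMemoryQueue queue = new InMemoryQueue();
    queue.send(new Message("good", "[{\"traceId\":\"a\"}]"));
    queue.send(new Message("bad", "not json"));
    int accepted = process(queue);
    System.out.println("accepted=" + accepted + " remaining=" + queue.size());
  }
}
```

The point of the sketch is that the delete happens outside the `try` block, so a decode failure no longer leaves the message in the queue to be redelivered.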

codefromthecrypt commented 6 years ago

Seems like something we can handle in the JSON parser. We can also handle it more generally in the collector.

Want to take the collector side?

llinder commented 6 years ago

I'm already working on a test and a fix for the collector, and hope to have something done today. Fixing it in the JSON parser would be good as well, so any help there would be awesome :)

llinder commented 6 years ago

I believe this should take care of malformed JSON cases: https://github.com/openzipkin/zipkin-aws/pull/80

codefromthecrypt commented 6 years ago

See also https://github.com/openzipkin/zipkin/pull/1992. Thanks, @llinder.