Open holidaymike opened 3 years ago
This happens when we run the same functional test in quick succession. There appears to be a bug in the SQS receiver: when we call StopReceiving on the receiver, it may take a while to stop because it is blocked on this line: https://github.com/xmidt-org/ears/blob/main/pkg/plugins/sqs/receiver.go#L160. While it is blocked, it may still receive messages from the next test. When this happens, the approximateReceiveCount for a test message increases, and by the time the route from the new test gets the message, the count is already set to 1, causing the new receiver to drop the message with the error:
{"log.level":"error","op":"SQS.receiveWorker","workerNum":0,"time":1625675966,"message":"max retries reached for 6c2d4bc5-d6c1-418b-b023-0cd5a5bad423"}
Instead of receiving SQS messages using svc.ReceiveMessage(...), we should really use svc.ReceiveMessageWithContext(...) so that we can break out of the call immediately when the SQS receiver needs to stop.
When running the EARS functional tests, we sometimes find that test messages time out in EARS and are not delivered, failing with the "max retries reached" error shown above.
Need to investigate.