Closed by jgmize 5 years ago
Jul 19 18:12:19 ip-172-20-36-167 kubernetes.var.log.containers.bedrock-prod-web-4250502162-k8x03_: {"log":"BasketException: ('Connection aborted.', error(104, 'Connection reset by peer'))\n","stream":"stdout","time":"2017-07-20T01:12:18.496291567Z"}
This should only cause issues on pages that use basket, not the / URL that returned two 504s in a row to the NR synthetics monitor linked above.
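For anyone digging through similar entries, here is a minimal sketch of unpacking the container log line quoted above. It assumes the line is a syslog-style prefix followed by a single JSON object, which is what this one entry looks like; the names `LINE` and `record` are only illustrative.

```python
import json

# The log line quoted above, copied verbatim. Raw strings keep the literal
# backslash-n so that the JSON decoder, not Python, interprets it.
LINE = (
    r'''Jul 19 18:12:19 ip-172-20-36-167 '''
    r'''kubernetes.var.log.containers.bedrock-prod-web-4250502162-k8x03_: '''
    r'''{"log":"BasketException: ('Connection aborted.', '''
    r'''error(104, 'Connection reset by peer'))\n",'''
    r'''"stream":"stdout","time":"2017-07-20T01:12:18.496291567Z"}'''
)

# Assumption: everything from the first "{" onward is the JSON payload.
payload = LINE[LINE.index("{"):]
record = json.loads(payload)

print(record["time"])          # container timestamp (UTC)
print(record["stream"])        # stdout or stderr
print(record["log"].rstrip())  # the application message itself
```

The message is a reset connection from bedrock to basket, so it describes an outbound call failing rather than a response bedrock served.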
No 5XX ELB errors in the last 3 hours on the bedrock-prod ELB according to https://ap-northeast-1.console.aws.amazon.com/ec2/v2/home?region=ap-northeast-1#LoadBalancers
Text of NR error:
www.mozilla.org Monitor
Tokyo, JP Location
07/20/2017 00:52:34 UTC Time
Error log:
HTTPError: Server replied with a HTTP 504 response code
Text of NR alert closed notification:
www.mozilla.org Monitor
Tokyo, JP Location
07/20/2017 00:54:30 UTC (1 minute 55 seconds downtime)
The 1 minute 55 seconds of downtime above measures the time from the first 504 to the time the 3rd request from synthetics received the 301 response it expected on that URL.
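A rough sketch of how that downtime window falls out of the check results, assuming an alert opens after two consecutive failed checks and closes on the first expected response. The threshold, the helper names, and the timestamp of the second 504 are assumptions for illustration; only the first and last timestamps come from the notifications above.

```python
from datetime import datetime, timezone

EXPECTED_STATUS = 301  # the monitor expects a redirect on this URL
FAILURES_TO_ALERT = 2  # assumption: two consecutive failures open an alert


def downtime_windows(checks):
    """Return (first_failure, recovery) pairs for failure runs that alert.

    `checks` is a time-ordered list of (timestamp, status_code) tuples.
    """
    windows = []
    run_start = None  # timestamp of the first failure in the current run
    run_length = 0
    for timestamp, status in checks:
        if status != EXPECTED_STATUS:
            if run_length == 0:
                run_start = timestamp
            run_length += 1
        else:
            if run_length >= FAILURES_TO_ALERT:
                windows.append((run_start, timestamp))
            run_length = 0
    return windows


def utc(hour, minute, second):
    """Build a UTC timestamp on the day of the incident."""
    return datetime(2017, 7, 20, hour, minute, second, tzinfo=timezone.utc)


checks = [
    (utc(0, 52, 34), 504),  # first 504, from the alert text
    (utc(0, 53, 32), 504),  # hypothetical time for the second 504
    (utc(0, 54, 30), 301),  # third request, from the closed notification
]

for start, end in downtime_windows(checks):
    print(f"down from {start:%H:%M:%S} to {end:%H:%M:%S}: {end - start}")
```

Subtracting the two displayed timestamps gives 1 minute 56 seconds, one second more than the notification reports, presumably because the displayed times are rounded.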
No 504 error codes were returned by bedrock itself in at least the past 3 days according to https://papertrailapp.com/groups/4220732/events?q=bedrock%20%22%20504%20%22&focus=824486437775958033
https://www.cloudflare.com/a/analytics/mozilla.org/status_codes shows 20 504 errors from Osaka, Japan in the past hour. Looks like 2 of those 20 were seen by the NR synthetics monitor in a row, which triggered the alert.
https://papertrailapp.com/groups/4220732/events?q=error
https://synthetics.newrelic.com/accounts/1299394/monitors/81a0480c-5c13-4539-8063-a38472f3d2d2/results/4ddedfbe-a7b2-4def-82d9-7d0ba789a7dc?via=email&view=timeline