mozilla / addons

☂ Umbrella repository for Mozilla Addons ✨

alert us somehow if import_blocklist fails #7520

Closed eviljeff closed 4 years ago

eviljeff commented 4 years ago

from mozilla/addons#7506:

So if any part of this process fails, the errors should result in an alert so we know.

@muffinresearch did you have a type of alert in mind? And is it just the fetch from kinto failing, or something else too? E.g. is there a certain number of block removals that should trigger it?

muffinresearch commented 4 years ago

I was initially thinking of at least ensuring that, if the cronjob errors because the import blew up, the resulting email is sent to a mailing list that is monitored.

The main thing we'd like to guard against is that the blocklist stops updating without us noticing.
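A minimal sketch of that guard, assuming we can read the timestamp of the last successful import from somewhere. The function name `blocklist_is_stale` and the seven-hour threshold are hypothetical, chosen only for illustration; they are not taken from the addons-server code.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: a bit longer than the expected gap between imports.
MAX_AGE = timedelta(hours=7)


def blocklist_is_stale(last_success, now, max_age=MAX_AGE):
    """Return True if no successful import was seen within max_age.

    last_success is the time of the last successful import, or None if
    there has never been one. Both datetimes must be timezone-aware.
    """
    if last_success is None:
        # Never imported at all counts as stale.
        return True
    return now - last_success > max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(blocklist_is_stale(now - timedelta(hours=8), now))
```

Checking for the absence of a recent success, rather than for a specific error, also covers the case where the job silently stops running.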

eviljeff commented 4 years ago

Grafana dashboard created: https://earthangel-b40313e5.influxcloud.net/d/IWZpIQgMk/blocklist

We only have metrics from stage currently, which makes testing everything a little difficult, so I'm going to have to duplicate the stage charts once the code is on prod.

eviljeff commented 4 years ago

QA: run through the same tests as for https://github.com/mozilla/addons/issues/7506 - we'd like to see all the metrics being recorded on the dashboard (if you can't access Grafana I can check that)

eviljeff commented 4 years ago

oh, and we can test the alert itself by disabling the blocklist_auto_import waffle switch during the window where it would normally run (05:35 UTC, then +6, +12, and +18 hours) - I'll get an email.
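To pick a time to flip the switch, the run windows can be worked out as below. This is a sketch that reads the schedule as 05:35 UTC plus a run every six hours after it; the helper names `runs_for_day` and `next_run` are hypothetical and not part of addons-server.

```python
from datetime import datetime, time, timedelta, timezone

FIRST_RUN = time(5, 35)        # 05:35 UTC, as stated in the comment above
INTERVAL = timedelta(hours=6)  # assumed spacing between runs


def runs_for_day(day):
    """Return the four scheduled run times (UTC) for the given date."""
    start = datetime.combine(day, FIRST_RUN, tzinfo=timezone.utc)
    return [start + i * INTERVAL for i in range(4)]


def next_run(now):
    """Return the first scheduled run strictly after now."""
    now = now.astimezone(timezone.utc)
    for candidate in runs_for_day(now.date()):
        if candidate > now:
            return candidate
    # Past the last run of the day: the next one is tomorrow's first run.
    return runs_for_day(now.date() + timedelta(days=1))[0]


if __name__ == "__main__":
    print(next_run(datetime.now(timezone.utc)))
```

Under this reading the runs fall at 05:35, 11:35, 17:35, and 23:35 UTC.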

AlexandraMoga commented 4 years ago

So, I've prepared some data for today, but the blocklist_auto_import waffle switch was enabled all the time. The data added in kinto was correctly imported in AMO. @eviljeff I don't have access to Grafana to check what's there.

I can try again with another data set, this time making sure that the blocklist_auto_import is disabled.

eviljeff commented 4 years ago

@AlexandraMoga I saw, thanks - they showed up in grafana.

> I can try again with another data set, this time making sure that the blocklist_auto_import is disabled.

it should trigger even if there aren't any further changes - just turning off the waffle is enough. I'll ask ops and verify that part myself.
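One way to see why the switch alone is enough, assuming the alert fires when no metric arrives (an assumption; the thread does not spell out the alert rule). In this sketch `run_import`, `fetch_changes`, and `record_heartbeat` are hypothetical stand-ins, not the real addons-server functions.

```python
def run_import(switch_active, fetch_changes, record_heartbeat):
    """Run one import cycle; return True if it ran, False if skipped.

    switch_active stands in for the blocklist_auto_import waffle switch.
    record_heartbeat stands in for whatever sends the metric.
    """
    if not switch_active:
        # Skipped run: nothing is recorded, so a "no data" alert can fire.
        return False
    changes = fetch_changes()
    # A heartbeat is recorded even when there are zero changes.
    record_heartbeat(len(changes))
    return True


if __name__ == "__main__":
    beats = []
    run_import(True, lambda: [], beats.append)
    run_import(False, lambda: ["block"], beats.append)
    print(beats)
```

With the switch on, a run with no changes still records a data point; with the switch off, nothing is recorded regardless of what is waiting in kinto.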

eviljeff commented 4 years ago

I received an alert email. It's pretty ugly, but it works.