department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html

Discovery: warnings on increase in sentry errors #361

Closed kfrz closed 5 years ago

kfrz commented 5 years ago

Background

(copied from vets.gov-team #17566)

@wyattwalter commented on Mon Mar 25 2019

On Friday, error reports to Sentry spiked significantly shortly after a deploy to vets-website. The errors didn't cause any user-facing issues in this situation, but the spike went undetected; the only sign was that Sentry became overloaded and slow.

This type of situation likely warrants some sort of notification. We should explore a good route for this. I can think of a few angles:

1) Use metrics from the revproxy or ELB to alert via Prometheus
2) Whatever alerting capability might exist in Sentry
3) Have the deployment tooling check for this scenario in the few minutes after a deployment is considered complete.
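As a rough illustration of option 3, a post-deploy check could compare the per-minute error count from a window before the deploy with the first few minutes after it. This is only a sketch: `check_error_spike`, `SpikeCheck`, and both thresholds are hypothetical names and values, and where the counts come from (Sentry, the revproxy, or the ELB) is left open.

```python
from dataclasses import dataclass


@dataclass
class SpikeCheck:
    baseline_per_min: float
    post_deploy_per_min: float
    spiked: bool


def check_error_spike(baseline_counts, post_deploy_counts,
                      ratio_threshold=3.0, min_events_per_min=10.0):
    """Compare per-minute error counts before and after a deploy.

    Both arguments are lists of per-minute error counts. The thresholds
    are illustrative assumptions, not tuned values.
    """
    if not baseline_counts or not post_deploy_counts:
        raise ValueError("both windows need at least one sample")

    baseline = sum(baseline_counts) / len(baseline_counts)
    post = sum(post_deploy_counts) / len(post_deploy_counts)

    # Require an absolute floor as well as a relative jump, so a move
    # from 0.1 to 0.5 events per minute does not trigger a notification.
    spiked = post >= min_events_per_min and post > baseline * ratio_threshold
    return SpikeCheck(baseline, post, spiked)
```

The deployment tooling could call this a few minutes after a deploy completes and send a notification when `spiked` is true. Requiring both a ratio and an absolute floor is one way to keep quiet periods from producing noisy alerts.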


@wyattwalter commented on Tue Apr 02 2019

We could also alert on data from Sentry itself. The revproxy metrics only include js-report and csp-report information. There's a community exporter that may help: https://github.com/snakecharmer/sentry_exporter


@kfrz commented on Wed May 08 2019

https://github.com/department-of-veterans-affairs/vets.gov-team/issues/18052


@kfrz commented on Fri May 17 2019

Next steps:

i.e.: Change

if slack:
    SENTRY_OPTIONS['slack.client-id'] = env('SLACK_CLIENT_ID')
    SENTRY_OPTIONS['slack.client-secret'] = env('SLACK_CLIENT_SECRET')
    SENTRY_OPTIONS['slack.verification-token'] = env('SLACK_VERIFICATION_TOKEN') or '' 

to look something like:

SLACK_CLIENT_ID = `{{ sentry_slack_client_id }}`
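For illustration, here is one way the Slack settings could be collected once the values are available to the Sentry process. It is a sketch under assumptions: it reads from environment variables via `os.environ` rather than the `env()` helper used in the snippet above, and `slack_options` is a hypothetical helper, not part of Sentry.

```python
import os


def slack_options(environ=None):
    """Return the Slack option keys, or {} if credentials are missing.

    Hypothetical helper: it reads the same variable names as the snippet
    above and skips the integration when either credential is unset.
    """
    environ = os.environ if environ is None else environ

    client_id = environ.get("SLACK_CLIENT_ID")
    client_secret = environ.get("SLACK_CLIENT_SECRET")
    if not client_id or not client_secret:
        return {}

    return {
        "slack.client-id": client_id,
        "slack.client-secret": client_secret,
        # Fall back to an empty string, as the original snippet does.
        "slack.verification-token": environ.get("SLACK_VERIFICATION_TOKEN", ""),
    }
```

In the Sentry config this could be applied with `SENTRY_OPTIONS.update(slack_options())`, which leaves the options untouched when the credentials have not been provisioned.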

@kfrz commented on Fri May 17 2019

@wyattwalter We will likely need DevOps assistance to get the setting into Credstash and then authorize the workspace. cc: @annaswims


@kfrz commented on Mon May 20 2019

From @annaswims via Slack:

The Sentry JS client makes calls directly to the server, so the server must be exposed to the web somehow. I think we just need to learn about the Sentry ELB referenced here: https://github.com/department-of-veterans-affairs/devops/blob/5033e0750c15062883280a93c9ce08e553166a20/ansible/deployment/config/sentry/README.md#application-support


@annaswims commented on Wed May 22 2019

We're planning on adding a Slack integration as part of https://github.com/department-of-veterans-affairs/vets.gov-team/issues/17984

There's also a PagerDuty integration that initially looked promising, but https://github.com/getsentry/sentry-plugins/pull/469 leads me to believe that the integration won't work until a new release of Sentry happens. That could be a long wait, because they've announced that the most recent release will be the last before significant dependency changes.

alexpappasoddball commented 5 years ago

Closing because this is the discovery for work already in progress: https://app.zenhub.com/workspaces/vsp-5cedc9cce6e3335dc5a49fc4/issues/department-of-veterans-affairs/va.gov-team/304