prometheus-community / jiralert

JIRA integration for Prometheus Alertmanager
Apache License 2.0

jiralert returns HTTP 500 when jira user does not have access to project #53

Open nvtkaszpir opened 4 years ago

nvtkaszpir commented 4 years ago

It looks like jiralert returns HTTP 500 when it connects to a remote Jira instance with valid credentials but the given user does not have access to the given Jira project. It also helps to enable the JSON log format to catch the error properly, because logfmt sometimes produces error messages that are not very useful.

Example error:

err: JIRA request https://REDACTED.atlassian.net/rest/api/2/search?jql=project%3D%22XX%22+and+labels%3D%22ALERT%7Balertname%3D%5C%22KubeJobFailed%5C%22%2Ccluster_name%3D%5C%22REDACTED...+order+by+resolutiondate+desc&startAt=0&maxResults=2&expand=&fields=summary,status,resolution,resolutiondate&validateQuery= returned status 403 Forbidden, body ""
msg: error handling request
statusCode: 500
statusText: Internal Server Error

I think jiralert could pass the 403 error from Jira back to the caller?

nvtkaszpir commented 4 years ago

A similar issue was reported in #35.

free commented 4 years ago

You are correct: JIRAlert sets the retry flag to true if Jira's response code is 500 or 503, and to false otherwise. Then, if the retry flag is true, JIRAlert responds to Alertmanager with a 503 Service Unavailable; otherwise it responds with a 500 Internal Server Error.
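To make the current mapping concrete, here is a minimal Go sketch of the rule as described in this comment. It is not copied from the JIRAlert source; the function names `shouldRetry` and `responseStatus` are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// shouldRetry mirrors the rule described above: only a 500 or 503 from Jira
// is treated as retriable. The name is hypothetical, not JIRAlert's own.
func shouldRetry(jiraStatus int) bool {
	return jiraStatus == http.StatusInternalServerError ||
		jiraStatus == http.StatusServiceUnavailable
}

// responseStatus picks the status sent back to Alertmanager:
// 503 when the retry flag is set, 500 otherwise.
func responseStatus(retry bool) int {
	if retry {
		return http.StatusServiceUnavailable
	}
	return http.StatusInternalServerError
}

func main() {
	// 403 is the case from this issue; 500 and 503 are the retriable cases.
	for _, s := range []int{http.StatusForbidden, http.StatusInternalServerError, http.StatusServiceUnavailable} {
		fmt.Printf("Jira %d -> Alertmanager %d\n", s, responseStatus(shouldRetry(s)))
	}
}
```

Under this rule the 403 from the report ends up as a 500, which matches the log output in the original comment.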

I guess it could respond with the original status code, but would that make any difference WRT Alertmanager's behavior? Currently the intent is to simply have Alertmanager retry the webhook as long as we respond with a 503 (i.e. if Jira responded with 500 or 503) and not retry it otherwise.

But looking at the Alertmanager code, it looks like it will retry all 5xx codes for webhooks (and only those), which means the 500 we return for non-retriable errors would still be retried. So it may make sense to either pass through the original response code or simply respond with something other than 500 Internal Server Error if JIRAlert decides it is not a retriable error.
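One way the two options could be combined is sketched below in Go. This is an assumption about a possible fix, not the project's actual change: the helper name `statusForAlertmanager` is hypothetical, and using 400 Bad Request as the fallback for non-retriable errors is an arbitrary choice of a non-5xx code.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusForAlertmanager is a hypothetical helper, not JIRAlert's actual code.
// jiraStatus is the status Jira returned; retry is the existing retry flag.
func statusForAlertmanager(jiraStatus int, retry bool) int {
	if retry {
		// Retriable: keep answering 503 so Alertmanager tries again.
		return http.StatusServiceUnavailable
	}
	if jiraStatus >= 400 && jiraStatus < 500 {
		// Non-retriable client error (e.g. 403): pass it through unchanged.
		return jiraStatus
	}
	// Any other non-retriable failure: assumed fallback outside the 5xx
	// range, so Alertmanager does not retry it.
	return http.StatusBadRequest
}

func main() {
	fmt.Println(statusForAlertmanager(http.StatusForbidden, false))
	fmt.Println(statusForAlertmanager(http.StatusServiceUnavailable, true))
}
```

With a mapping like this, the missing-permission case would surface to Alertmanager as a 403 and, given the 5xx-only retry behavior noted above, would not be retried.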