cloudflare / complainer

Complainer's job is to send notifications to different services when tasks fail on Mesos cluster.
MIT License

JIRA Reporter #40

Closed andygrunwald closed 8 years ago

andygrunwald commented 8 years ago

I have been thinking about this for quite some time. I want to add a JIRA Reporter. Creating a JIRA ticket for a failing Mesos task is not a real challenge, and I am able to implement it.

But one doubt comes to mind: imagine you have a cron job that runs every hour (or every 30 minutes; the important part is "often"). It starts failing at 8pm on a Friday. The job keeps failing the whole weekend, and complainer creates a ticket for every job run. My idea is to create only one ticket per Mesos job ID / task identifier (not per task run). With this we can do something like this:

To know if there is already a ticket created, we need a storage (file, S3, in memory, whatever) or we use a kind of "tricky" way to mark a ticket as this and we use JIRA as a "storage". One idea is to assign tags such as complainer and mesos-task-id to a ticket and query by them. In-memory storage wouldn't be a big deal to implement, but when complainer gets rescheduled the data is gone and a ticket will be created twice. Maybe this is acceptable?

The reason for this ticket is not that I want you to start the implementation. I want to ask for your ideas on how this functionality can avoid flooding JIRA with tickets. Maybe you have a better idea here?

bobrik commented 8 years ago

> If there is a ticket that is still not done, skip the error / reporting

How about "add a comment with the info from the new failure"? This can get pretty messy with constantly failing tasks on Marathon, though.

> To know if there is already a ticket created, we need a storage (file, S3, in memory, whatever) or we use a kind of "tricky" way to mark a ticket as this and we use JIRA as a "storage".

Sentry collapses similar issues by the title. I think it makes sense to do the same with JIRA.

andygrunwald commented 8 years ago

> How about "add a comment with the info from the new failure"? This can get pretty messy with constantly failing tasks on Marathon, though.

Good idea. This can be made configurable by a label.
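One way to picture the label-driven choice between skipping and commenting, as a rough Python sketch. The label name `complainer_jira_on_duplicate` and its values are hypothetical, chosen only for this example. The default of skipping reflects the concern above about comments getting messy for constantly failing tasks.

```python
from typing import Mapping, Optional


def choose_action(open_ticket_key: Optional[str], labels: Mapping[str, str]) -> str:
    # No unresolved ticket for this task yet: create one.
    if open_ticket_key is None:
        return "create"
    # Hypothetical task label; anything other than "comment" means "skip".
    mode = labels.get("complainer_jira_on_duplicate", "skip")
    return "comment" if mode == "comment" else "skip"


print(choose_action(None, {}))
print(choose_action("OPS-1", {}))
print(choose_action("OPS-1", {"complainer_jira_on_duplicate": "comment"}))
```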

> Sentry collapses similar issues by the title. I think it makes sense to do the same with JIRA.

Nice idea as well. Can get tricky if the title is templatable. Will think about it.
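The templating concern can be illustrated with a small Python sketch. Python's `string.Template` stands in for whatever template engine the reporter would use, and the field names `name` and `id` are assumptions. A title built only from fields that stay the same across runs collapses cleanly; a title that includes a per-run value differs on every failure, so matching by title would never find the earlier ticket.

```python
from string import Template


def render_title(template: str, failure: dict) -> str:
    # Fill the title template with fields from one failure report.
    return Template(template).substitute(failure)


def placeholders(template: str) -> set:
    # Collect the $name / ${name} placeholders used by a template.
    names = set()
    for match in Template.pattern.finditer(template):
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return names


# Two runs of the same job: the name is stable, the id changes per run.
run1 = {"name": "hourly-report", "id": "hourly-report.1a2b"}
run2 = {"name": "hourly-report", "id": "hourly-report.3c4d"}

stable = "Task $name is failing"
per_run = "Task $id failed"

print(render_title(stable, run1) == render_title(stable, run2))
print(render_title(per_run, run1) == render_title(per_run, run2))
print(sorted(placeholders("Task $name ($id) failed")))
```

A reporter could use something like `placeholders` to warn when a title template references per-run fields, or match on a separate stable key instead of the rendered title.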