getsentry / sentry

Developer-first error tracking and performance monitoring
https://sentry.io

Sentry Alerts do not always create new JIRA issues #46241

Open InterstellarStella opened 1 year ago

InterstellarStella commented 1 year ago

Environment

SaaS (https://sentry.io/)

Version

No response

Link

No response

DSN

No response

Steps to Reproduce

  1. Install the JIRA cloud integration in Sentry SaaS
  2. Create a new alert, configure the Issue Link Settings, and click on "Send Test Notification"

Expected Result

A new JIRA test issue is created every time

Actual Result

Test issues are not being created consistently.

Things that have been checked:

Ticket on Zendesk

getsantry[bot] commented 1 year ago

Assigning to @getsentry/support for routing, due by (yyz). ⏲️

getsantry[bot] commented 1 year ago

Routing to @getsentry/ecosystem for triage, due by (sfo). ⏲️

huelsmc commented 1 year ago

We have the exact same behavior with Sentry and Jira Cloud. For some projects it works, but for others Sentry reports that the ticket has been created and nothing can be found in Jira.

getsantry[bot] commented 1 year ago

Routing to @getsentry/product-owners-settings-integrations for triage ⏲️

dneighbors commented 7 months ago

We have been experiencing this since at least Sept 7th, 2023 and have been watching this and waiting.

theboz86 commented 7 months ago

My team is also having the same issue: only about 70% of our Sentry Alerts create a ticket in Jira. I reached out to support 6 months ago and have yet to hear back. This is spread throughout my company.

malwilley commented 7 months ago

@theboz86 thanks for chiming in - I'm trying to reproduce this but have been unable to. Could you clarify a couple things to help with debugging this issue?

MrLuna12 commented 4 months ago

My team has been experiencing this issue for the past 8 months.

dneighbors commented 4 months ago

@malwilley here is an example of a rule that exists, is live, and creates JIRA tickets 70% of the time. Included are two issues that fired within the last 24 hours that did NOT create JIRA tickets from the rule.

alert rule id: 15093502
issue id: 5328199593 (in last 24hr, no JIRA ticket creation)
issue id: 5327281704 (in last 24hr, no JIRA ticket creation)

sentaur-athena commented 4 months ago

Thank you for reporting and providing examples. I started investigating and will provide an update when I have one.

sentaur-athena commented 4 months ago

@MrLuna12 (and everyone else experiencing this) there could be different reasons for a Jira ticket not getting created, depending on how the rule is set up. It would be great if you could provide me with the rule that is failing to create issues.

Meanwhile, I'll work off the one example @dneighbors provided.

sentaur-athena commented 4 months ago

@dneighbors what I can see in the logs is that your ticket creation fails on the Jira side with the error: {'customfield_10102': ["Team id 'JsonData{data={id=S*****}}' is not valid."]}. I added stars to the name for privacy, but is it possible that this is not a valid team in your Jira settings?
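For anyone reading an error like the one quoted above, here is a minimal sketch of how such a field-error mapping could be flattened into readable lines. The helper name `summarize_field_errors` is hypothetical and not part of Sentry or Jira, and the only payload shape assumed is the one shown in this comment (a field key mapped to a list of messages, or possibly a single string).

```python
from typing import Dict, List, Union


def summarize_field_errors(errors: Dict[str, Union[str, List[str]]]) -> List[str]:
    """Flatten a field-error mapping into 'field: message' lines (hypothetical helper)."""
    lines = []
    for field, messages in sorted(errors.items()):
        # Messages may arrive as a single string or as a list of strings.
        if isinstance(messages, str):
            messages = [messages]
        for message in messages:
            lines.append(f"{field}: {message}")
    return lines


# The payload quoted in this comment, with the team id still masked.
payload = {"customfield_10102": ["Team id 'JsonData{data={id=S*****}}' is not valid."]}
for line in summarize_field_errors(payload):
    print(line)
```

Here the failing field is the custom field key `customfield_10102`, which is why the error is hard to connect to the "Team" field shown in the rule configuration.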

dneighbors commented 4 months ago

@sentaur-athena When I look at that configuration of the rule it isn't passing a team. Where is it pulling the team from in Sentry? Additionally, it is odd that the rule works 70% of the time. Is there a way in Sentry to view this error?

dneighbors commented 4 months ago

> @sentaur-athena When I look at that configuration of the rule it isn't passing a team. Where is it pulling the team from in Sentry? Additionally, it is odd that the rule works 70% of the time. Is there a way in Sentry to view this error?

Okay, we found that an automation on the Jira side was setting a team incorrectly. I'm still curious why it is intermittent and would love to know if there is a way to see logs on these things if we have additional issues, but this may resolve it for us.

sentaur-athena commented 4 months ago

> @sentaur-athena When I look at that configuration of the rule it isn't passing a team.

That's something to be fixed on our side. It doesn't show any teams because there are no valid teams. It would be more useful if we showed you an error instead. I created an internal ticket for improving this experience.

> I'm still curious why it is intermittent

That I didn't figure out either. Please message again if this fix doesn't lead to 100% creation. This was the only failure I saw in the logs today.

> would love to know if there is a way to see logs on these things

Not right now. But that's also a clear next step for us, so I added it to our upcoming improvements.

tpaulshippy commented 3 months ago

@sentaur-athena This is still not working for us. Is there any way for you to check the logs for us again?

sparklepop commented 2 months ago

@sentaur-athena @malwilley We are still experiencing this issue. Is there any way to share the error in your logs with us to enable further troubleshooting? As tests we have tried:

  1. Disabling on-create automations on the Jira side
  2. Passing the Jira team GUID in the Sentry alert Team field
  3. Removing any content in the Sentry alert Team field
  4. Adding Team and customfield_10102 to "Ignored Fields" in the Sentry Jira integration to hide that data

leedongwei commented 2 months ago

Hi folks, are y'all from the same org? Can you share your Project DSN so we can identify the affected organization + project? Thanks

tpaulshippy commented 2 months ago

Yes we are all from the same org. Here is the DSN: https://41911b9a139d4652bf09f7f77e9b013b@o1169445.ingest.us.sentry.io/4505229541441536

markng commented 1 month ago

The labels here seem to indicate that this has been deprioritised within your organisation. Will you let me know if that's the case? This is causing significant issues for us and our workflow, and is a feature that is supposed to work.

We are a paying customer, and this is beginning to create a sentiment to move to an alternate product. I understand that we're saying that JIRA is returning errors for you, but if so, we need to be able to understand what they are (and why they are inconsistent) so we can raise relevant support cases with Atlassian if that is the case.

markng commented 1 month ago

In addition to the other things we have mentioned trying here, we have stood up a local development copy of the open source version of your product and have been unable to replicate the issue.

leedongwei commented 1 month ago

Hi, we've scheduled engineering time to address this. Don't read too much into the labels changing, I'm just trying to stop the internal bot from pinging me non-stop about it.

markng commented 1 month ago

Thanks for the update. Honestly, even having your integrations for our org report into a Sentry project (I assume there are exceptions being raised here) would be perfectly good for our needs.

markng commented 1 month ago

And do you have any time estimates? I believe we might even be prepared to devote some of our own developer time toward this, because the next steps on our side are either that we write something to crawl your API or that we start moving to something else, both of which are sub-optimal for us and would likely cost us more time than helping out. (I assume the open source version of Sentry is close enough to your hosted version?)

leedongwei commented 1 month ago

I have an engineer scheduled to investigate + fix Jira bugs starting next week. This is the 2nd item on their priority list. IMO end of the month seems reasonable.

On the integration reporting errors into a Sentry project, we have floated similar ideas like an audit log of requests/responses between Sentry and your integration, but haven't fully fleshed it out yet.

markng commented 1 month ago

> I have an engineer scheduled to investigate + fix Jira bugs starting next week. This is the 2nd item on their priority list. IMO end of the month seems reasonable.

Thanks. Again, we're invested in getting this fixed (I've failed a couple sprints now!) so if we can either help you test this or contribute to getting it done, please let me know.

markng commented 1 month ago

Checking in to see if we're still in a good spot in terms of schedules, and whether we can be doing anything to help with the process.

GabeVillalobos commented 1 month ago

@markng I started triaging the problem this week and am able to reproduce it. It's likely this gets fixed sometime later this week or early next week, but I'll provide better estimates on this when I have more information.

markng commented 1 month ago

> @markng I started triaging the problem this week and am able to reproduce it. It's likely this gets fixed sometime later this week or early next week, but I'll provide better estimates on this when I have more information.

Fantastic! I appreciate the update.

tpaulshippy commented 1 month ago

> @markng I started triaging the problem this week and am able to reproduce it. It's likely this gets fixed sometime later this week or early next week, but I'll provide better estimates on this when I have more information.

Hi @GabeVillalobos - thanks for your work on this. Do you have any better estimates on this effort at this point?

GabeVillalobos commented 1 month ago

Sorry for the delay. So I've been looking into the following problems:

  1. We don't seem to propagate API level errors to the UI when issuing test alerts. This makes it difficult to diagnose when and why an alert isn't firing. [Fix in Progress]
  2. Because many of the issue fields in our alert configuration are dynamically provided by Jira, adding validation for them can be difficult. Unfortunately, this makes it very easy to set up invalid alert rules.

I'm hoping to have a patch ready for the 1st problem by end of week, although I can't make any guarantees on the granularity of the error messages we can propagate as these issue creation requests can fail for a variety of reasons, some of which may include sensitive information. As for the 2nd problem, I'll be looking into what it takes to add stricter validation to some of these fields (such as Team for example).
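To illustrate the concern about sensitive information in propagated errors, here is a rough sketch of masking values before a field error is shown in the UI. This is not the patch described above. The function name `redact_field_errors`, the quoting heuristic, and the sample id `S12345` are all made up for illustration.

```python
import re
from typing import Dict, List

# Assumption for illustration: user-supplied values appear inside single
# quotes, as in the "Team id '...' is not valid." message earlier in this
# thread. Messages containing apostrophes would need a stricter rule.
_QUOTED_VALUE = re.compile(r"'[^']*'")


def redact_field_errors(errors: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of the field errors with quoted values masked."""
    return {
        field: [_QUOTED_VALUE.sub("'[redacted]'", message) for message in messages]
        for field, messages in errors.items()
    }


# Made-up example value; the real id was masked in the thread.
raw = {"customfield_10102": ["Team id 'JsonData{data={id=S12345}}' is not valid."]}
print(redact_field_errors(raw))
```

The field key and the general reason for the failure survive, which is usually enough to tell which part of the alert rule to look at.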

GabeVillalobos commented 1 day ago

It's taken me quite a bit of time to fully get up to speed on our Jira integration so I appreciate your patience and apologize for the delays. Here are some updates on this:

  1. We decided to refactor a decent portion of our Jira integration. Quite a few of the available custom fields users can configure have been broken for a while, which contributes to these alerting failures.
  2. We've added a data model to add more type safety to our integration code with the intention of making maintenance easier in the future.
  3. We are currently testing our test alert error propagation changes from PR #76369 and will release these when we have more confidence that they provide helpful feedback.

We're hoping these changes fix the vast majority of your issues, or at the very least, provide some feedback as to why your alerts are not working correctly.
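As a rough idea of what a typed data model for dynamically provided fields can look like, here is a minimal sketch. It is not Sentry's actual model: `FieldType`, `FieldSchema`, and the small set of types are hypothetical, and the only detail taken from this thread is the `customfield_10102` Team field.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List


class FieldType(Enum):
    STRING = "string"
    NUMBER = "number"
    ARRAY = "array"


# Python types accepted for each (hypothetical) field type.
_PYTHON_TYPES = {
    FieldType.STRING: str,
    FieldType.NUMBER: (int, float),
    FieldType.ARRAY: list,
}


@dataclass(frozen=True)
class FieldSchema:
    key: str
    name: str
    type: FieldType
    required: bool = False

    def validate(self, value: Any) -> List[str]:
        """Return a list of problems; an empty list means the value is acceptable."""
        if value is None or value == "" or value == []:
            if self.required:
                return [f"{self.name} ({self.key}) is required"]
            return []
        expected = _PYTHON_TYPES[self.type]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return [
                f"{self.name} ({self.key}) expects {self.type.value}, "
                f"got {type(value).__name__}"
            ]
        return []


team = FieldSchema(key="customfield_10102", name="Team", type=FieldType.STRING, required=True)
print(team.validate(None))
print(team.validate(42))
```

Checking an alert rule's configured values against a schema like this when the rule is saved would surface an invalid field then, instead of when an alert fires.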