getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License
1.79k stars 472 forks source link

Django URLs with multiple placeholders cut off after first placeholder #2392

Closed Abdkhan14 closed 8 months ago

Abdkhan14 commented 9 months ago

Environment

SaaS (https://sentry.io/)

Steps to Reproduce

  1. The transaction is associated with loading a trace view.
  2. Here's the parent pageload: link
  3. Correct transaction summary: link
  4. Renamed transaction summary: link
  5. The change that introduced the bug: link

Expected Result

Transaction name is/api/0/organizations/{organization_slug}/events-trace/{trace_id}/

Actual Result

Transaction name is renamed from /api/0/organizations/{organization_slug}/events-trace/{trace_id}/ to /api/0/organizations/{organization_slug}/.

Product Area

Other

Link

No response

DSN

No response

Version

No response

smeubank commented 8 months ago

https://github.com/getsentry/sentry-python/discussions/2416

also noted here in this discussion

sentrivana commented 8 months ago

Relevant/might be fixed by: https://github.com/getsentry/sentry-python/pull/2325

sentrivana commented 8 months ago

TL;DR: My proposed course of action for now:


Some background:

The fundamental problem here is that we're trying to parse a non-regular language with regular expressions in the resolver. Any regex we come up with will never fully "work". There will always be counter examples of perfectly fine URL patterns that will not be parsed correctly.

To illustrate, consider these proposed candidates and URL patterns they won't work for (paste e.g. to https://regex101.com/ to play with these):

While this list is definitely not exhaustive, the point I want to make is that any regex we choose will have to come with some assumptions about what constitutes a "correct URL pattern", simply because regexes inherently can't solve this problem. In other words, any change we will make to the regex will introduce a regression, so we should approach any changes to this very carefully. (Or come up with a new approach to named group matching in the resolver.)

salomvary commented 8 months ago

👍 for the proposed course of action.

Would supporting "nice" transaction names for paths only, while re_paths would be exposed raw?

sentrivana commented 8 months ago

@salomvary The problem is that paths can also contain "hidden" regexes, see e.g. https://github.com/getsentry/sentry-python/issues/2446.