POSTing to relay gets blocked by content blockers (again)

etergen commented 3 years ago

Important Details

How are you running Sentry? Saas (sentry.io)

Description

Reporting to sentry.io gets blocked by content blockers. As suggested in #23054, and as previously solved in #24947, I deployed relay, which solved the issue temporarily.

This change (commit) to EasyList renders the previous fix useless.

Steps to Reproduce

See the referenced commit

What you expected to happen

Allow the POST to reach Sentry

I propose to construct the URL in a generic way, e.g.

use key instead of sentry_key
use v instead of sentry_version
get rid of /store/ maybe?

etergen commented 3 years ago

And as @BYK did last time, I'm simply pinging @getsentry/owners-ingest for triage. (special ping to @untitaker and @jan-auer)

etergen commented 3 years ago

pinging @getsentry/owners-ingest for triage

That didn't work as expected... lesson learned 😄

jan-auer commented 3 years ago

That didn't work as expected... lesson learned 😄

Nice try, but that one's off limits :) Let me do you the honors and route this to @getsentry/owners-ingest for you.

This change (commit) to EasyList renders the previous fix useless.

This is particularly challenging, since this contains required information and our JavaScript SDKs send this as query parameter to prevent CORS preflights. Apart from playing a cat-and-mouse-game with EasyList/EasyPrivacy, the best immediate option I can see is to open an issue on their tracker.

BYK commented 3 years ago

@jan-auer I think it it reasonable to rename the GET variables?

jan-auer commented 3 years ago

@jan-auer I think it it reasonable to rename the GET variables?

That's what I mean with a cat-and-mouse-game. It's a dirty workaround, until someone adds new rules. In the end, it doesn't solve the underlying issue because some people are considering Sentry a privacy concern.

BYK commented 3 years ago

@jan-auer agree with your point but this change will make it quite harder for them to target Sentry as we stop screaming "THIS IS SENTRYYY!!" in all requests :D

BYK commented 3 years ago

@etergen btw, I think you can be smart with your LB/router and remap these variables. This means you'd also need to tweak the SDKs but nobody said this was a clean workaround :)

etergen commented 3 years ago

Thanks for the advice, and while I'm sure I'll fix it for my current scenario, I strongly believe we should provide a solution to everyone relying on Sentry, otherwise what I'll do will only serve my use case and, sadly, implementing an in-house solution, or another system would be more attractive/effective than using Sentry.

I must insist on considering one of the below alternatives:

Get rid of the word sentry from the request params (maybe use what I proposed in my first post?)
Move these 2 params to the request headers
Move these 2 params to the request payload

As @BYK correctly stated, today you are literally screaming at the world that the request is about sentry, making it easier to block. Although you won't be able to get rid of blocks on the sentry.io domain, you have everything at your disposal to avoid playing the cat-and-mouse game for those of us using the relay option.

To sum up, ideally we should be able to issue requests to a relay server in the form of <protocol>://<server>/api/<key> (e.g. https://example.com/api/99999999) with the DSN and sentry version inside the payload (or the headers if you prefer).

What do you think?

nolanshaw800 commented 3 years ago

Hello! I would like to add my input that being blocked by ad blockers is an absolute deal-breaker with Sentry.

My company relies upon the data Sentry provides and if it continues to be being blocked by the default ublock/mainstream adblocker settings then we will need to search for another product or roll our own.

The main HTTP GET variable which easylist is using to track sentry "sentry_key" should be and/or user configurable.

Thank you.

BYK commented 3 years ago

Thanks a lot for the many and good suggestions @etergen! I personally feel like there's some quick wins here and am discussing with @jan-auer and the team but at the end it will be a collaborative decision 🙂

Thanks for adding your perspective and experience @nolanshaw800, definitely helpful to prioritize the issue.

mitsuhiko commented 3 years ago

Unfortunately changing the authorization scheme is not a straightforward. If we change it in all SDKs now it breaks every non sentry.io installation out there. It would have to become an opt-in mechanism or do some auto detection based on if it sends to sentry.io or a custom installation.

BYK commented 3 years ago

@mitsuhiko I feel like we can easily change and map server-side variable names as a start and then add a manual opt-in flag to SDKs while we figure out a good auto-negotiation way. We may use the options pre-flight requests.

mitsuhiko commented 3 years ago

We're going to discuss this in the ingest team. Going to report back with something if we have a plan.

nolanshaw800 commented 3 years ago

It seems to me this needs to be treated like any API versioning and it wouldn't break anything if done properly.

You have authentication handshake v1 (sentry_key/etc) which will continued to be accepted and you have authentication handshake v2 which just allows optional configuration of the sentry_key query, perhaps on a project level.

mitsuhiko commented 3 years ago

The problem is not that versioning is hard, but that it's a thing users can easily get wrong. The likely outcome would be to have an sdk option to flip the transport format and you need to explicitly switch over to another one. But we're going to look into this. Generally speaking I'm not convinced that a simple change is not going to put us back into the same situation in a week. Easylist privacy keeps blocking even the most generic URLs. Some people had to rename their APIs away from /events since that is blocked.

Spice-King commented 3 years ago

Kinda sad to see this end up blocked a few weeks after I found Sentry, suggested it to my boss and got the green light to set aside resources for it. Domain name was easy to work around, but the sentry_key bit is kinda a hard blocker.

We are kinda in a somewhat unique situation where we have a number of internal services that we wanted reporting for, but also use AD policies to force install and enable u-Block Origin to cut down on IT work caused by people clicking on malicious ads.

If they start to block API paths, much like /events in the past, it might be an idea to fetch a JSON object with route mappings from the server from a generic endpoint say like /api/v2/. Something so generic that it would break other websites to block.

intgr commented 3 years ago

I'm a happy user of both Sentry and UBlock and I'd very much like to see Sentry crash reporting be re-enabled by ad blockers.

But to be fair, Sentry has became a tracker -- it ventured into the dark side by creating features like autoSessionTracking and enabling them by default (!!). Do most developers using Sentry even know or care about session tracking? I know I don't, but it was enabled behind my back simply because I unwittingly updated the Sentry SDK without reading release notes. 😲

This is exactly the sort of gratuitous privacy violation that EasyList intends to block, it's clear as day that the Sentry JavaScript SDK in its default configuration falls under the EasyPrivacy policy.

Instead of behaving like EvilCorp, playing the cat-and-mouse game, setting up proxies etc, I think the only reasonable thing to do is to show some goodwill: open a dialog with EasyList. Communicate clearly what sort of data is collected, and ask under what conditions they are willing to lift the block. Mayhaps the session-tracking APIs can be separated from crash reporting APIs. Maybe Sentry could detect the blocking and anonymise more information in the report. etc.

joshkel commented 3 years ago

@intgr, it looks like the autoSessionTracking feature was enabled by default as part of a major (breaking) release, was documented in the changelog, and is part of the health feature. I'm not personally using it, but I can see how it's useful. Tracking the existence of a healthy session seems different than tracking users and behavior and seems different than going to the dark side.

Sentry tried to open a dialog with EasyList (see comments from @mitsuhiko and @dcramer on #easylist/6963), but it didn't go anywhere.

intgr commented 3 years ago

Sentry tried to open a dialog with EasyList (see comments from @mitsuhiko and @dcramer on #easylist/6963), but it didn't go anywhere.

That's... interesting. Maybe they didn't buy it because the communication wasn't entirely forthcoming about Sentry features?

The autoSessionTracking feature was enabled by default on 2021-01-19

Literally two days later, the CTO of Sentry tries to hand-wave Sentry as "captur[ing] code diagnostics" and says "we don't provide any kind of user 'tracking' functionality whatsoever."

One could argue about the definition of "tracking" and "code diagnostics", but the exchanges make it sound as if Sentry only collected information when there was a crash -- that is a half-truth at best and I'm sure the CTO knows it. The reality is that in the default configuration, every pageload submits session information to Sentry.

it looks like the autoSessionTracking feature was enabled by default as part of a major (breaking) release, was documented in the changelog

True, I accept the blame for not reading the changelog. :)

dcramer commented 3 years ago

@intgr The problem is these lists are maintained often by non-technical people, with nothing to gain from curating them correctly. The only negative outcome they have is "people will open tickets and complain". Unfortunately for us, and arguably the entire industry, that's going to remain true.

I will still stand by my original comments. Our session tracking is aggregate code diagnostics at the end of the day. It gives you information about the stability of an app, as well as the adoption rate (which is important to understand rollouts). It doesn't, however, let you understand user behaviors, or any of the other traditional functionality of an analytics product.

The actual outcome of most of these: if your page loads a 3rd party script, or sends data to a 3rd party service, its going to end up on a blocklist.

That might mean we provide a better mechanism for deploying https://github.com/getsentry/relay for end-users in the future to work around this.

sitompul commented 2 years ago

hi @dcramer, will deploying an nginx proxy instance help? If it does, do you have any nginx conf recommendation? Or do you have any workaround with relay?

dcramer commented 2 years ago

@sitompul we provide a proxy called Relay which would work around this: https://docs.sentry.io/product/relay/

reuben commented 1 year ago

Is this issue still valid? I was just looking into Relay precisely to get around default lists, but it seems it wouldn't help me at all. Our customers are relatively technical and so adblocker usage can be high, and getting vague reports about errors without being able to act on them feels like going back to the dark ages.

notf0und commented 1 year ago

As the OP mention, probably the best solution is to allow the sentry clients/servers to customize the name of the keys and/or allowing to be sent as part of the post request body and not as query.

The domain is not a problem since always you can use a proxy or not having sentry word on the domain when using on-premise.

With that the level of customization should be enough to avoid adblockers.

mitsuhiko commented 1 year ago

@notf0und the tunnel functionality allows for this: https://docs.sentry.io/platforms/javascript/troubleshooting/#using-the-tunnel-option

sandstrom commented 1 year ago

Problem

Our customers are relatively technical and so Adblocker usage can be high, and getting vague reports about errors without being able to act on them feels like going back to the dark ages.

We're in the same boat. Makes debugging JS errors hard.

Background

This is the line that's blocking Sentry (in one list, at least):

https://github.com/easylist/easylist/blob/2fdf00ce/easyprivacy/easyprivacy_general.txt#L347

I did some quick research, and it seems like blocking based on an HTTP body is still rare.

https://github.com/uBlockOrigin/uBlock-issues/issues/1357

Solution

Allow Managed Relay to accept the key in either the body (a key in the JSON payload), or as a query param, and in both cases with an arbitrary name (one we can choose ourselves).

Relay can then forward it to Sentry in a standardised format (same as today, i.e. relay would move the key to a standard location/name). That should allow JS error reports to be received, when using Sentry with Relay.

Next level of obfuscation would probably to also base64 encode the HTTP body, and have relay also decode it, before forwarding.

getsentry / sentry