uazo / cromite

Cromite a Bromite fork with ad blocking and privacy enhancements; take back your browser!
https://www.cromite.org/
GNU General Public License v3.0
3.23k stars 74 forks source link

AdblockPlus does not block fetch in service workers #254

Closed uazo closed 12 months ago

uazo commented 1 year ago

Preliminary checklist

Is your feature request related to privacy?

Yes

Is there a patch available for this feature somewhere?

I'm working on it

Describe the solution you would like

Adblockplus should also block requests made by service workers

Describe alternatives you have considered

Modify the build logic in FrameHierarchyBuilder allowing management even with RenderFrameHost equal to null

uazo commented 1 year ago

@mpawlowski-eyeo can I ask you for confirmation?

uazo commented 1 year ago

@mpawlowski-eyeo I found a simple way.

Basically in AdblockContentBrowserClient::WillCreateURLLoaderFactory it is necessary to exploit the request_initiator (const url::Origin&) which corresponds to the StorageKey of blink, which is nothing else but the origin of the service worker when URLLoaderFactoryType is kServiceWorkerSubResource code

That value I think can be used in ResourceClassificationRunnerImpl::CheckRequestFilterMatchImpl instead of frame_host_id to construct the caller hierarchy.

If this is true, I was wondering if this is something you (eyeo) can do or if I need to intervene, because the changes, although not impactful, would be many, since I think I would have to modify each callback to carry that value. I'll try it anyway and let you know.

uazo commented 1 year ago

unfortunately, it also appears that filtering on websockets (introduced by v116 in service workers) and webtransport are not supported.

test with:

the worst thing is that it seems that not even extensions are able to block them because support is missing in chromium.

PF4Public commented 1 year ago

the worst thing is that it seems that not even extensions are able to block them because support is missing in chromium.

Have you seen any plans by Chromium devs to implement this? Would it be cumbersome to implement and maintain it?

uazo commented 1 year ago

Have you seen any plans by Chromium devs to implement this?

no, the only reference I could find https://bugs.chromium.org/p/chromium/issues/detail?id=1243518 source: https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/chrome_content_browser_client.cc;l=6331;drc=ef67576ec128379c56c1ee33f2668c3aab615500;bpv=1;bpt=1

Would it be cumbersome to implement and maintain it?

I don't know the extensions code, never read it. but I added support for adblock plus (native) in cromite

uazo commented 1 year ago

@mpawlowski-eyeo

https://gist.github.com/uazo/f817bdb2daecb35e25e30158aec96bc8

you can find my work there. it would be nice if you could check the implementation, if you are interested. there is a lack of built-in testing. it would be nice to have it directly in your patch, consider that I had to modify the chromium source to allow blocking since it is not supported.

mpawlowski-eyeo commented 1 year ago

Hi @uazo!

We were aware that we don't block ads served from service workers for a while, but we weren't sure if any websites actually do it, so we didn't prioritize fixing this :) We had a workshop planned with our anti-circumvention team to figure out if this "attack vector" is used IRL.

I'll take a look into your patch, we'll talk it over with the team this week - normally we'd happily integrate it into our codebase but we'll have to evaluate the impact of modifying content_browser_client.h. This interface is implemented 44 times in the repo.

Thanks for the investigation, this is very helpful either way! :)

uazo commented 1 year ago

but we weren't sure if any websites actually do it,

yes, they do, that's the only reason I noticed it. it is not so much the ads but the tracking scripts (for example https://www.akamai.com/it/products/mpulse-real-user-monitoring)

We had a workshop planned with our anti-circumvention team to figure out if this "attack vector" is used IRL.

interesting, also try looking at https://github.com/uazo/cromite/issues/251 I have a patch in wip, but now I am doing the rebase to 117, I will bother you in the future if you want.

we'll have to evaluate the impact of modifying content_browser_client.h This interface is implemented 44 times in the repo.

sure, I guess

mpawlowski-eyeo commented 1 year ago

Do I understand correctly that you only needed to patch content_browser_client.h to fix the websockets-in-service-workers problem, and you can implement everything else without upstream changes?

Do you notice sites using websockets from service workers to serve ads or trackers?

uazo commented 1 year ago

Do I understand correctly that you only needed to patch content_browser_client.h to fix the websockets-in-service-workers problem, and you can implement everything else without upstream changes?

Exactly. is not really supported in chromium, not even for extensions it would seem. That is, there is no extension that is able to block those requests.

Do you notice sites using websockets from service workers to serve ads or trackers?

no, none. but it is not necessary to use service workers to exploit websockets. you can't see it with your scrape of the top 100 site? do you still do that?

mpawlowski-eyeo commented 1 year ago

We do have tests for blocking regular websockets (not from service workers), like the ones on https://abptestpages.org/en/filters/websocket - these pass even without your patch.

We also have data from our scraping that websockets are used to serve ads IRL. About 0.05 percent of all requests that our crawlers emit are websocket requests that we'd block. But our data doesn't show websocket connections initiated from service workers specifically, so we don't know how common that trick is.

The reason this particular change is tricky is because a lot of chromium-based browsers seem to have their own implementation of ContentBrowserClient. Modifying it could force those browsers to do more work when they integrate our code. It's not a definite showstopper, adding those parameters isn't a huge change, but it's something we want to discuss among ourselves.

The other changes (that only touch "our" code) should be much easier to review and integrate - I'll let you know how this goes :)

uazo commented 1 year ago

these pass even without your patch.

yes, you are right! that is what has not been implemented: Issue 1276303: WebSocket w/o a frame also should be captured by WebRequest.

Modifying it could force those browsers to do more work when they integrate our code.

sure, I understand. suggestion: make a new 'non-standard' virtual method that calls up the one if not implemented.

But our data doesn't show websocket connections initiated from service workers specifically, so we don't know how common that trick is.

how I would love to be able to access that data... :)