uBlockOrigin / uBlock-issues

This is the community-maintained issue tracker for uBlock Origin
https://github.com/gorhill/uBlock

Option to block any pixel-sized images #92

Closed emanruse closed 6 years ago

emanruse commented 6 years ago

It is possible to have a tracking pixel which is a 1st party resource (e.g. img.github.com) which can still internally redirect the info it collects to another party (e.g. analytics.sometracker.com). Internally means - not through XHR (which would be detectable on the frontend) but in the web application backend, which we cannot block. I have seen such pixels and the backend code which redirects to the Google Analytics HTTP API and other similar services. Currently I am unaware of any way to block such a pixel, as it would appear to uBO as a "legitimate" image.

Although it is not a panacea (it is possible to have such internal backend redirects with any image or even without an actual image/pixel) it may probably be a good idea to have an option which allows to block any image smaller than NxN pixels and/or which displays data: which is similar to well known empty pixels.

Thoughts?

uBlock-user commented 6 years ago

Internally means - not through XHR (which would be detectable on the frontend) but in the web application backend, which we cannot block.

It cannot send any info to third-party firms before collecting it from you -

The scenario will be like --

Browser(POST request) ---> img.github.com ----> analytics.example.com

For img.github.com to collect info from you, it will have to instruct the browser to send the info to itself via a POST request, and that POST can be blocked because it is visible through the webRequest API.

gorhill commented 6 years ago

Please provide an actual case which can be investigated, not a theoretical one.

gorhill commented 6 years ago

it may probably be a good idea to have an option which allows to block any image smaller than NxN pixels

The only way to know the size of an image is to actually request it from the server.

or which displays data: which is similar to well known empty pixels.

This makes no sense, data: images never result in a network request, they are already inlined in the HTML code.
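
As an illustration of the point about inlined images, here is a minimal sketch (not part of uBO) showing that the dimensions of a base64 `data:` GIF can be read straight from the page source, with no request involved. The function name `gif_size_from_data_uri` and the sample URI are assumptions made for this example.

```python
import base64
import struct
from typing import Optional, Tuple


def gif_size_from_data_uri(uri: str) -> Optional[Tuple[int, int]]:
    """Return (width, height) of a base64-encoded GIF data: URI, else None."""
    prefix = "data:image/gif;base64,"
    if not uri.startswith(prefix):
        return None
    raw = base64.b64decode(uri[len(prefix):])
    # A GIF starts with a 6-byte signature, followed by width and height
    # stored as little-endian unsigned 16-bit integers.
    if len(raw) < 10 or raw[:6] not in (b"GIF87a", b"GIF89a"):
        return None
    width, height = struct.unpack("<HH", raw[6:10])
    return width, height


# Sample value (assumption): a widely used 1x1 transparent GIF, fully inlined.
PIXEL = "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
print(gif_size_from_data_uri(PIXEL))
```

Since the bytes are already in the document, blocking such an image would not prevent any request to a server.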

Given the above, closing as invalid. Feel free to provide better information to make a sound case for what you want, along with real world examples of the occurrence of what you want to address.

emanruse commented 6 years ago

It cannot send any info to third-party firms before collecting it from you -

Of course. The whole point is not to collect it from me (hence the suggestion).

The scenario will be like -- Browser(POST request) ---> img.github.com ----> analytics.example.com

The images on pages are downloaded via GET. Then the server software (PHP, Python, whatever) has the info about the user (IP, user agent) and can send it to, for example, Google Analytics. All of that happens on the server, so you cannot block it via an extension. But if the pixel-sized element is blocked, there will be no GET request and no info will be sent.

The only way to know the size of an image is to actually request it from the server.

Unfortunately - yes. But if you use a HEAD request you can get the size of the image in bytes. So perhaps based on that it is possible to know if the image is a pixel or an actual image, i.e. the whole thing would come down to 'do not download things smaller than N bytes'. Also usually pixels are image/gif type and have non-caching HTTP headers, e.g. Cache-Control: private, no-store, no-cache, must-revalidate, pre-check=0, post-check=0, max-age=0. Also it is possible (though not mandatory) that the HTML DOM would specify <img src="..." width="1" height="1">, so based on that logic the src URL can be blocked.

Is it not possible based on that to block the GET request? Or would it require an overhead (e.g. sending HEAD requests first, then deciding)?

I understand that a HEAD request would still be logged on the web server but usually the functions which process the request (and forward it to the third party) are based on GET to avoid "fake visits".
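
The HEAD-based idea above could be sketched roughly as follows. This is only an illustration of the heuristic, not something uBO does: the function name `looks_like_tracking_pixel` and the 100-byte default threshold are assumptions, and the response headers are passed in as a plain mapping, so no request is made here.

```python
from typing import Mapping


def looks_like_tracking_pixel(headers: Mapping[str, str], max_bytes: int = 100) -> bool:
    """Guess from HEAD response headers whether a URL serves a tracking pixel."""
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v.strip().lower() for k, v in headers.items()}
    is_gif = h.get("content-type", "").split(";")[0].strip() == "image/gif"
    try:
        is_tiny = int(h.get("content-length", "")) <= max_bytes
    except ValueError:
        is_tiny = False  # header missing or malformed: size unknown
    cache = h.get("cache-control", "")
    uncacheable = "no-store" in cache or "no-cache" in cache
    # Assumed rule: a tiny body plus at least one of the other two signals.
    return is_tiny and (is_gif or uncacheable)
```

A server that omits `Content-Length` on HEAD responses would defeat this check, so the result is at best a hint.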

Feel free to provide better information to make a sound case for what you want, along with real world examples of the occurrence of what you want to address.

https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters

The above documentation explains how to make direct HTTP requests to Google Analytics without JavaScript. There are quite a few examples on the web demonstrating how to use it, and it can surely be used in the context of pixel tracking.
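
To illustrate the server-side forwarding described above, here is a minimal sketch of how a backend might build a Measurement Protocol v1 hit from the data it sees in the pixel request. The function name and the sample values are hypothetical; `v`, `tid`, `cid`, `t`, `dp`, `uip` and `ua` are parameters listed in the documentation linked above. Nothing is sent in this sketch.

```python
from urllib.parse import urlencode

# Measurement Protocol v1 collection endpoint; the payload below would be
# POSTed to it by the server, out of sight of the browser.
GA_ENDPOINT = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id, client_id, visitor_ip, user_agent, path):
    """Build the form-encoded body of a pageview hit (hypothetical helper)."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # tracking / property ID
        "cid": client_id,    # anonymous client ID chosen by the server
        "t": "pageview",     # hit type
        "dp": path,          # document path
        "uip": visitor_ip,   # IP override: the visitor's address
        "ua": user_agent,    # user agent override: the visitor's browser
    }
    return urlencode(params)


# Placeholder values (assumptions) standing in for what the server saw in the GET.
hit = build_pageview_hit("UA-XXXXX-Y", "555", "203.0.113.7", "Mozilla/5.0", "/pixel.gif")
print(hit)
```

The browser only ever sees the first-party GET for the image, which is why blocking has to happen before that request is made.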

data:

My bad. I actually meant base64. I suppose that would not be possible to block, right?

emanruse commented 6 years ago

Another idea which came to me today: It is possible to:

  1. Detect a pixel-sized (or very small) image element on a page, which would involve actually downloading the image.
  2. Right after uBO detects that, and if it is not matched by any rules, uBO can display a warning like: "This site may be using a potential tracking pixel with URL: ... Do you want to create a rule to block all URLs like these (on this site / on all sites)?" That would still sacrifice a single GET but could help in the long term.
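
The detection step could be sketched as below, restricted to sizes declared in the markup (as in the `<img src="..." width="1" height="1">` case mentioned earlier), which needs no download. This is a rough illustration and not uBO code; the names `PixelImageFinder` and `find_pixel_images` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class PixelImageFinder(HTMLParser):
    """Collect src URLs of <img> tags declared at most max_side pixels wide and tall."""

    def __init__(self, max_side: int = 1):
        super().__init__()
        self.max_side = max_side
        self.suspects: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        try:
            w = int(a.get("width") or "")
            h = int(a.get("height") or "")
        except ValueError:
            return  # dimensions not declared as plain integers in the markup
        if w <= self.max_side and h <= self.max_side and a.get("src"):
            self.suspects.append(a["src"])


def find_pixel_images(html: str, max_side: int = 1) -> List[str]:
    """Return the src of every <img> whose declared size is within max_side."""
    finder = PixelImageFinder(max_side)
    finder.feed(html)
    return finder.suspects
```

Images sized through CSS or script, or with no declared size at all, are not caught by this check; those would need the image to be loaded first, as the first step above notes.
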

gwarser commented 6 years ago

You can create a userscript for this. Prototype: https://greasyfork.org/en/scripts/40639-detect-tracking-pixels (~abandoned~ not reliable, a toy)

An extension would be better, since it can access HTTP requests.

emanruse commented 6 years ago

Thanks for the info @gwarser. Yes - that is along the lines of the idea of detecting what's in the DOM. But I don't think it needs a separate extension for detection. If @gorhill would like to make uBO do this, it would integrate better.

gwarser commented 6 years ago

I'm pretty sure gorhill see this as a Feature creep

You may ask Privacy Possum authors for this feature, they do something similar for ETags https://github.com/cowlicks/privacypossum#etag-tracking

emanruse commented 6 years ago

I'm pretty sure gorhill see this as a Feature creep

I don't see why. What I suggest would improve privacy protection which aligns with the philosophy of the software.

joey04 commented 6 years ago

Interesting topic.

From the wiki:

Essentially, a redirection filter must always have a destination hostname specified, or * if the filter is to apply to all destinations.

So, in theory, you could make global rules that match specific pixel patterns and redirect them to the local surrogate. No request would be sent to the server.

I haven't made any redirect rules of my own, though, so I'm not aware of any gotchas or limitations that may undermine this approach.

gorhill commented 6 years ago

in theory

It's already being used that way:

https://github.com/uBlockOrigin/uAssets/blob/master/filters/unbreak.txt#L265-L270