TPS opened this issue 5 months ago
Definitely also see https://github.com/DandelionSprout/adfilt/discussions/163
I do use LUS, but am hoping to improve coverage for these trackers.
Not identical, but now I think LUS is a derivative of ClearURLs (& probably other sources), so maybe this is a duplicate in some sense? If you'd comment on the relationship between the 2, @DandelionSprout, it'd help.
Conflict of interest disclaimer: I am the assistant maintainer of the Actually Legitimate URL Shortener Tool, and current maintainer of the ClearURLs for uBo list (I did not create the original ClearURLs for uBo list; credit for that goes to rustysnake)
> DandelionSprout's LUS is a derivative of ClearURLs (& probably other sources)
It is not. While a few filters have been copied from elsewhere (with credit), most have been added manually, based either on user reports or on tracking parameters Imre (and I) found. Thank you
@iam-py-test Thanks very much for answering. 🙇🏾‍♂️ Could you comment on how different the contents of the 2 lists are from each other?
The Actually Legitimate URL Shortener, as described, is a variety of rules manually added by Imre (DandelionSprout) and me. ClearURLs for uBo uses a Python script to convert the ClearURLs rules into a filterlist for uBlock Origin and AdGuard (basically what you requested here). There are a few modifications to remove problematic rules, but largely it's just the ClearURLs rules. Thanks
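For illustration, here is a minimal sketch of that kind of conversion (not the actual script behind ClearURLs for uBo). The key names come from the ClearURLs rules spec (https://docs.clearurls.xyz/latest/specs/rules/); the file name `data.min.json`, the regex-valued `$removeparam` output, and the choice to emit only global rules are assumptions.

```python
#!/usr/bin/env python3
# Minimal sketch of a ClearURLs-rules-to-filterlist conversion; NOT the actual
# script used by ClearURLs for uBo. Key names ("providers", "rules") follow
# https://docs.clearurls.xyz/latest/specs/rules/; everything else is assumed.
import json


def convert(rules_path: str) -> list[str]:
    with open(rules_path, encoding="utf-8") as fh:
        data = json.load(fh)

    filters = []
    for provider in data.get("providers", {}).values():
        # Each entry in "rules" is a regular expression matched against a
        # query-parameter name, e.g. "utm_(source|medium|campaign)".
        for param_regex in provider.get("rules", []):
            # Emit a global $removeparam rule with a regex value. Scoping it
            # to the provider's urlPattern is deliberately left out, since
            # doing that safely is exactly where the hand-tuning comes in.
            filters.append(f"$removeparam=/^{param_regex}$/")
    # Deduplicate while keeping the original order.
    return list(dict.fromkeys(filters))


if __name__ == "__main__":
    for rule in convert("data.min.json"):
        print(rule)
```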
In theory, I could attempt to merge relevant entries from ClearURLs into LUS, which I can only presume would be a win-win for most parties.
@DandelionSprout 🙇🏾‍♂️ Actually, if the contents are that different, it'd make sense to keep them separate, & offer each as AG options to supplement each other & AG's other Privacy filterlists. OTOH, if the included rules overlap significantly, then it would make sense to use 1 as another source for the other, to keep down duplication.
So, I ran a comparison this morning to see whether ClearURLs had any coverage that LUS didn't. I decided to test with Amazon, a high-coverage site in both lists.
LUS had well above 80 entries for Amazon (70 of them being specific entries). Only 2 entries that made sense (i.e. not ones like `keywords` or `_encoding`) had been in ClearURLs but not in LUS.
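For reference, here is a rough sketch of how a coverage comparison like this could be scripted; it is not the exact procedure behind the numbers above, and both list file names are placeholders.

```python
# Rough sketch of a per-site coverage comparison between two filterlists; not
# the exact procedure behind the numbers above. Both file names are
# placeholders for local copies of the lists.
import re

REMOVEPARAM_VALUE = re.compile(r"removeparam=([^,\s]+)")


def amazon_removeparams(list_path: str) -> set[str]:
    """Collect removeparam values from rules that either mention amazon or
    are global (no URL pattern), since global rules apply to Amazon too."""
    params = set()
    with open(list_path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("!") or "removeparam=" not in line:
                continue
            if "amazon" in line or line.startswith(("$", "*$")):
                match = REMOVEPARAM_VALUE.search(line)
                if match:
                    params.add(match.group(1))
    return params


lus = amazon_removeparams("LegitimateURLShortener.txt")
clearurls = amazon_removeparams("clear_urls_uboified.txt")
print("In ClearURLs but not LUS:", sorted(clearurls - lus))
print("In LUS but not ClearURLs:", sorted(lus - clearurls))
```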
Although I do have conflicts of interest in the matter, I'd say that at this point ClearURLs has been obliterated in comparison. I give iam-py-test full 100% rights to make the calls on the following, with no interference from me, but I personally am getting unsure if a ClearURLs list conversion would be considered necessary nowadays. 😓
That's reasonable methodology. Possible to be more comprehensive over domain variety, like https://github.com/StevenBlack/hosts/issues/1181#issuecomment-608229213 is for TLD variety? I've a hunch that far-less-well-known sites than Amazon may have wider coverage in ClearURLs.
> Possible to be more comprehensive over domain variety, like https://github.com/StevenBlack/hosts/issues/1181#issuecomment-608229213?
Given that both lists have many global rules (ones that apply to all websites), measuring such coverage would be difficult.
It is definitely worth testing which exception rules deactivate a global `removeparam` (AdGuard only):
> `removeparam` rules can also be disabled by `$document` and `$urlblock` exception rules. But basic exception rules without modifiers do not do that. For example, `@@||example.com^` will not disable `$removeparam=p` for requests to example.com, but `@@||example.com^$urlblock` will.
Then a userscript ("user.js") with an API to edit parameters would probably work better for such locked-down scopes.
https://adguard.com/kb/general/ad-filtering/create-own-filters/#urlblock-modifier
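To make that testable, here is a small sketch that lists a file's global removeparam rules alongside the `$urlblock`/`$document` exceptions that would deactivate them on matching sites; the parsing is deliberately simplistic and the file name is a placeholder.

```python
# Small sketch tied to the KB behaviour quoted above: list a file's global
# removeparam rules together with the $urlblock/$document exception rules
# that would deactivate them on matching sites. Simplistic parsing; the file
# name is a placeholder.
import re

DISABLING_MODIFIER = re.compile(r"\$(?:[^,]*,)*(?:urlblock|document)\b")


def audit(list_path: str) -> None:
    global_removeparam, disabling_exceptions = [], []
    with open(list_path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("!"):
                continue
            if line.startswith("@@") and DISABLING_MODIFIER.search(line):
                disabling_exceptions.append(line)
            elif "removeparam=" in line and line.startswith(("$", "*$")):
                global_removeparam.append(line)

    print(f"{len(global_removeparam)} global removeparam rule(s)")
    print("Exception rules that can deactivate them on matching sites:")
    for rule in disabling_exceptions:
        print(" ", rule)


audit("my_combined_list.txt")
```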
### Problem description
The ClearURLs database might be transformed into a powerful privacy-enhancing filterlist &/or userscript.
### Proposed solution
The specs @ https://docs.clearurls.xyz/latest/specs/rules/ would be essential for transforming this into something end-usable.
### Additional information
Originally found via https://github.com/svenjacobs/leon/discussions/315#discussioncomment-9809441, where several interrelated projects are thinking of how to incorporate this database themselves.