gorhill / uBlock-for-firefox-legacy

uBlock Origin for Firefox legacy-based browsers.
GNU General Public License v3.0
198 stars 23 forks source link

Revise the scope of the filter syntax converter #228

Closed JustOff closed 4 years ago

JustOff commented 4 years ago

Currently, filter syntax converter applies only to an explicitly specified limited set of filters, and also does not apply to "My filters".

There was a request that this behavior needs to be adjusted.

JustOff commented 4 years ago

is there any particular reason why the conversion is even limited to specified lists

This was done for performance reasons. This was discussed in more detail somewhere deep in General filter chit-chat, but its size does not allow me now to find exactly where it was. The current set of filters to proceed was created based on which of the filters uses the new syntax. We can expand it if I missed something or think of an additional advanced option to enable processing of all filters, but I would not want to enable it by default.

THEtomaso commented 4 years ago

I agree that all filters should be converted, including "user filters" (if you refer to "custom" ones, under uBO's "Filter lists" tab). However, anything that exists under "My filters" should be left alone!

JustOff commented 4 years ago

This translation is not applied to "My filters"?

Yes, and this was done intentionally so as not to confuse users. Filters loaded from remote sources are displayed by uBO already converted and are displayed in the logs in the same way. With "My filters" this will not work, so they are not converted, and displayed and used as is. Theoretically, I can try to add an additional option that will activate the converter for "My filters", if one need this for A/B testing, but I'm afraid that you yourself will get confused.

DandelionSprout commented 4 years ago

The current set of filters to proceed was created based on which of the filters uses the new syntax. We can expand it if I missed something (…)

That approach has the disadvantage that it doesn't automatically account for whenever a new list has moved from the old syntax to the new. For instance, Frellwit's list now has ≥100 entries with the new +js syntax, which I think happened after the inital creation of the conversion script.

JustOff commented 4 years ago

+js syntax is supported by uBO legacy core and don't need a converter.

DandelionSprout commented 4 years ago

Frellwit's list used the old syntax when the conversion began, but has since moved on to the new syntax.

For instance, in uBO 1.16.4.19 on Pale Moon 18.8.4 64-bit, testing on https://www.vasterastidning.se/, the following entries from lines 2774-2777 (preceded by some dozen domains):

##+js(set, youplay.helpers.getTrackUrl, noopFunc)
##+js(set, youplay.helpers.disableRightClick, noopFunc)
##+js(set, youplay.helpers.getTrackUrl, noopFunc)
##+js(set, youplay.helpers.trackUrls, noopFunc)

...are not shown in the uBO logger as having been applied, nor are any other +js entries shown as having been applied either.

THEtomaso commented 4 years ago

Yes, Frellwit's Swedish Filter does indeed use the new filter syntaxes, and so does many other filters, which the converter currently doesn't process!

JustOff commented 4 years ago

Ok, probably then it's worth adding all regional filters to the list for processing by default. However, turning processing on globally will have too much negative impact on performance due to monsters like EasyList or Malware Domains, which are also pointless to process, just like all AdGuard lists.

THEtomaso commented 4 years ago

What about the custom filters? I'm currently using 12 myself, and I know some people that use even more!

--

turning processing on globally will have too much negative impact on performance

How much of a performance drop do you estimate that we're talking about exactly?

JustOff commented 4 years ago

What about the custom filters?

What about an option in the advanced settings to turn on global processing? Unfortunately, the best solution has not yet crossed my mind.

How much of a performance drop do you estimate

This can cause the browser to freeze for several seconds when updating filters.

DandelionSprout commented 4 years ago

Trying to rule out lists that quite clearly intended to target ABP only (e.g. Greek Adblock Filter, EasyList Italy, etc.), gave me these remaining results for lists that do intend to support uBO and its many syntaxes in whichever ways:

* All uAssets lists
* CJX's EasyList Lite
* EasyList Czech and Slovak
* Adblock List for Finland
* EasyList Hebrew
* Oficjalne Polskie Filtry do AdBlocka, uBlocka Origin i AdGuarda
* Frellwit's Swedish Filters
* EasyList Thailand

It's also possible that AdGuard, who are themselves already converting from their own #%#AG_ and/or #%#//scriptlet into uBO's old syntax, may choose to convert into the new syntax instead. If that day comes, all AdGuard lists would have to be whitelisted at the same time.

THEtomaso commented 4 years ago

This can cause the browser to freeze for several seconds when updating filters.

OK, an optional advanced setting sounds like a reasonable solution then. This will enable processing for custom filters too, right?

krystian3w commented 4 years ago

https://github.com/gorhill/uBlock-for-firefox-legacy/issues/224#issuecomment-596205189

FYI :nth-ancestor will be deprecated for :upward soon -- gorhill/uBlock@72bb700

THEtomaso commented 4 years ago

What about an option in the advanced settings to turn on global processing?

Or something like this, perhaps?:

ublockorigin-possiblefunction

JustOff commented 4 years ago

Well, it seems that since the initial implementation of the filter syntax converter, my test results for its performance are somewhat outdated. I checked it again and it turned out that now it looks quite good at handling even large filters like EasyList or Malware Domains.

Here is a test version that has assetConvertAllRemote parameter in the advanced settings:

uBlock0_1.16.4.20b2.firefox-legacy.xpi.zip (rename to xpi)

Could you please check how uBO behaves when this option turned on in your case? Perhaps now we can just apply the converter for all filters (except My Filters)?

There is also assetConvertMyFilters parameter, you can play around with it too if you want.

THEtomaso commented 4 years ago

@JustOff:

Thanks.

Unfortunatley, your latest beta build (v1.16.4.20b2) struggles with custom filters. Even with those new advanced parameters disabled, all custom filters take a long time to update, and I don't care about cookies completely fails!

Steps to reproduce: 1 - Add I don't care about cookies to your custom filters: https://www.i-dont-care-about-cookies.eu/abp/ 2 - Try to update it!

--

EDIT: Upon closer inspection, it looks like this is the cuplrit!: https://github.com/gorhill/uBlock/commit/703c525b01aa3fb9dab94d6a9918a0a69c6d18da @gorhill

JustOff commented 4 years ago

https://www.i-dont-care-about-cookies.eu/abp/

It seems it simply does not respond at all at the moment.

THEtomaso commented 4 years ago

@JustOff: See the edit in my previous post!

JustOff commented 4 years ago

Sorry, but your guess is most likely irrelevant. If something is timed out that no one can load it)

krystian3w commented 4 years ago

gorhill/uBlock@703c525 no copied to legacy, IMO also no exist in beta legacy XPI.

With testing the Kiobke list, wait until each Ctrl+F5 loads in about a second and not after a 2 minutes or 3 minutes.

THEtomaso commented 4 years ago

no copied to legacy

Hmm, I though this script was hosted online?

This seems relevant enough to me:

I verified and could not find any instance in major filter lists of lines ending with \, thus the change should be safe.

Of course, I don't care about cookies ends with /, rather than \, but the change was added today, and now suddenly I don't care about cookies fails to update itself! Coincidence?

krystian3w commented 4 years ago

My point is that Kiboke has an overloaded web server and the browser displays the list as a text document in 3 minutes.

JustOff commented 4 years ago

Of course, I don't care about cookies ends with /, rather than \

Sorry, but you also don't get the essence of the https://github.com/gorhill/uBlock/commit/703c525b01aa3fb9dab94d6a9918a0a69c6d18da, it's about the filter content, not the address.

now suddenly I don't care about cookies fails to update itself!

Just try to open it in browser. Coincidences happen :smile:

THEtomaso commented 4 years ago

It opens just fine in my browser, but still fails in uBO!

krystian3w commented 4 years ago

Try in private tab / window open, maybe Nordic region have magic CDN-s...

JustOff commented 4 years ago

Please, calm down a little. www.i-dont-care-about-cookies.eu is having problems right now. The browser gets cached response from Cloudflare after 90 sec timeout, while uBO doesn't wait more then 60 sec. You can check this filter also fails to update using the mainstream uBO in Firefox and Chrome.

THEtomaso commented 4 years ago

I'm calm. :) It's weird though.. The list opens instantly for me, every single time, both in Pale Moon (also in private mode) and in Chromium-ungoogled. But it fails every single time in uBO!

THEtomaso commented 4 years ago

I've mirrored IDCAC on my own server. I'll have uBO fetching it from there for now, and go back to testing v1.16.4.20b2! :)

THEtomaso commented 4 years ago

OK, all filters (including custom ones) are updating just fine now. Both of the new advanced parameters seems to work like intended, and the converted rules are applied to sites correctly! Great work again, JustOff!! :+1:

JustOff commented 4 years ago

Thanks, but what can you say about browser / CPU load when updating all filters? I tend to consider it acceptable and enable the converter for all filters (except My Filters) without any options.

THEtomaso commented 4 years ago

I didn't exactly monitor it, but if there was a difference when updating, it sure wasn't noticable at my end. I've never used the auto-update option though, so I wouldn't even know if that particular function has a noticable effect, when browsing under normal circumstances.

JustOff commented 4 years ago

I rethought it all over and decided to make filter syntax converter process all filters except explicitly excluded.

I am going to exclude only the following filters, which currently do not change after applying the converter:

There will also be an advanced option assetConvertMyFilters to make the converter process My Filters (disabled by default).

THEtomaso commented 4 years ago

Sounds good! Although, why not leave the assetConvertAllRemote option in there too? I mean, since you've already created it. It won't affect performance, as long as it's disabled by default anyway! :)

JustOff commented 4 years ago

I carefully checked the filters that I'm going to exclude and I think that the probability that some of them in the future may require conversion is quite small. However, if this does happen, I would prefer it to be reported and corrected for everyone, rather than for only the one who discovered it.

THEtomaso commented 4 years ago

Fair enough. Really, the only problem I can foresee is if the AdGuard team suddenly decides to change their syntaxes too, like gorhill has done.

JustOff commented 4 years ago

In fact, AdGuard filters already use syntax that requires conversion and such filters are not excluded (AdGuard Base, AdGuard Mobile Ads, etc.). At the same time, AdGuard Tracking Protection and AdGuard Social Media, which are excluded, seem to use only simple syntax by the very nature of these filters. Though, if I'm wrong, this can always be corrected later.

krystian3w commented 4 years ago

AG no needed, They recently launched (a month ago?) xpath and nth-ancestor (I don't think they've started turning it into a upward() one yet).

In addition, when it comes to scriptlets, they have already made sure that long names are used instead of abbreviated ones.

THEtomaso commented 4 years ago

Well, let's just ask the source..

@BlazDT: Can you confirm that there are no immediate plans to change filter syntaxes in 'AdGuard Social Media filter' and 'AdGuard Tracking Protection filter'?

BlazDT commented 4 years ago

At least not that I am aware of. We rarely use javascript rules in Tracking and Social filter.

@ameshkov Or is anything planned I am probably not aware of (at least not found anything in filters chat)?

ameshkov commented 4 years ago

@DandelionSprout

converting from their own #%#AG_ and/or #%#//scriptlet into uBO's old syntax

Wait-wait-wait, what have I missed, the scriptlets syntax was changed again?

Regarding the original question:

Really, the only problem I can foresee is if the AdGuard team suddenly decides to change their syntaxes too, like gorhill has done.

I wholeheartedly hate backward-incompatible syntax changes that appear out of the blue.

JustOff commented 4 years ago

uBO itself perfectly understands both the new and old syntax, but uBO-filters have been converted to the new shorthand one. That is why uBO-legacy now has filter syntax converter, and here we discuss which filters from other sources also require it to be applied.

DandelionSprout commented 4 years ago

The change occured 1~2 years ago, so that e.g. abort-on-property-read.js could now be written as aopr. But it was only gradually introduced to the uAssets lists this winter.

Double-checking with my AdGuard for Windows installation, aopr is already converted into #%#//scriptlet('ubo-aopr.js'; at least for those lists that are included in that extension such as my Nordic list.

krystian3w commented 4 years ago

But it's not about "uBo-Firefox-legacy", how AG convert the short names in AG-extension / server side to works as an adguard scriplet filter.

If your record doesn't work: #%#//scriptlet('ubo-aopr.js';

then report it converted to a long name: #%#//scriptlet('ubo-abort-on-property-read.js';

or add anchors in the code (alias).

gwarser commented 3 years ago

Scriptlet for Twitch does not work because filter does not have .js ?

JustOff commented 3 years ago

@gwarser, sorry, but I didn't quite understand your question.

krystian3w commented 3 years ago

He asks - "uBO FX Legacy is running the filter from uBlock list correctly?"

! https://github.com/uBlockOrigin/uAssets/issues/719
twitch.tv##+js(twitch-videoad)

I do not rule out the possibility that some of the advertisements are leaking, because the uBO for FX Legacy did not get an updated version of the script:

https://github.com/gorhill/uBlock/pull/3781/files - maybe no compatible with Firefox 45 - 56 / Waterfox Classic.

https://github.com/gorhill/uBlock-for-firefox-legacy/blob/2a285b88be636b5a162c2465b7ea887e810d02bb/assets/resources/resources.txt#L2025

---url.searchParams.set('platform', '_');
+++url.searchParams.delete('platform');

few ads maybe use SSAI in Waterfox Classic, so these is out of scope for any uBO as nobody implement "replace=" on similar application layer as HTML filtering.

gwarser commented 3 years ago

One user on Reddit complained about ads on Twitch on Waterfox classic (or even older version). I tried adding custom scriptlet to uBO by advanced userResourcesLocation, but twitch.tv##+js(twitch-videoad) was not logged at all. Adding twitch.tv##+js(twitch-videoad.js) to My filters make it work.

krystian3w commented 3 years ago

"My filters" need use .js.

gwarser commented 3 years ago

Twitch scriptlet is not translated internally: https://github.com/gorhill/uBlock-for-firefox-legacy/blob/2a285b88be636b5a162c2465b7ea887e810d02bb/src/js/assets.js#L293-L380

gwarser commented 3 years ago

twitch-videoad.js is no longer used in uBO. Even specialized add-on is now failing. They fight hard.