Open Yuki2718 opened 1 year ago
Probably only :contains
will be affected in AG lists as I don't see :xpath
usage, but if :xpath
is used escape won't work and needs to be wrapped by another quote.
CC @slavaleleka
@Alex-302 It's totally different. The PR was about an uBO issue that will eventually be fixed but the problem here is permanent change of uBO.
BTW AG has been compatible with most of uBO syntax, so I wonder if it's possible for AG to be compatible with has-text
with escape in some way.
@Yuki2718 FYI: We have created the ECSSTree library, which is fully backwards compatible with basic CSSTree, but thanks to re-tokenization it can parse the edge case you mentioned and many other cases as well. You can find it at the following link: https://github.com/AdguardTeam/ecsstree
Related: https://github.com/uBlockOrigin/uBlock-issues/discussions/2328
Also a little outdated is escape method, now works only in uBo:
example.org#?##A > .B:has(> .C:contains("This Week's Circulars")) ! ! shorter fake selectors to avoid horizontal scroll ! example.com#?##D:contains('French "fo paux" is abused')
to avoid regex syntax for that as long uBo consider switch to ECSSTree.
Same protection needed orphaned [ ] { } ( )
if was copied to :contains()
(when it is unfavorable to copy a long string so that the parenthesis is always closed).
Same protection needed orphaned
( )
I'm curious if ECSSTree is used to compile filters in the "/ublock/" directory - if that is, the AdGuard filter problem fixed outside of FiltersCompiler
/ coreLibs
(it is possible that here almost never) / tsurlfilter
/ FiltersRegistry
.
Quickly I can no longer find any filters what would have '
"
in contains (simple regex can fail for contains with orphaned bracket).
(I did the same myself with one detection of incompatible filter - for me it was enough to check less characters)
The one from the demo I think someone has already rewritten/removed by hand (But can be cumbersome for new cases).
AdGuard filter problem
I don't think this is a "problem" of AdGuard filters only, but a more general problem, since this syntax is used in all adblockers, as far as I know. It's simply that CSSTree's tokenizer doesn't support this syntax (for obvious reasons). CSSTree is an excellent library for various purposes, I have been experimenting with it since June 2022, that's why ECSSTree was created to solve these annoying edge cases. I tested ECSSTree on the entire AG Filters Registry and the entire uAssets, so I handled several edge cases. We will use ECSSTree in a static analyzer tool, which should parse all types of adblock rules, but I think it can be used for any other purpose as long as it helps working with Extended CSS language elements.
Due to new cosmetic filter parser, uBO requires quotes in arguments to be escapsed: https://github.com/gorhill/uBlock/commit/a71b71e4c8a2037fc68970bc8912a76732edaade
An example that needs the fix is
thetimestribune.com#?##tncms-region-option-rail-top-one > .tncms-block:has(> .block-title:contains(This Week's Circulars))
in Base filter.