uBlockOrigin / uBlock-issues

This is the community-maintained issue tracker for uBlock Origin
https://github.com/gorhill/uBlock

uBO 1.49 in Chromium doesn't seem able to handle case-insensitive hiding rules well #2631

Closed DandelionSprout closed 1 year ago

DandelionSprout commented 1 year ago

Description

I began noticing in mid-April that case-insensitive attribute values no longer worked correctly in uBO in Chromium browsers (whereas they worked before, and still work in "AdGuard for Chromium" and in uBO in Firefox). For instance: deviantart.com##a[data-hook][href*="/art/"][href*="-adopt-" i].

My perception is that uBO's element-hiding methods treat attribute values as case-sensitive even when they end with the i flag that marks case-insensitivity. The logger, on the other hand, correctly treats them as case-insensitive: going to the logger → </> shows the elements that were supposed to be hidden with a red tint over them, meaning the logger treats them as hidden.

Pasting case-insensitive entries into the element picker and clicking "Choose" does at times seem to hide the elements case-insensitively as intended, whereas simply pasting hundreds of entries into "My Filters" works inconsistently at best.

For the past 2 weeks I incorrectly believed that this was a Chromium engine update bug, but I'm now 80% confident that it's a regression in uBO 1.49 or 1.48.

A specific URL where the issue occurs.

https://www.deviantart.com/search?q=adopt

Steps to Reproduce

1) Add deviantart.com##a[data-hook][href*="/art/"][href*="-adopt-" i] to a custom filterlist, then add the custom filterlist to uBO.
2) Visit deviantART and search for "adopt".
3) See that results whose URLs match "Adopt" or "ADOPT" are not blocked.

Expected behavior

deviantart.com##a[data-hook][href*="/art/"][href*="-adopt-" i] blocks deviantART search results whose URLs case-insensitively match -adopt-.

Actual behavior

deviantart.com##a[data-hook][href*="/art/"][href*="-adopt-" i] either blocks only the deviantART search results whose URLs match -adopt- case-sensitively, or fails to block anything at all (the exact outcome seems to vary for unknown reasons).

uBO version

1.49.2

Browser name and version

Chrome 112.0.5615.138 x64, as well as Vivaldi 6.0.2979.18 x64

Operating System and version

Windows 11 22H2 x64

gorhill commented 1 year ago

I cannot reproduce. I get 4 elements hidden with your cosmetic filter at the given webpage:

Screenshot from 2023-05-03 18-30-09

DandelionSprout commented 1 year ago

This is what the results are supposed to be (Tested in AdGuard):

Corrected image: ![image](https://user-images.githubusercontent.com/22780683/236066527-78fa7046-6ed5-4aca-90eb-f9c33d7cc9a5.png)

Whereas these are the results on my end with uBO 1.49.2:

![image](https://user-images.githubusercontent.com/22780683/236065809-2a623d40-3557-4b18-aee3-5e6464c97643.png)

gorhill commented 1 year ago

I get the same result as uBO with AdGuard and your filter, at that webpage.

Furthermore, the dev console returns exactly the same elements as uBO's targeted elements:

> 18:44:23.508 document.querySelectorAll('a[data-hook][href*="/art/"][href*="-adopt-" i]');
> 18:44:23.530 NodeList(4) [a, a.uU5En, a, a.uU5En]

In uBO, this is a declarative cosmetic filter: the browser does the work once uBO injects the cosmetic filter as a CSS rule.
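
To illustrate what "injected as a CSS rule" could look like, here is a minimal sketch. The helper name cosmeticFilterToCss is hypothetical and this is not uBO's actual code; it only assumes that a declarative filter of the form hostname##selector ends up as a hiding rule for that selector, which the browser's own CSS engine then evaluates (including the i flag).

```javascript
// Hypothetical helper, not taken from uBO's source code.
// Splits a "hostname##selector" cosmetic filter and builds a hiding rule.
function cosmeticFilterToCss(filterLine) {
  const sep = filterLine.indexOf('##');
  if (sep === -1) {
    return null; // not a plain declarative cosmetic filter
  }
  // The part before "##" is a comma-separated hostname list (may be empty).
  const hostnames = filterLine.slice(0, sep).split(',').filter(h => h !== '');
  // The part after "##" is handed to the browser unchanged, as a CSS selector.
  const selector = filterLine.slice(sep + 2);
  return { hostnames, css: `${selector} { display: none !important; }` };
}

const rule = cosmeticFilterToCss(
  'deviantart.com##a[data-hook][href*="/art/"][href*="-adopt-" i]'
);
console.log(rule.hostnames);
console.log(rule.css);
```

Under this assumption, whether "Adopt" or "ADOPT" matches is decided entirely by the browser's selector matching, which is why running the same selector through document.querySelectorAll in the dev console is a fair cross-check.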

gorhill commented 1 year ago

The proper way to investigate this is not to show screenshots, it's to dig to find an exact element which you think should have been targeted by the filter while it is not -- show me the HTML code of the element(s) which you say should have been targeted.

The browser's own dev tools confirm uBO properly targeted the matching elements, which is expected since it's the browser doing the work in the end when a cosmetic filter is declarative.

DandelionSprout commented 1 year ago

Doing additional research now, I'm wondering whether my use of https://raw.githubusercontent.com/DandelionSprout/adfilt/master/a.txt and https://raw.githubusercontent.com/DandelionSprout/adfilt/master/AntiCartoonHipsterList.txt (which combined have 3,300 specific entries for deviantART) could be hitting some kind of maximum cache limit. I need to squeeze in a nap before a café job in 7 hours, but I promise I'll continue looking into the case within 24 hours.

gorhill commented 1 year ago

I collated all the a[data-hook].href on that page and this is what I get:

all a[data-hook].href

Only four match -adopt- in a case-insensitive way.
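
The same check can be approximated outside the browser on a plain list of URLs. This is a rough sketch, assuming the hrefs have already been collected as strings; the function name matchesAdoptFilter and the sample URLs are made up for illustration and are not the ones from the page. It mirrors the two attribute conditions of the filter: a case-sensitive substring match on /art/ and a case-insensitive one on -adopt-.

```javascript
// Approximates a[data-hook][href*="/art/"][href*="-adopt-" i] on plain strings.
// Lowercasing the haystack stands in for the selector's "i" flag.
function matchesAdoptFilter(href) {
  const hasArtPath = href.includes('/art/');               // case-sensitive
  const hasAdopt = href.toLowerCase().includes('-adopt-'); // case-insensitive
  return hasArtPath && hasAdopt;
}

// Made-up URLs, for illustration only.
const sampleHrefs = [
  'https://www.deviantart.com/example-user/art/Cute-Adopt-Open-1',
  'https://www.deviantart.com/example-user/art/FREE-ADOPT-Batch-2',
  'https://www.deviantart.com/example-user/art/Adoptable-Dragon-3',
  'https://www.deviantart.com/example-user/gallery/my-adopt-list',
];

const matched = sampleHrefs.filter(matchesAdoptFilter);
console.log(matched);
```

Note that a title such as "Adoptable-Dragon" does not match, because the filter requires a hyphen on both sides of "adopt". That kind of near miss may explain why fewer elements are hidden than a visual scan of the search results suggests.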

DandelionSprout commented 1 year ago

I seem to have found a fix today, but I have been unsure how to word it, since it seems to make extensive debugging largely unnecessary, and since both the fix and the cause sound odd:

The problem turned out not to be with the deviantart.com##a[data-hook][href*="/art/"][href*=-adopt- i] entries I used, but rather with the deviantart.com#?#div[style^="width:"][style*="display:"]:has(a[href*="/art/"][href*=-adopt- i]) entries that I used in the same lists. The latter somehow seemed to break hiding in such a way that neither syntax worked as intended.

Given the extreme rarity of this situation, even among filterlist makers, I chose to convert the latter syntax into single, very long entries like https://github.com/DandelionSprout/adfilt/blob/bad553effe8b3581b9a7083c9b3ab85b290ef5bf/DeviantARTQualityArtMagnifier.txt#L11, which somehow fixed the problem, even though I was never sure what the exact cause was (my loose bet is on cache problems).

gorhill commented 1 year ago

[href*=-adopt- i]

You are missing quotes: [href*="-adopt-" i]


Doesn't seem to make a difference anyway.
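
A possible explanation for why the quotes make no difference here: in CSS, an attribute selector value may be written without quotes when it is a valid identifier, and -adopt- is one. The sketch below is a simplified check, assuming ASCII-only values and ignoring escapes and non-ASCII characters that real CSS identifiers also allow; the function name isSimpleCssIdentifier is hypothetical.

```javascript
// Simplified test for whether a string is a CSS identifier (ASCII only).
// An identifier starts with "--", or with an optional "-" followed by a
// letter or underscore, and continues with letters, digits, "_" or "-".
function isSimpleCssIdentifier(value) {
  return /^(?:--|-?[A-Za-z_])[A-Za-z0-9_-]*$/.test(value);
}

console.log(isSimpleCssIdentifier('-adopt-')); // value used unquoted in the lists
console.log(isSimpleCssIdentifier('/art/'));   // contains "/", so it needs quotes
console.log(isSimpleCssIdentifier('width:'));  // contains ":", so it needs quotes
```

Quoting every value is still the safer habit, since values such as /art/ or width: are not identifiers and would make the selector invalid if left unquoted.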

gorhill commented 1 year ago

Closing as unable to reproduce.