uBlockOrigin / uAssets

Resources for uBlock Origin, uMatrix: static filter lists, ready-to-use rulesets, etc.
GNU General Public License v3.0

Colombia ads - Missed ads #247

Closed gotitbro closed 5 years ago

gotitbro commented 7 years ago

I am seeing "Recommended by Colombia" (recommended content) ads on websites that use the Colombia ad network (https://www.colombiaonline.com/). These ads can be hidden with: indiatimes.com##STYLE[type="text/css"] + [class] but loading them still consumes a lot of resources.

Example URLs:

http://timesofindia.indiatimes.com/us-elections-2016/Obama-tells-Latin-America-and-world-Give-Trump-time-dont-assume-worst/articleshow/55523627.cms
http://www.in.techradar.com/news/misc/your-smart-samsung-tv-is-about-to-get-a-lot-more-personalized/articleshow/56204222.cms
http://www.idiva.com/news-work-life/things-you-should-definitely-say-yes-to-in/16122235

SS: Ads appear in the sidebar and below the article (cannot post screenshot due to fragmentation of image).

The ads don't seem to be loaded externally and the script seems to be embedded in the webpage itself. Further discussion: https://forums.lanik.us/viewtopic.php?f=62&t=34718&p=110665

Versions:

gorhill commented 7 years ago

Does this work?

||clmbtech.com^

uBlock-user commented 7 years ago

No. In fact, blocking third-party content and inline scripts won't stop them from appearing either.

gorhill commented 7 years ago

I noticed this afterward: the site uses a cookie so that it does not always serve the Colombia ads, which made me think the filter above worked.

I came up with this:

timesofindia.indiatimes.com##.main-content > div[class]:matches-css(background-image: /timesofindia\.indiatimes\.com/toiitpic/commons/images/rbc-/)
timesofindia.indiatimes.com##.sidebar > .ctn_ads_rhs > div[class]

First one is uBO-specific, second one should also work for ABP.
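
The :matches-css(...) operator in the first filter tests an element's computed style value against a regular expression. A minimal sketch of that test in plain JavaScript follows; the helper name `matchesBackground` and the sample image file name are hypothetical, and the string argument stands in for what `getComputedStyle(element).backgroundImage` would return in a browser.

```javascript
// Regex source copied from the filter above. Built with the RegExp
// constructor so the forward slashes need no escaping.
const reBackground = new RegExp(
    "timesofindia\\.indiatimes\\.com/toiitpic/commons/images/rbc-"
);

// Hypothetical helper: returns true when a computed background-image
// value would satisfy the filter's regex.
function matchesBackground(computedValue) {
    return reBackground.test(computedValue);
}

// Hypothetical sample values (the file name after "rbc-" is made up).
const adValue =
    'url("http://timesofindia.indiatimes.com/toiitpic/commons/images/rbc-logo.png")';
const plainValue = "none";

console.log(matchesBackground(adValue));
console.log(matchesBackground(plainValue));
```

Only elements whose computed value passes this kind of test are hidden, which is why the filter misses ads rendered without that background image.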

uBlock-user commented 7 years ago

How are they able to load even after inline scripts and first-party scripts are blocked?

gorhill commented 7 years ago

They are not loaded afterward; they are served by the server as part of the document.

uBlock-user commented 7 years ago

So basically the script is fragmented and embedded as part of the document, and is no longer a standalone JS file.

gorhill commented 7 years ago

The HTML tags making up the "Recommended by Colombia" block are embedded in the document, so no JavaScript is needed. This is easy to verify: enter view-source:http://timesofindia.indiatimes.com/us-elections-2016/Obama-tells-Latin-America-and-world-Give-Trump-time-dont-assume-worst/articleshow/55523627.cms in the address bar and search for "colombia" (the inline JavaScript only makes the ads clickable).

gorhill commented 7 years ago

Actually, the filter suggested on the EasyList forum works well: the random-class div containing the ads is pretty much always preceded by a style tag:

indiatimes.com##style[type="text/css"] + div[class]

So no need for any extended uBO syntax.
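
As an illustration of what the `+` (adjacent sibling) combinator in that filter selects, here is a simplified model in plain JavaScript. It is only a sketch: the sibling list uses plain objects rather than real DOM elements, and the function name and the sample class names are hypothetical.

```javascript
// Simplified stand-in for a run of sibling elements in document order.
// Returns the entries that style[type="text/css"] + div[class] would match.
function hiddenBySiblingFilter(siblings) {
    const hidden = [];
    for (let i = 1; i < siblings.length; i++) {
        const prev = siblings[i - 1];
        const node = siblings[i];
        const prevIsStyle =
            prev.tag === "style" && prev.attrs.type === "text/css";
        const nodeIsClassedDiv = node.tag === "div" && "class" in node.attrs;
        if (prevIsStyle && nodeIsClassedDiv) {
            hidden.push(node);
        }
    }
    return hidden;
}

const siblings = [
    { tag: "div", attrs: { class: "article" } },   // not preceded by a style tag
    { tag: "style", attrs: { type: "text/css" } },
    { tag: "div", attrs: { class: "x7f3k" } },     // hypothetical random class
    { tag: "div", attrs: {} },                     // no class attribute
];

const hidden = hiddenBySiblingFilter(siblings);
console.log(hidden.map((node) => node.attrs.class));
```

The match depends only on position and on the presence of a class attribute, not on the class value, which is why randomized class names do not defeat it.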

gotitbro commented 7 years ago

@gorhill As I posted in the first post, the above cosmetic filter works. I started this issue as I saw a lot of data being spent to load the ad images but as you posted above these ads can't seem to be blocked and the best we can do for now is hide them.

gotitbro commented 7 years ago

@gorhill @okiehsch The filter suggested on the EasyList forum does not seem to be working anymore: ##style[type="text/css"] + div[class]

I am again seeing the "Recommended by Colombia" ads.

Error URL: http://timesofindia.indiatimes.com/world/europe/polish-parliament-adopts-contested-supreme-court-reform/articleshow/59686535.cms

okiehsch commented 7 years ago

The first one suggested by gorhill still works for me

timesofindia.indiatimes.com##.main-content > div[class]:matches-css(background-image: /timesofindia\.indiatimes\.com/toiitpic/commons/images/rbc-/)

gotitbro commented 7 years ago

@okiehsch That does work for the ads in the articles but these Colombia ads are displayed all over this website without that background image and they do not get hidden. Like on the homepage: http://timesofindia.indiatimes.com

SS: https://i.imgur.com/i3r2THZ.png

The same is true for ads on other websites in this ad network (owned by the publisher of these websites) which are displayed without that background picture. For example: http://navbharattimes.indiatimes.com/world/rest-of-europe/head-of-french-military-quits-after-row-with-emmanuel-macron/articleshow/59678633.cms

okiehsch commented 7 years ago

@gotitbro this

navbharattimes.indiatimes.com##.leftmain > div[class]:matches-css(background-image: /navbharattimes\.indiatimes\.com/nbige/commons/images/rbc-/)
navbharattimes.indiatimes.com##.parentRgtSuperhtSec > div[class]:matches-css(background-image: /navbharattimes\.indiatimes\.com/nbige/commons/images/rbc-/)

seems to work at http://navbharattimes.indiatimes.com/world/rest-of-europe/head-of-french-military-quits-after-row-with-emmanuel-macron/articleshow/59678633.cms on my end, I hope, as I can't read Hindi.

gotitbro commented 7 years ago

@okiehsch Thanks, those filters do work for the ads in the articles, but the ads on the homepage are still visible: http://navbharattimes.indiatimes.com

The previous filter covered all these ads on the different websites of the publisher. A similar solution would be really beneficial here. There are quite a few websites using these ads: https://github.com/AdguardTeam/AdguardFilters/issues/4787

gotitbro commented 7 years ago

@gorhill Do you plan to fix this in uBO filters?

gotitbro commented 7 years ago

@gorhill Seanl fixed this in Adguard Filters and it seems to work for the most part:

adageindia.in,bombaytimes.com,businessinsider.in,gizmodo.in,iamgujarat.com,in.techradar.com,lifehacker.co.in,mensxp.com,indiatimes.com,samayam.com,idiva.com#%#Object.defineProperty(window,'trev',{get:function(){return function(){var a=document.currentScript;if(!a){var c=document.getElementsByTagName('script');a=c[c.length-1]}if(a&&/typeof\sotab\s==\s'function'/.test(a.textContent)){var d=a.previousSibling,b=d;while(b=b.previousSibling)if(b.nodeType==Node.COMMENT_NODE&&/\d{5,}\s\d{1,2}/.test(b.data)){d.style.setProperty('display','none','important');return}}}},set:function(){}});

Can something similar be done for uBO?

gotitbro commented 7 years ago

@gorhill Sorry to ping yet again. It would be really nice to hear something back from you :)

gorhill commented 7 years ago

How about:

timesofindia.indiatimes.com##div[class] > h2 + div:has(a[onclick][rel="nofollow,noindex"]):xpath(parent::div[following-sibling::script[contains(text(),"coldetect")]])

gotitbro commented 7 years ago

@gorhill That does not seem to be working.

gorhill commented 7 years ago

It "does not seem to be working" at what exact URL?

gotitbro commented 7 years ago

@gorhill The filter that you posted initially on this thread hides ads on news article pages: timesofindia.indiatimes.com##.main-content > div[class]:matches-css(background-image: /timesofindia\.indiatimes\.com/toiitpic/commons/images/rbc-/)

But it does not hide the ads in other parts of the website such as the homepage and other sections. The same is the case with the filter you posted recently: timesofindia.indiatimes.com##div[class] > h2 + div:has(a[onclick][rel="nofollow,noindex"]):xpath(parent::div[following-sibling::script[contains(text(),"coldetect")]])

SS (Fullpage SS showing ads, marked): http://i.imgur.com/zOTWCl4.jpg (Homepage) http://i.imgur.com/yKPgv5j.jpg (City Section)

Error URLs (Respective to the above SS):

http://timesofindia.indiatimes.com/
http://timesofindia.indiatimes.com/city

gotitbro commented 7 years ago

@gorhill Any update on this issue? I have posted the details on this above.

gorhill commented 7 years ago

They changed their layout after I posted my suggested filter. I will need to investigate again; I haven't had time yet.

Ideally it would be good to have more volunteers to craft filters. With India being such a big country, I am puzzled that there is no EasyList India (or something like it) with its own team of volunteer maintainers at this point.

gotitbro commented 7 years ago

> I am puzzled that there is no EasyList India

@gorhill That is an interesting question, though most Indian websites, the majority of which are in English, are covered by EasyList, even the regional ones.

The issue we are dealing with here seems to be an exception, as the ad company "Colombia Audience Network" is itself owned by the publisher of the above-mentioned websites.

> They changed their layout after I posted my suggested filter

Interesting, this would mean they are keeping an eye here.

gorhill commented 7 years ago

> this would mean they are keeping an eye here

That was my thought. These ads are inlined, so only cosmetic filtering can take care of them, and procedural filters at that, given the way they are positioned in the page.

gotitbro commented 7 years ago

@gorhill Do you have any update on this issue?

jspenguin2017 commented 7 years ago

I don't think there is a way to block the ads here without some form of image recognition. They can break whatever solution we come up with. I was using a getter hook to trigger a DOM scan, and it was working beautifully yesterday, but today I see that some ads no longer trigger the hook. I am thinking about running a DOM scanner on a setInterval, but I don't want to throw performance down the drain either... And even if we didn't care about efficiency, they could easily change the signature to bypass the scanner. Whatever signature we latch onto, they can change; we need a way to recognize the content as ads, and that is quite difficult to do with hard-coded logic.
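
For readers unfamiliar with the "getter hook" idea mentioned above, here is a minimal sketch of the general technique, assuming the page's own scripts read a known global property (`trev` in the rules quoted elsewhere in this thread). The helper `installGetterHook`, the `fakeWindow` object standing in for `window`, and the counting callback are hypothetical.

```javascript
// Defines an accessor property so that every read of host[name] runs
// onRead first. In a real rule, onRead would be the DOM scan.
function installGetterHook(host, name, onRead) {
    Object.defineProperty(host, name, {
        configurable: false,
        get() {
            onRead();
            return null;
        },
        set() {
            // Swallow writes so page code cannot replace the hook.
        },
    });
}

const fakeWindow = {};
let reads = 0;
installGetterHook(fakeWindow, "trev", () => { reads += 1; });

fakeWindow.trev = function () {}; // goes to the empty setter, hook stays
void fakeWindow.trev;             // page code reading the property runs the hook
console.log(reads);
```

The weakness described in the comment follows directly: the hook only fires when page code actually reads the property, so ads rendered without touching it are never scanned.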

captn3m0 commented 7 years ago

This is what I've thrown in my .js repo to hide the ads:

$("h2:contains('PROMOTED STORIES')").parent().remove();

Tested at http://www.indiatimes.com/technology/science-and-future/this-soft-robot-is-made-of-lego-like-pieces-gives-you-flexibility-to-build-anything-you-want-329132.html

Code: JS, CSS

Not a uBlock based solution, but I thought I'd post here to see if they go ahead and convert "promoted stories" to an image as well.

jspenguin2017 commented 7 years ago

As you scroll, more content is loaded, so you need to keep scanning...

okiehsch commented 6 years ago

https://github.com/NanoAdblocker/NanoFilters/issues/57

jspenguin2017 commented 6 years ago

The script rule for reference:

```js
if (a.domCmp(["adageindia.in", "bombaytimes.com", "businessinsider.in", "gizmodo.in", "iamgujarat.com",
    "idiva.com", "in.techradar.com", "indiatimes.com", "timesofindia.com", "lifehacker.co.in", "mensxp.com",
    "samayam.com", "gadgetsnow.com"])) {
    //https://gitlab.com/xuhaiyang1234/uBlockProtectorSecretIssues/issues/8
    a.inject(() => {
        "use strict";
        const magic = "a" + window.Math.random().toString(36).substring(2);
        const reScript = /typeof otab == 'function'/;
        const reComment = /\d{5,} \d{1,2}/;
        const getter = () => {
            let script;
            {
                let temp = [...window.document.querySelectorAll(`script:not([src]):not([${magic}])`)];
                if (window.document.currentScript && !window.document.currentScript.hasAttribute(magic)) {
                    temp.unshift(window.document.currentScript);
                }
                if (!temp.length) {
                    return true;
                }
                for (let i = 0; i < temp.length; i++) {
                    temp[i].setAttribute(magic, 1);
                    if (reScript.test(temp[i].textContent)) {
                        script = temp[i];
                        break;
                    }
                }
            }
            if (!script) {
                return true;
            }
            {
                const previous = script.previousSibling;
                let temp = previous;
                while (temp = temp.previousSibling) {
                    if (temp.nodeType === window.Node.COMMENT_NODE && reComment.test(temp.data)) {
                        previous.style.setProperty("display", "none", "important");
                        return false;
                    }
                }
            }
        };
        window.Object.defineProperty(window, "trev", {
            configurable: false,
            set() { },
            get() {
                let r;
                let i = 0;
                do {
                    try {
                        r = getter();
                    } catch (err) { }
                } while (!r && i++ < 100);
                return null;
            },
        });
        window.addEventListener("load", () => {
            void window.trev;
        });
    });
    let isInBackground = false;
    const reStart = /^\/[a-z_]+\.cms/;
    const reEnd = /^ \d{5,} \d{1,2} $/;
    const adsHidder = () => {
        if (isInBackground || !document.body) {
            return;
        }
        let iterator = document.createTreeWalker(document.body, NodeFilter.SHOW_COMMENT);
        let comment;
        while (comment = iterator.nextNode()) {
            if (reStart.test(comment.data)) {
                let toHide = [];
                let previous = comment;
                while (previous = previous.previousSibling) {
                    if (previous.nodeType === Node.COMMENT_NODE && reEnd.test(previous.data)) {
                        if (toHide.length < 15) {
                            for (let i = 0; i < toHide.length; i++) {
                                try {
                                    toHide[i].style.setProperty("display", "none", "important");
                                } catch (err) { }
                            }
                        }
                        break;
                    }
                    toHide.push(previous);
                }
            }
        }
    };
    setInterval(adsHidder, 1000);
    //@pragma-if-debug
    //a.setBenchmarkedInterval(adsHidder, 1000);
    //@pragma-end-if
    a.on("focus", () => { isInBackground = false; });
    a.on("blur", () => { isInBackground = true; });
}
```
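
The core of the interval scanner in the rule above is a backward walk over siblings between two marker comments. A stripped-down sketch of that walk on plain objects is shown below so it can be run outside a browser; the `link` helper, the node `name` fields, and the sample comment texts are hypothetical, while the two regexes and the 15-node cap are taken from the rule.

```javascript
const COMMENT_NODE = 8; // same numeric value as Node.COMMENT_NODE in browsers
const reStart = /^\/[a-z_]+\.cms/;
const reEnd = /^ \d{5,} \d{1,2} $/;

// Walks backward from the comment matching reStart and collects siblings
// until a comment matching reEnd is found. Returns [] when no such comment
// exists or when 15 or more nodes were collected (the rule's safety cap).
function nodesToHide(startComment) {
    const toHide = [];
    let previous = startComment;
    while ((previous = previous.previousSibling)) {
        if (previous.nodeType === COMMENT_NODE && reEnd.test(previous.data)) {
            return toHide.length < 15 ? toHide : [];
        }
        toHide.push(previous);
    }
    return [];
}

// Hypothetical helper: wires up previousSibling on an array of fake nodes.
function link(nodes) {
    nodes.forEach((node, i) => {
        node.previousSibling = i > 0 ? nodes[i - 1] : null;
    });
    return nodes;
}

// Hypothetical markup: <!-- 123456 7 --> <div/> <div/> <!--/ads_widget.cms-->
const chain = link([
    { nodeType: COMMENT_NODE, data: " 123456 7 " },
    { nodeType: 1, name: "ad-1" },
    { nodeType: 1, name: "ad-2" },
    { nodeType: COMMENT_NODE, data: "/ads_widget.cms" },
]);

const found = reStart.test(chain[3].data) ? nodesToHide(chain[3]) : [];
console.log(found.map((node) => node.name));
```

Note that, despite the names, the comment matching `reStart` comes last in document order and the one matching `reEnd` comes first, because the walk runs backward.
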
ghajini commented 6 years ago

a PR for this would help a lot @jspenguin2017

jspenguin2017 commented 6 years ago

I can PR if @gorhill accepts it. I think he'll shake his head on the setInterval.

gotitbro commented 6 years ago

@okiehsch Any plan on incorporating the Nano solution to the uBO filters?

okiehsch commented 6 years ago

@gotitbro jspenguin2017 already pinged gorhill, it's his decision.

gorhill commented 6 years ago

What would a scriptlet look like? I would like to try it first on my side.

jspenguin2017 commented 6 years ago

It would look the same, just in ES5. a.inject is not needed because scriptlets are already in page scope. a.on is just mapped to window.addEventListener.
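
As a rough illustration of that description, the focus/blur bookkeeping from the rule could be written in ES5 style with plain `addEventListener` calls. This is only a sketch under the assumption stated in the comment above (that `a.on` maps to `window.addEventListener`); the function name `trackBackground` is hypothetical, and a bare `EventTarget` stands in for `window` so the snippet runs outside a page.

```javascript
// ES5-style syntax (var, function expressions, no arrow functions).
// `target` stands in for `window`.
function trackBackground(target) {
    var state = { isInBackground: false };
    target.addEventListener("focus", function () {
        state.isInBackground = false;
    });
    target.addEventListener("blur", function () {
        state.isInBackground = true;
    });
    return state;
}

var target = new EventTarget();
var state = trackBackground(target);

target.dispatchEvent(new Event("blur"));
console.log(state.isInBackground);
```

A periodic scanner can then consult `state.isInBackground` and return early while the tab is not focused, as the original rule does.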

gotitbro commented 6 years ago

@gorhill Can the Nano fix be added to uBO? Would be great if this issue can be solved.

jspenguin2017 commented 6 years ago

uBO probably won't have it because it has a measurable performance impact.

I'm not too worried about that as Nano is intended to be for computers only (not for cell phones). I'm rolling an experimental port of the solution in ND: https://github.com/NanoAdblocker/NanoFilters/commit/4f4de6c47d1fde0bc1f2d78ea02fb97e484c7cb5 https://github.com/NanoAdblocker/NanoFilters/commit/df8a96f5dc990b628178b7f20883a2abe228844f

jspenguin2017 commented 6 years ago

Can someone test whether this works? https://github.com/NanoAdblocker/NanoFilters/blob/master/NanoFiltersSource/NanoResources.txt#L67

mapx- commented 5 years ago