ghostery / adblocker

Efficient embeddable adblocker library
https://www.ghostery.com
Mozilla Public License 2.0

Adding multiple puppeteer blockers on single page object #4160

Open teammakdi opened 1 month ago

teammakdi commented 1 month ago

Does adding multiple Puppeteer blockers to a single Puppeteer page object work?

For example, enabling both of these on the same page:

// First blocker: built from the Fanboy annoyance list
const annoyanceBlocker = await PuppeteerBlocker.fromLists(fetch, [
    'https://secure.fanboy.co.nz/fanboy-annoyance.txt'
]);
await annoyanceBlocker.enableBlockingInPage(page);

// Second blocker: prebuilt ads-only engine, bound to the same page
const adsBlocker = await PuppeteerBlocker.fromPrebuiltAdsOnly(fetch);
await adsBlocker.enableBlockingInPage(page);

Not sure if this is similar to https://github.com/ghostery/adblocker/issues/4133

seia-soto commented 1 month ago

Hi @teammakdi ,

We don't support putting multiple blocker instances on a single page. In other words, we don't know what side effects doing so would have.

Also, the "context" created by "PuppeteerBlocker" for the "page" will be discarded when you bind another blocker to the "page".
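To illustrate the one-blocker-per-page constraint described above, here is a minimal sketch of a guard that releases the previously bound blocker before binding a new one. The `boundBlockers` map and the `bindExclusive` helper are hypothetical names that are not part of the library; the sketch only assumes that the blocker objects expose `enableBlockingInPage(page)` and `disableBlockingInPage(page)`, as `PuppeteerBlocker` does.

```javascript
// Hypothetical bookkeeping kept outside the library: remembers which blocker
// is currently bound to each page. A WeakMap does not keep closed pages alive.
const boundBlockers = new WeakMap();

// Hypothetical helper: make `blocker` the only blocker bound to `page`.
async function bindExclusive(page, blocker) {
  const previous = boundBlockers.get(page);

  // Nothing to do if this exact blocker is already bound to the page.
  if (previous === blocker) {
    return;
  }

  // Release the earlier blocker so that two instances never share the page.
  if (previous !== undefined) {
    await previous.disableBlockingInPage(page);
  }

  await blocker.enableBlockingInPage(page);
  boundBlockers.set(page, blocker);
}
```

With this in place, calling `await bindExclusive(page, blocker)` wherever `blocker.enableBlockingInPage(page)` would otherwise be called keeps at most one instance active per page. Whether this covers every side effect of rebinding is not confirmed in this thread.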

Best

teammakdi commented 1 month ago

@seia-soto Now that https://github.com/ghostery/adblocker/issues/4133 has been merged, the following:

import puppeteer from 'puppeteer'
import fetch from 'cross-fetch'
import { PuppeteerBlocker } from '@cliqz/adblocker-puppeteer'

const url = 'https://www.rightmove.co.uk'

const browser = await puppeteer.launch()
const page = await browser.newPage()

await page.setViewport({ width: 1080, height: 1024 })

const adsAndTrackingEngine = await PuppeteerBlocker.fromPrebuiltAdsAndTracking(fetch)

const cookieEngine = await PuppeteerBlocker.fromLists(fetch, [
  'https://secure.fanboy.co.nz/fanboy-cookiemonster.txt',
  'https://raw.githubusercontent.com/cliqz-oss/adblocker/master/packages/adblocker/assets/easylist/easylist-cookie.txt',
  'https://raw.githubusercontent.com/cliqz-oss/adblocker/master/packages/adblocker/assets/ublock-origin/annoyances-cookies.txt'
])

const mergedEngine = PuppeteerBlocker.merge([adsAndTrackingEngine, cookieEngine])

await mergedEngine.enableBlockingInPage(page)

await page.goto(url)

await page.screenshot({
  path: 'screenshot.jpg'
})

await browser.close()

should work as expected, right?
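As a possible alternative when every filter list is available as a URL, all lists could be passed to a single `fromLists` call, which yields one blocker without a merge step. The sketch below assumes that approach is acceptable for the use case. `combineListUrls` and `enableSingleBlocker` are hypothetical helpers, and the blocker class is passed in as a parameter (expected to be `PuppeteerBlocker`) so the sketch does not hard-code an import. The list URLs are the ones already quoted in this thread.

```javascript
// Hypothetical helper: flatten several groups of filter-list URLs into one
// array, dropping duplicates while preserving first-seen order.
function combineListUrls(...groups) {
  return [...new Set(groups.flat())];
}

// Hypothetical helper: build ONE blocker from all lists and bind it to the page.
// `BlockerClass` is expected to be PuppeteerBlocker and `fetchImpl` a fetch
// implementation such as cross-fetch.
async function enableSingleBlocker(BlockerClass, fetchImpl, page, ...groups) {
  const urls = combineListUrls(...groups);
  const blocker = await BlockerClass.fromLists(fetchImpl, urls);
  await blocker.enableBlockingInPage(page);
  return blocker;
}

// List URLs taken from the snippets earlier in this thread.
const annoyanceLists = ['https://secure.fanboy.co.nz/fanboy-annoyance.txt'];
const cookieLists = [
  'https://secure.fanboy.co.nz/fanboy-cookiemonster.txt',
  'https://raw.githubusercontent.com/cliqz-oss/adblocker/master/packages/adblocker/assets/easylist/easylist-cookie.txt',
  'https://raw.githubusercontent.com/cliqz-oss/adblocker/master/packages/adblocker/assets/ublock-origin/annoyances-cookies.txt',
];
```

Usage would look like `await enableSingleBlocker(PuppeteerBlocker, fetch, page, annoyanceLists, cookieLists)`. This does not cover the prebuilt ads-and-tracking engine, which is not built from a list of URLs in the snippet above.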