berstend / puppeteer-extra

💯 Teach puppeteer new tricks through plugins.
https://extra.community
MIT License

Puppeteer stealth mode still being detected by DataDome #182

Closed rafikahmed closed 4 years ago

rafikahmed commented 4 years ago

Hi there, Leboncoin.fr still detects Puppeteer as a bot. I analyzed all the requests and found that they use DataDome, which is an anti-bot service. I also tried injecting a custom user-agent request header, as others suggested, but the issue remains. Does anyone here have another workaround that I could try?

const UserAgent = require("user-agents");
const puppeteer = require("puppeteer-extra");
const stealthPlugin = require("puppeteer-extra-plugin-stealth");
puppeteer.use(stealthPlugin());

puppeteer
  .launch({ headless: false })
  .then(async (browser) => {
    const page = await browser.newPage();
    await page.setViewport({ width: 800, height: 600 });
    // user-agents exports a class; an instance is needed to get a UA string.
    await page.setUserAgent(new UserAgent().toString());
    await page.goto("https://www.leboncoin.fr/boutique/100524/minautor.html/");
    // Plain 10 s delay; page.waitFor is not available in newer Puppeteer versions.
    await new Promise((resolve) => setTimeout(resolve, 10000));
    await browser.close();
    console.log(`All done. ✨`);
  })
  .catch((err) => console.log(err));

ejames17 commented 4 years ago

@rafikahmed I'm also having the same issue.

slvDev commented 4 years ago

Windows or macOS? I get detected when running on macOS, but when I run the same script on Windows everything works fine. P.S. It was a different site, but I still see different behavior on Windows and macOS.

rafikahmed commented 4 years ago

Hi @slvDev, I'm running Windows; I haven't tested it on macOS.

txrp0x9 commented 4 years ago

Have you tried checking it manually in Chrome, both normally and in incognito? Every time I use incognito with Chrome, I seem to run into the "You are blocked" page.

ejames17 commented 4 years ago

@Torpedo99 One workaround for me was to detect the JS file that gets loaded onto the page and abort that request.

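Something like this:

```javascript
// Enable interception so the fingerprinting script can be blocked.
await page.setRequestInterception(true);

page.on('request', (request) => {
  // `url` is the site's base URL, defined elsewhere in the script.
  if (request.url() === `${url}/assets/iovation-first-third.js`) {
    console.warn(`request url is ${request.url()}: aborting`);
    return request.abort();
  }
  // With interception on, every other request must be continued or it hangs.
  return request.continue();
});
```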

txrp0x9 commented 4 years ago

@ejames17 Yes, but in most cases aborting the fingerprinting script still results in a block.

azerpas commented 4 years ago

Not Puppeteer-related: try to reverse-engineer their API from their app with Charles Proxy. There is pretty much no security there.

txrp0x9 commented 4 years ago

@rafikahmed I have figured out a way to scrape https://www.leboncoin.fr/boutique/100524/minautor.html/.

When running Puppeteer with the stealth plugin, launch it like this: `puppeteer.launch({ ignoreDefaultArgs: ["--enable-automation"] })`

Remember to change your proxy before you test this, since DataDome flags an IP address and serves "You are blocked" messages once you start connecting with fresh cookies.
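
A minimal sketch of the two suggestions in this comment (dropping `--enable-automation` and switching to a fresh proxy), assuming puppeteer-extra and the stealth plugin are installed. The helper names `buildLaunchOptions` and `openWithStealth` and the proxy address are hypothetical and not from the thread; `--proxy-server` is a standard Chromium switch. Whether this gets past DataDome on a given day is not something the sketch can promise.

```javascript
// Hypothetical helper: builds the launch options described in the comment above.
function buildLaunchOptions(proxyServer) {
  const options = {
    headless: false,
    // Stop Puppeteer from passing --enable-automation to Chrome.
    ignoreDefaultArgs: ["--enable-automation"],
    args: [],
  };
  if (proxyServer) {
    // Chromium switch that routes browser traffic through the given proxy.
    options.args.push(`--proxy-server=${proxyServer}`);
  }
  return options;
}

// Assumes puppeteer-extra and puppeteer-extra-plugin-stealth are installed.
async function openWithStealth(targetUrl, proxyServer) {
  const puppeteer = require("puppeteer-extra");
  const stealthPlugin = require("puppeteer-extra-plugin-stealth");
  puppeteer.use(stealthPlugin());

  const browser = await puppeteer.launch(buildLaunchOptions(proxyServer));
  try {
    const page = await browser.newPage();
    await page.goto(targetUrl);
    // The page title is a quick way to see whether a block page was served.
    return await page.title();
  } finally {
    await browser.close();
  }
}
```

For example, `openWithStealth("https://www.leboncoin.fr/boutique/100524/minautor.html/", "http://127.0.0.1:8080")` would open the page through a placeholder local proxy; substitute the address of a proxy you actually control.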

ejames17 commented 4 years ago

@Torpedo99 I'll have to try this method with another script. Thanks!

leadscloud commented 4 years ago

@Torpedo99 This does not work on macOS.

I have the same issue on macOS: with headless: true it does not work, while with headless: false it works successfully. I have tried every setting.

It is the same as https://github.com/puppeteer/puppeteer/issues/665. No one has solved it, and the methods suggested there did not work.

berstend commented 4 years ago

This is most likely fixed, closing until further notice: https://github.com/berstend/puppeteer-extra/issues/213#issuecomment-678374178

foch01 commented 3 years ago

The problem is still unresolved on my side; I have the same problem.

stagiaire-dev commented 2 years ago

The problem is still unresolved on my side; I have the same problem.

Have you found a solution to share?

stagiaire-dev commented 2 years ago

Have you found a solution to share?

erwann-sh commented 1 year ago

Leboncoin.fr has very good protection against bots. For me, the solution was to control an existing web browser so that my bot interacts with the UI based on the positions of elements on the screen.

udiudiudiudi commented 1 year ago

Yeah, Leboncoin is using DataDome protection, which comes from another French company.

A combination of a web view, saving working cookies, and doing mouse movements in a B-spline pattern (like a normal human) would defeat them.
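
To illustrate the B-spline mouse movement mentioned above, here is a rough sketch that samples a uniform cubic B-spline through a set of control points and replays the samples with Puppeteer's `page.mouse.move`. The function names, step count, and delay are assumptions made for illustration, not the commenter's actual method, and the sketch does not cover the web view or cookie parts of the suggestion.

```javascript
// Evaluate one uniform cubic B-spline segment at t in [0, 1].
function bsplinePoint(p0, p1, p2, p3, t) {
  const t2 = t * t;
  const t3 = t2 * t;
  // Uniform cubic B-spline basis functions; they always sum to 1.
  const b0 = (1 - 3 * t + 3 * t2 - t3) / 6;
  const b1 = (4 - 6 * t2 + 3 * t3) / 6;
  const b2 = (1 + 3 * t + 3 * t2 - 3 * t3) / 6;
  const b3 = t3 / 6;
  return {
    x: b0 * p0.x + b1 * p1.x + b2 * p2.x + b3 * p3.x,
    y: b0 * p0.y + b1 * p1.y + b2 * p2.y + b3 * p3.y,
  };
}

// Sample a smooth path from at least four control points ({ x, y } objects).
function bsplinePath(controlPoints, stepsPerSegment = 20) {
  if (controlPoints.length < 4) {
    throw new Error("need at least 4 control points");
  }
  const path = [];
  // Each window of four consecutive control points defines one segment.
  for (let i = 0; i + 3 < controlPoints.length; i++) {
    const [p0, p1, p2, p3] = controlPoints.slice(i, i + 4);
    for (let s = 0; s < stepsPerSegment; s++) {
      path.push(bsplinePoint(p0, p1, p2, p3, s / stepsPerSegment));
    }
  }
  return path;
}

// Replay the path on a Puppeteer page, pausing briefly between moves.
async function moveAlongPath(page, path, delayMs = 10) {
  for (const { x, y } of path) {
    await page.mouse.move(x, y);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

Note that a uniform B-spline is pulled toward its control points without passing through them, so the cursor's start and end positions depend on the first and last few control points. Adding small random offsets to the control points and varying `delayMs` would make each run differ.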

nmaisonneuve commented 1 year ago

@udiudiudiudi Did you manage to bypass it? Can you elaborate a bit? I am interested; DM me @nmaisonneuve.

albertlieyingadrian commented 1 year ago

Interested in the solution to this problem still!

duchnoun commented 1 year ago

It seems that we can't bypass it now. Explanation here: https://github.com/berstend/puppeteer-extra/issues/788

bazaned commented 6 months ago

hello, any news?