apify / fingerprint-suite

Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
Apache License 2.0

Browser hangs when navigating on shopee.com.my when overriding codecs #246

Closed: marcplouhinec closed this issue 8 months ago

marcplouhinec commented 10 months ago

Describe the bug: When navigating to shopee.com.my with Playwright and Firefox or Chromium, the browser hangs as if the event loop were no longer running (or were running very slowly).

To Reproduce: Run the following test script:

import {firefox} from "playwright";
import {newInjectedContext} from "fingerprint-injector";

(async () => {
    console.log('Create a new browser...');
    const browser = await firefox.launch({ // same behavior with chromium
        headless: false,
        slowMo: 0
    });
    const context = await newInjectedContext(browser, {});
    const page = await context.newPage();
    // const page = await browser.newPage(); // works well without fingerprint injection

    console.log('Open a page...');
    await page.goto(`https://shopee.com.my/search?keyword=nike`, {timeout: 20000});

    console.log('Wait and close the browser...');
    await page.waitForTimeout(60000);
    await page.close();
    await browser.close();

})().catch(e => {
    console.log('Error: %s', e);
});

After a few seconds, the hover effect on the language popup buttons stops working, and HTTP resources are no longer downloaded.

Expected behavior: The browser should behave normally, as it does without fingerprint injection.

System information:

Additional context: I investigated a bit and found a workaround. In overrideCodecs/findCodec in utils.js, the bug occurs because codecSpec is undefined when the codec string contains no ';', so calling codecSpec.includes(...) throws. I modified the findCodec function like this:

const findCodec = (codecString) => {
    const [mime, codecSpec] = codecString.split(';'); // codecSpec can be undefined
    if (mime === 'video/mp4') {
        if (codecSpec && codecSpec.includes('avc1.42E01E')) { // I added `codecSpec && ` to avoid the problem
            return {name: mime, state: 'probably'};
        }
    }

    // ...
};
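
To illustrate the failure mode described above, here is a minimal, self-contained sketch. The two functions are hypothetical, simplified stand-ins (named findCodecUnguarded and findCodecGuarded for this example only), not the library's actual findCodec; they only assume that the codec string is split on ';' as shown in the snippet above.

```javascript
// Hypothetical, simplified stand-ins for illustration; not the library's code.

// Without the guard: codecSpec is undefined when the input has no ';' part,
// so calling .includes on it throws a TypeError.
const findCodecUnguarded = (codecString) => {
    const [mime, codecSpec] = codecString.split(';');
    if (mime === 'video/mp4' && codecSpec.includes('avc1.42E01E')) {
        return { name: mime, state: 'probably' };
    }
    return undefined;
};

// With the guard: the `codecSpec &&` check short-circuits before .includes runs.
const findCodecGuarded = (codecString) => {
    const [mime, codecSpec] = codecString.split(';');
    if (mime === 'video/mp4' && codecSpec && codecSpec.includes('avc1.42E01E')) {
        return { name: mime, state: 'probably' };
    }
    return undefined;
};

try {
    findCodecUnguarded('video/mp4'); // bare MIME type, no codecs parameter
} catch (e) {
    console.log(e instanceof TypeError); // true
}

console.log(findCodecGuarded('video/mp4')); // undefined
console.log(findCodecGuarded('video/mp4; codecs="avc1.42E01E"').state); // probably
```

A page that calls canPlayType-style checks with a bare MIME type would hit the unguarded path, which is consistent with the behavior reported here.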

marcplouhinec commented 10 months ago

In the same vein, I had to modify overrideIntlAPI in utils.js as the key argument is sometimes a Symbol instead of a string:

function overrideIntlAPI(language){
    try {
        // ...

        overridePropertyWithProxy(window, 'Intl', {
            get(target, key){
                if(typeof key !== 'string' || key[0].toLowerCase() === key[0]) return target[key]; // Add a check in case key is not a string
                return new Proxy(
                    target[key],
                    innerHandler
                );
            }
        });
    } catch (e) {
        // ...
    }
}

Without this check, the resulting error caused the website to detect my scraper and redirect me to the login page. With both fixes, I can scrape data with Puppeteer 21.5.0 (Playwright is still detected).
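
As a minimal sketch of why the Symbol check matters: the example below wraps the global Intl object in a plain Proxy instead of the library's overridePropertyWithProxy, and uses an empty inner handler, so both are assumptions made for illustration. Operations such as Object.prototype.toString read the Symbol.toStringTag property, which reaches the get trap with a Symbol key.

```javascript
// Illustration only: a plain Proxy around Intl with an empty inner handler
// (assumptions; the library uses overridePropertyWithProxy and its own handler).
const innerHandler = {};

// Unguarded trap: for a Symbol key, key[0] is undefined,
// so key[0].toLowerCase() throws a TypeError.
const unguardedIntl = new Proxy(Intl, {
    get(target, key) {
        if (key[0].toLowerCase() === key[0]) return target[key];
        return new Proxy(target[key], innerHandler);
    },
});

// Guarded trap: non-string keys (Symbols) are passed through untouched.
const guardedIntl = new Proxy(Intl, {
    get(target, key) {
        if (typeof key !== 'string' || key[0].toLowerCase() === key[0]) return target[key];
        return new Proxy(target[key], innerHandler);
    },
});

try {
    Object.prototype.toString.call(unguardedIntl); // reads Symbol.toStringTag
} catch (e) {
    console.log(e instanceof TypeError); // true
}

console.log(Object.prototype.toString.call(guardedIntl)); // [object Intl]
console.log(typeof guardedIntl.DateTimeFormat); // function (wrapped in a Proxy)
```

A script that probes Intl this way can observe the thrown error in the unguarded case, which may be how a site distinguishes the patched browser from a regular one.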

barjin commented 9 months ago

Hello @marcplouhinec and thank you for your interest in this project (and sorry for the wait)!

Both changes seem reasonable, and a PR is welcome :) In case you haven't done this before: fork this project, make the changes, and then open a PR from your fork against this repository. This way, you can still make a pull request to this repository without being a member of the Apify GitHub organization.

Cheers!

thangman22 commented 9 months ago

@marcplouhinec Sorry to interrupt your contribution, but I ran into this bug too. @barjin, please merge my PR.

marcplouhinec commented 8 months ago

I just submitted pull request #257. It is very similar to @thangman22's, but I kept the changes as small as possible (no code reformatting).