gildas-lormeau / single-file-cli

CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
GNU Affero General Public License v3.0

Implement native stealth mode #57

Open DeployThemAll opened 1 year ago

DeployThemAll commented 1 year ago

I'm trying to write a script that opens a few pages using Puppeteer in stealth mode.

I managed to load the SingleFile extension when launching the browser, but I can't send a request to the extension to start saving the current tab.

I'm doing this because I couldn't download some pages with the CLI without a stealth mode, and I don't know how to implement one for single-file.

I wish single-file had a native stealth mode.

gildas-lormeau commented 1 year ago

Sorry for the late answer. I guess the simplest solution would be to merge the code of https://github.com/gildas-lormeau/single-file-cli into your project.

DeployThemAll commented 1 year ago

No need to apologize. You are doing a great job updating and managing this repo.

I'm trying to download it with the CLI as well, but I always get hit by the network-check detection.

https://github.com/gildas-lormeau/single-file-cli/issues/54

You can try to download the URL using the CLI and see what I mean.

It would be an amazing update if SingleFile could integrate the stealth plugin as an option when using Puppeteer as a backend.

I'm open to other solutions as well.

DeployThemAll commented 1 year ago

Hey, can we expect a native stealth mode soon?

gildas-lormeau commented 8 months ago

You can now easily pass the path to your user data folder with the switch --browser-arg, e.g. ./single-file https://www.example.com --browser-arg="--user-data-dir=/Users/<username>/Library/Application Support/Google/Chrome" (on macOS).
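The same invocation can also be scripted from Node.js. What follows is only a minimal sketch, assuming the `single-file` executable sits in the current directory as in the example above. The helper names `buildSingleFileArgs` and `savePage` are hypothetical and not part of single-file-cli.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: builds the argument list for the command shown above.
// Passing arguments as an array (no shell) avoids quoting problems with the
// space in "Application Support".
function buildSingleFileArgs(url, userDataDir) {
    return [url, `--browser-arg=--user-data-dir=${userDataDir}`];
}

// Hypothetical helper: runs the CLI synchronously and returns its exit status.
// Assumes the executable lives in the current working directory.
function savePage(url, userDataDir, executable = "./single-file") {
    const result = spawnSync(executable, buildSingleFileArgs(url, userDataDir), {
        stdio: "inherit", // show the CLI's own output in the terminal
    });
    if (result.error) throw result.error; // e.g. executable not found
    return result.status;
}
```

A call would look like `savePage("https://www.example.com", "/Users/<username>/Library/Application Support/Google/Chrome")`, with `<username>` filled in for the local account.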

Is this feature sufficient to bypass Cloudflare and co.?

pirate commented 3 months ago

I've had success using a Chrome profile with logged-in accounts, passed to single-file-cli using --browser-args=..., to get around Cloudflare and other CAPTCHA-gated sites.

You can also install it as an extension in Chrome, connect manually to Chrome with puppeteer + puppeteer-extra-plugin-stealth, then simulate clicking the SingleFile extension icon to trigger it with all the stealth stuff applied:

const singlefile_ctx = ... // get the SingleFile service worker/background.js context

// `tab` must be declared by the caller; leave it undefined to use the active tab
await singlefile_ctx.evaluate(async (tab) => {
    // Use the given tab, or fall back to the currently focused foreground tab
    tab = tab || (await new Promise((resolve) =>
        chrome.tabs.query({currentWindow: true, active: true}, ([tab]) => resolve(tab))));

    // Simulate clicking the SingleFile extension icon in the menu bar
    await chrome.action.onClicked.dispatch(tab);
    // SingleFile activates and saves its output to ~/Downloads/singlefile.html
    // https://developer.chrome.com/docs/extensions/reference/api/action#event-onClicked
}, tab);
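
The `singlefile_ctx` placeholder above is left unspecified. One possible way to obtain it is sketched here under several assumptions: SingleFile is loaded from an unpacked extension directory, it runs as a Manifest V3 service worker, and `puppeteer-extra` and `puppeteer-extra-plugin-stealth` are installed. The names `extensionLaunchArgs`, `isExtensionWorker`, and `getExtensionContext` are hypothetical, and the launch flags follow Puppeteer's documented approach to loading extensions, which may not work in every Chrome build.

```javascript
// Hypothetical helper: Chrome flags that load exactly one unpacked extension.
function extensionLaunchArgs(extensionPath) {
    return [
        `--disable-extensions-except=${extensionPath}`,
        `--load-extension=${extensionPath}`,
    ];
}

// True for the service worker target of any extension. With only one
// extension loaded (see the flags above), this is assumed to be SingleFile.
function isExtensionWorker(target) {
    return target.type() === "service_worker" &&
        target.url().startsWith("chrome-extension://");
}

async function getExtensionContext(extensionPath) {
    // Required lazily so the helpers above work without the packages installed.
    const puppeteer = require("puppeteer-extra");
    const StealthPlugin = require("puppeteer-extra-plugin-stealth");
    puppeteer.use(StealthPlugin());

    // Assumption: a headful browser, since extensions have historically
    // not been available in headless mode.
    const browser = await puppeteer.launch({
        headless: false,
        args: extensionLaunchArgs(extensionPath),
    });

    // Wait for the extension's service worker and get a handle with evaluate().
    const target = await browser.waitForTarget(isExtensionWorker);
    const worker = await target.worker();
    return { browser, worker };
}
```

The returned `worker` could then stand in for `singlefile_ctx` in the snippet above. Whether the dispatched click behaves like a real one depends on the extension and Chrome version, so treat this as a starting point rather than a tested recipe.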