apify / crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
https://crawlee.dev
Apache License 2.0

Connect to remote browser services #1822

Open vimagick opened 1 year ago

vimagick commented 1 year ago

Which package is the feature request for? If unsure which one to select, leave blank

@crawlee/browser (BrowserCrawler)

Feature

There are cloud browser services such as Browserless. Crawlee should be able to connect to them, so that we can use remote browsers to run our automation tasks.

Motivation

Allows remote programs to connect to, pilot, and execute headless browser tasks.

Ideal solution or implementation, and any additional constraints

Add a connectOptions setting, similar to the existing launchOptions.
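
To make the request concrete, here is a minimal sketch of how such an option could behave. None of this is Crawlee's actual API: the LaunchContext shape, the BrowserDriver interface, and the openBrowser helper are all hypothetical names used for illustration. The only assumption about real libraries is that the underlying driver exposes both launch and connect, as Puppeteer does (its connect accepts a browserWSEndpoint).

```typescript
// Hypothetical option shapes, for illustration only (not Crawlee's real types).
interface LaunchOptions {
  headless?: boolean;
  args?: string[];
}

interface ConnectOptions {
  // WebSocket endpoint of the remote browser service.
  browserWSEndpoint: string;
}

// Hypothetical launch context: connectOptions sits next to launchOptions.
interface LaunchContext {
  launchOptions?: LaunchOptions;
  connectOptions?: ConnectOptions;
}

// Minimal view of a browser library that can both launch and connect.
interface BrowserDriver<B> {
  launch(options: LaunchOptions): Promise<B>;
  connect(options: ConnectOptions): Promise<B>;
}

// Connect to a remote browser when connectOptions is given,
// otherwise fall back to launching a local one.
async function openBrowser<B>(
  driver: BrowserDriver<B>,
  context: LaunchContext,
): Promise<B> {
  if (context.connectOptions && context.launchOptions) {
    throw new Error("Provide either launchOptions or connectOptions, not both.");
  }
  if (context.connectOptions) {
    return driver.connect(context.connectOptions);
  }
  return driver.launch(context.launchOptions ?? {});
}
```

With something like this, switching a crawler from a local browser to a remote service would only mean replacing launchOptions with connectOptions that carry the service's WebSocket endpoint. Rejecting both options at once is just one possible design choice; the library could equally let connectOptions take precedence.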

Alternative solutions or implementations

No response

Other context

No response

brunapereira commented 10 months ago

Any updates on this thread?

JorritKeijzer commented 9 months ago

Bumping this

spaceworkplatform commented 2 months ago

Anything new about this? Being able to connect to a remote Chrome would be super helpful for scaling.