Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Currently, many environment variables, such as `APIFY_DEFAULT_REQUEST_QUEUE_ID` or `APIFY_LOCAL_EMULATION_DIR`, are set by `apify-cli`. To run SDK scripts without the CLI, one must therefore set those env vars manually (often more than 5 of them).
We should move the defaults into the SDK so that users are not forced to use the CLI or to define the env vars manually.
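
To illustrate the proposal, here is a minimal sketch of how the SDK could fall back to built-in defaults when a variable is not set. The helper name `withEnvDefaults`, the `ENV_DEFAULTS` map, and the default values themselves are assumptions made for this example. They are not the SDK's actual API or the values `apify-cli` sets.

```javascript
const path = require('node:path');

// Hypothetical defaults. The values are placeholders for illustration only,
// not the ones apify-cli actually sets.
const ENV_DEFAULTS = {
    APIFY_DEFAULT_REQUEST_QUEUE_ID: 'default',
    APIFY_LOCAL_EMULATION_DIR: path.resolve('apify_local'),
};

// Returns a copy of `env` in which every missing or empty variable
// listed in `defaults` is filled in. Explicitly set values always win.
function withEnvDefaults(env = process.env, defaults = ENV_DEFAULTS) {
    const resolved = { ...env };
    for (const [name, value] of Object.entries(defaults)) {
        if (resolved[name] === undefined || resolved[name] === '') {
            resolved[name] = value;
        }
    }
    return resolved;
}

// Usage: the SDK would read its configuration from the resolved copy,
// so a script runs the same with or without the CLI.
const env = withEnvDefaults();
console.log(env.APIFY_DEFAULT_REQUEST_QUEUE_ID);
```

Returning a copy instead of mutating `process.env` keeps the fallback local to the SDK. Values set by the CLI or by the user are left untouched.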