apify / crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
https://crawlee.dev/python/
Apache License 2.0
4.65k stars 319 forks source link

Does crawlee-python has a task distributed or manage system like scrapyd and scrapyd-web? #677

Closed QThans closed 2 weeks ago