apify / crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

https://crawlee.dev/python/

Apache License 2.0

4.02k stars, 254 forks

fix: handle blocked request #234

Closed (Mantisus closed this 3 months ago)

Mantisus commented 3 months ago

Description

Replaced find with select_one because find does not support CSS selectors, which caused incorrect behavior.
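
For illustration, here is a minimal sketch of the difference in BeautifulSoup. The HTML snippet and the selector are made up for this example and are not taken from the Crawlee code base; the point is only that find treats its first argument as a tag name, while select_one parses it as a CSS selector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only to illustrate the behavior.
html = '<div class="blocked"><p id="msg">Access denied</p></div>'
soup = BeautifulSoup(html, "html.parser")

# find() interprets the string as a tag name, so a CSS selector matches nothing.
by_find = soup.find("div.blocked #msg")

# select_one() interprets the string as a CSS selector and returns the first match.
by_select = soup.select_one("div.blocked #msg")

print(by_find)
print(by_select.get_text())
```

Because find silently returns None instead of raising an error, a check built on it can look like "no match on the page" when the selector was simply never evaluated as CSS.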

Related issues

#230