apify / crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
https://apify.github.io/crawlee-python/
Apache License 2.0

feat: add support for decompressing *br* response content #226

Mantisus closed 6 days ago

Mantisus commented 6 days ago

Description

Adds support for decompressing Brotli-compressed server responses. This applies when the request sends an "Accept-Encoding" header that includes "br" (for example "br" or "gzip, deflate, br") and the server supports this type of compression.

HTTPX already provides the decoding functionality, but it only works when an additional Brotli library is installed.
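For illustration, here is a minimal sketch of how a client could advertise "br" only when a Brotli decoder is actually importable. The helper names `brotli_available` and `build_accept_encoding` are hypothetical and are not part of this PR or of Crawlee. The sketch assumes the decoder comes from the `brotli` or `brotlicffi` package, which is what HTTPX looks for.

```python
import importlib.util


def brotli_available() -> bool:
    """Return True if a Brotli decoder package can be imported."""
    # Assumption: the decoder is provided by "brotli" or "brotlicffi".
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("brotli", "brotlicffi")
    )


def build_accept_encoding(prefer_brotli: bool = True) -> str:
    """Hypothetical helper: build an Accept-Encoding header value.

    "br" is appended only when a decoder is installed, so the server
    never sends a body that the client cannot decompress.
    """
    encodings = ["gzip", "deflate"]
    if prefer_brotli and brotli_available():
        encodings.append("br")
    return ", ".join(encodings)


if __name__ == "__main__":
    # Prints "gzip, deflate, br" when a Brotli package is installed,
    # otherwise "gzip, deflate".
    print(build_accept_encoding())
```

The resulting value could then be passed as a request header, for example `httpx.get(url, headers={"Accept-Encoding": build_accept_encoding()})`. Installing HTTPX with its Brotli extra (`pip install "httpx[brotli]"`) is one way to pull in the decoder.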