jxlil / scrapy-impersonate

Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints.
MIT License
78 stars 9 forks

Fix multi-header compatibility with Scrapy #14

gg closed 1 month ago

gg commented 1 month ago

ImpersonateDownloadHandler._download_request() currently returns a scrapy.http.Response in which multiple headers with the same key are concatenated into a single comma-separated value. For example, response.headers.getlist('Set-Cookie') returns a list with 1 item even if multiple Set-Cookie headers were present in the HTTP response.

To be compatible with Scrapy's built-in HTTP11DownloadHandler and H2DownloadHandler, response.headers.getlist(key) should return multiple values when multiple headers with the same key are present in the HTTP response.
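To make the difference concrete, here is a minimal stdlib-only sketch. It does not import Scrapy or curl_cffi; the header values and the two helper functions (join_duplicates, keep_duplicates) are hypothetical and exist only to contrast the two shapes of data. It also shows why the joined form cannot be split back reliably: a Set-Cookie value may itself contain a comma, for instance inside an Expires date.

```python
# Header lines as they might arrive in one HTTP response (hypothetical values).
raw_headers = [
    ("Content-Type", "text/html"),
    ("Set-Cookie", "a=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT"),
    ("Set-Cookie", "b=2; Path=/"),
]


def join_duplicates(pairs):
    """Collapse repeated keys into one comma-separated string (the reported behavior)."""
    joined = {}
    for key, value in pairs:
        joined[key] = f"{joined[key]}, {value}" if key in joined else value
    return joined


def keep_duplicates(pairs):
    """Keep one list entry per header line (the behavior this PR asks for)."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


joined = join_duplicates(raw_headers)
grouped = keep_duplicates(raw_headers)

# Two cookies were sent, and the grouped form still has two entries.
print(len(grouped["Set-Cookie"]))

# Splitting the joined string on ", " yields three fragments, not two cookies,
# because the Expires date contains a comma of its own.
print(joined["Set-Cookie"].split(", "))
```

Once the values have been joined, downstream code such as a cookie middleware has no reliable way to tell a separator comma from a comma inside a value, which is why the fix has to happen where the headers are first copied.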

To fix this, we can pass curl_cffi.requests.Headers.multi_items() to Scrapy's Headers() constructor.
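The shape of that change can be sketched as follows. This is not the handler's actual code: StubCurlHeaders and StubScrapyHeaders are hypothetical stand-ins written so the example runs without either library installed. The stub's items() is assumed to mirror the comma-joined behavior described above, and its multi_items() is assumed to yield one pair per header line, as the method name suggests. Scrapy's real Headers class also stores values as bytes, which the stub skips.

```python
class StubCurlHeaders:
    """Hypothetical stand-in for curl_cffi.requests.Headers (not the real class)."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def multi_items(self):
        # One (key, value) tuple per header line, duplicates preserved.
        return list(self._pairs)

    def items(self):
        # Duplicates merged into one comma-separated value per key.
        merged = {}
        for key, value in self._pairs:
            merged[key] = f"{merged[key]}, {value}" if key in merged else value
        return list(merged.items())


class StubScrapyHeaders:
    """Hypothetical stand-in for scrapy.http.headers.Headers (not the real class)."""

    def __init__(self, pairs):
        self._values = {}
        for key, value in pairs:
            # Header names are case-insensitive, so normalize before grouping.
            self._values.setdefault(key.lower(), []).append(value)

    def getlist(self, key):
        return list(self._values.get(key.lower(), []))


curl_headers = StubCurlHeaders([
    ("Set-Cookie", "a=1"),
    ("Set-Cookie", "b=2"),
    ("Content-Type", "text/html"),
])

before = StubScrapyHeaders(curl_headers.items())       # reported behavior
after = StubScrapyHeaders(curl_headers.multi_items())  # proposed fix

print(before.getlist("Set-Cookie"))  # ['a=1, b=2']
print(after.getlist("Set-Cookie"))   # ['a=1', 'b=2']
```

The only difference between the two constructions is which view of the source headers is handed to the constructor, so the fix stays local to the place where the response headers are built.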

jxlil commented 1 month ago

Thanks!