yifeikong / curl_cffi

Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
https://curl-cffi.readthedocs.io/
MIT License

[BUG] Failed to perform, curl: (23) when requesting Brave search #320

Closed by oleg-the-dev 1 month ago

oleg-the-dev commented 1 month ago

Describe the bug

I'm trying to request Brave search, but it fails with a Failed to perform, curl: (23) error whether I use the sync or the async API.

To Reproduce

from curl_cffi import requests

# This gives me an error
r = requests.get("https://search.brave.com/search", params={"q": "cats"}, impersonate="chrome")

# This doesn't
r = requests.get("https://search.brave.com/search", impersonate="chrome")

###############################################################
# Same with async

import asyncio
from curl_cffi.requests import AsyncSession

async def main():
    async with AsyncSession() as session:
        r = await session.get("https://search.brave.com/search", params={"q": "cats"})
    return r

asyncio.run(main())

Versions

yifeikong commented 1 month ago

Brave is not following the RFCs: there is no Content-Length response header when Accept-Encoding is set to gzip or br. Browsers may tolerate this malformed response, but unfortunately curl does not.

To work around this, remove the Accept-Encoding header by overriding it with an empty string:

r = requests.get("https://search.brave.com/search", params={"q": "cats"}, headers={"Accept-Encoding": ""}, impersonate="chrome120")
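If you would rather keep the default headers for sites that behave and only fall back when this error appears, the workaround above could be wrapped roughly as follows. This is only a sketch: the helper name get_with_encoding_fallback is hypothetical and not part of curl_cffi, and it assumes the failure surfaces as an exception whose message contains the "curl: (23)" text quoted in this issue. The request function is passed in as an argument, so the helper itself has no dependencies.

```python
from typing import Any, Callable, Optional


def get_with_encoding_fallback(
    get: Callable[..., Any],
    url: str,
    *,
    headers: Optional[dict] = None,
    **kwargs: Any,
) -> Any:
    """Call get(url, ...); on a curl (23) failure, retry once with an
    empty Accept-Encoding header.

    Hypothetical helper, not part of curl_cffi.
    """
    try:
        return get(url, headers=headers, **kwargs)
    except Exception as exc:
        # Assumption: the error message contains "curl: (23)", as in the
        # error text reported in this issue. Anything else is re-raised.
        if "curl: (23)" not in str(exc):
            raise

    # Copy the caller's headers so the original dict is left untouched,
    # then apply the workaround from this thread.
    retry_headers = dict(headers or {})
    retry_headers["Accept-Encoding"] = ""
    return get(url, headers=retry_headers, **kwargs)
```

With curl_cffi installed, it would be called as get_with_encoding_fallback(requests.get, "https://search.brave.com/search", params={"q": "cats"}, impersonate="chrome"), where requests comes from `from curl_cffi import requests`. Note that the retry sends the request a second time, so this is only appropriate for idempotent requests such as GET.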
yifeikong commented 1 month ago

Related curl issue: https://github.com/curl/curl/issues/5200 and Cloudflare community discussion: https://community.cloudflare.com/t/missing-content-length-in-http-headers-for-some-urls/571349