lexiforest / curl_cffi

Python binding for the curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
https://curl-cffi.readthedocs.io/
MIT License
2.49k stars 265 forks

[BUG + Feature] Some issues with custom impersonation #355

Closed novitae closed 3 months ago

novitae commented 4 months ago

Hey, I tried the custom impersonation (#15) today, and so far it's really good, but I have noticed some small issues and missing things.

As a test, I am trying to impersonate the latest version of Safari on macOS (19618.2.12.11.6), using https://tls.peet.ws/api/all.

Here is my script; I am comparing the result I get in Safari with the resulting JSON:

```py
from curl_cffi import requests
from curl_cffi import const
import ujson

response = requests.request(
    "GET",
    "https://tls.peet.ws/api/all",
    headers={
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "sec-fetch-site": "same-origin",
        "sec-fetch-dest": "document",
        "accept-language": "fr-FR,fr;q=0.9",
        "sec-fetch-mode": "navigate",
        "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15",
        "referer": "https://tls.peet.ws/",
        "accept-encoding": "gzip, deflate, br"
    },
    extra_fp=requests.impersonate.ExtraFingerprints(
        tls_min_version=const.CurlSslVersion.TLSv1_0,
        tls_grease=True,
        tls_cert_compression="zlib",
        tls_signature_algorithms=[
            "ecdsa_secp256r1_sha256",
            "rsa_pss_rsae_sha256",
            "rsa_pkcs1_sha256",
            "ecdsa_secp384r1_sha384",
            "ecdsa_sha1",
            "rsa_pss_rsae_sha384",
            "rsa_pss_rsae_sha384",
            "rsa_pkcs1_sha384",
            "rsa_pss_rsae_sha512",
            "rsa_pkcs1_sha512",
            "rsa_pkcs1_sha1"
        ],
        http2_stream_weight=255,
        http2_stream_exclusive=0,
    ),
    ja3="771,4865-4866-4867-49196-49195-52393-49200-49199-52392-49162-49161-49172-49171-157-156-53-47-49160-49170-10,0-23-65281-10-11-16-5-13-18-51-45-43-27-21,29-23-24-25,0",
    akamai="2:0,4:4194304,3:100|10485760|0|m,s,p,a",
    http_version=const.CurlHttpVersion.V2_0,
)

with open("resp.json", "w") as write:
    ujson.dump(response.json(), write, indent=2, escape_forward_slashes=False)
```

1. The first thing that happens is that one of the ciphers is not supported:

File /opt/homebrew/lib/python3.11/site-packages/curl_cffi/requests/session.py:257, in BaseSession._set_ja3_options(self, curl, ja3, permute)
    255 for cipher in ciphers.split("-"):
    256     cipher_id = int(cipher)
--> 257     cipher_name = TLS_CIPHER_NAME_MAP[cipher_id]
    258     cipher_names.append(cipher_name)
    260 curl.setopt(CurlOpt.SSL_CIPHER_LIST, ":".join(cipher_names))

KeyError: 49160

I think it would be nice to use the following code at https://github.com/yifeikong/curl_cffi/blob/d1e55dd057aacc0e242d748d15ba3cb5a0ee8a84/curl_cffi/requests/session.py#L254-L258, so that the error says which cipher is missing:

        cipher_names = []
        for cipher in ciphers.split("-"):
            cipher_id = int(cipher)
            if cipher_id in TLS_CIPHER_NAME_MAP:
                cipher_names.append(TLS_CIPHER_NAME_MAP[cipher_id])
            else:
                raise KeyError(f'cipher "{hex(cipher_id)}" not currently supported')

In my case, it was the cipher 0xc008, aka TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA (which I found here). So I added it to the TLS_CIPHER_NAME_MAP object, and then it was fine.
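To spot every unsupported cipher in a JA3 string at once, rather than hitting them one `KeyError` at a time, something like the sketch below could be used. `KNOWN_CIPHERS` is a deliberately tiny, hypothetical stand-in for the library's `TLS_CIPHER_NAME_MAP`, and `missing_ciphers` is an illustrative helper, not part of curl_cffi.

```python
# Hypothetical stand-in for curl_cffi's TLS_CIPHER_NAME_MAP (only three entries).
KNOWN_CIPHERS = {
    0x1301: "TLS_AES_128_GCM_SHA256",
    0x1302: "TLS_AES_256_GCM_SHA384",
    0x1303: "TLS_CHACHA20_POLY1305_SHA256",
}


def missing_ciphers(ja3: str, known: dict[int, str]) -> list[str]:
    """Return the hex codes of the ciphers in a JA3 string that are absent from `known`."""
    # JA3 layout: version,ciphers,extensions,curves,point_formats
    fields = ja3.split(",")
    if len(fields) != 5:
        raise ValueError(f"expected 5 comma-separated JA3 fields, got {len(fields)}")
    cipher_ids = [int(c) for c in fields[1].split("-") if c]
    return [f"0x{cid:04x}" for cid in cipher_ids if cid not in known]


# Shortened JA3 string for illustration; 49160 is the cipher from the traceback.
ja3 = "771,4865-4866-4867-49160,0-23-65281,29-23-24-25,0"
print(missing_ciphers(ja3, KNOWN_CIPHERS))  # ['0xc008']
```

Reporting the value in hex makes it easier to look up in the IANA cipher suite registry, which lists suites by their two-byte code.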

2. Then, when comparing the GREASE values, I noticed that the code differs everywhere it appears, and that the codes sometimes change (check the screenshots below):

[Six screenshots, 2024-07-20, taken at 19:50:59, 19:51:10, 19:52:11, 19:51:57, 19:51:45 and 19:51:22]
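Note that the 16-bit GREASE values defined in RFC 8701 (0x0A0A, 0x1A1A, ..., 0xFAFA) are chosen at random by the client, so they are expected to differ between captures. A minimal sketch for masking them before diffing two lists of cipher or extension codes; `is_grease` and `mask_grease` are illustrative names, not part of curl_cffi:

```python
def is_grease(value: int) -> bool:
    """True for the 16-bit GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA (RFC 8701)."""
    high, low = value >> 8, value & 0xFF
    # Both bytes are identical and the low nibble of each byte is 0xA.
    return high == low and (low & 0x0F) == 0x0A


def mask_grease(values: list[int]) -> list[object]:
    """Replace every GREASE value with the placeholder string 'GREASE'."""
    return ["GREASE" if is_grease(v) else v for v in values]


# Two made-up captures that differ only in the GREASE value they picked.
safari_run = [0x2A2A, 0x1301, 0x1302]
curl_run = [0xDADA, 0x1301, 0x1302]
print(mask_grease(safari_run) == mask_grease(curl_run))  # True
```

With the GREASE slots masked, any remaining difference between the two lists is a real fingerprint mismatch.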

3. Then I also noticed an issue with the settings part of the akamai string (the part before the first |): it does not seem to be set correctly, and everything after the first , is ignored:

[Screenshot, 2024-07-20 19:54:00]

This results in an alteration of the settings frame sent:

[Screenshot, 2024-07-20 19:55:03]

That's pretty much everything from my tests. Thank you very much for all of this already; it is going to be very, very useful!

novitae commented 4 months ago

Here are the two JSON files I am comparing in the screenshots, if you want them: resp.json, safari.json

perklet commented 4 months ago

  1. These are some old ciphers that the original BoringSSL does not support; I forgot to add their values on the Python side.
  2. GREASE values are expected to change; the "R" actually means random.
  3. We use semicolons to separate h2 settings values, not commas, e.g. 1:65536;2:0;4:6291456;6:262144.

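Following point 3, a string that uses commas in the settings part can be converted before it is passed to `akamai=`. The sketch below assumes the four-field `settings|window_update|priority|pseudo_header_order` layout seen in the string from the original post; `normalize_akamai` is an illustrative helper, not part of curl_cffi.

```python
def normalize_akamai(fingerprint: str) -> str:
    """Rewrite comma-separated SETTINGS pairs to the semicolon-separated form."""
    parts = fingerprint.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}")
    settings, window_update, priority, pseudo_headers = parts
    # Only the settings field is touched: the pseudo-header order
    # (e.g. "m,s,p,a") legitimately uses commas.
    settings = settings.replace(",", ";")
    return "|".join([settings, window_update, priority, pseudo_headers])


print(normalize_akamai("2:0,4:4194304,3:100|10485760|0|m,s,p,a"))
# 2:0;4:4194304;3:100|10485760|0|m,s,p,a
```

A string that already uses semicolons passes through unchanged, so the helper can be applied unconditionally.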
novitae commented 4 months ago

Thank you for the explanations!

perklet commented 4 months ago

Let's keep this open until I add the missing values to TLS_CIPHER_NAME_MAP.