lwthiker / curl-impersonate

curl-impersonate: A special build of curl that can impersonate Chrome & Firefox
MIT License

Have the headers changed a bit with Chrome 103? #89

Closed A-Posthuman closed 1 year ago

A-Posthuman commented 1 year ago

I captured the headers sent by Chrome 103 on my Windows laptop using your socat procedure. Compared to the output of the chrome101 curl-impersonate script, I see extra headers: Connection:, Cache-Control:, and DNT:. That last one might be due to my personal Chrome settings, not sure. The sec-ch-ua: string also seems to be a bit different.

Chrome 103:

GET / HTTP/1.1\r
Host: xx.xx.xx.xx:8443\r
Connection: keep-alive\r
Cache-Control: max-age=0\r
sec-ch-ua: ".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"\r
sec-ch-ua-mobile: ?0\r
sec-ch-ua-platform: "Windows"\r
DNT: 1\r
Upgrade-Insecure-Requests: 1\r
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36\r
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r
Sec-Fetch-Site: none\r
Sec-Fetch-Mode: navigate\r
Sec-Fetch-User: ?1\r
Sec-Fetch-Dest: document\r
Accept-Encoding: gzip, deflate, br\r
Accept-Language: en-US,en;q=0.9\r

curl-impersonate chrome101:

GET / HTTP/1.1\r
Host: localhost:8443\r
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"\r
sec-ch-ua-mobile: ?0\r
sec-ch-ua-platform: "Windows"\r
Upgrade-Insecure-Requests: 1\r
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36\r
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r
Sec-Fetch-Site: none\r
Sec-Fetch-Mode: navigate\r
Sec-Fetch-User: ?1\r
Sec-Fetch-Dest: document\r
Accept-Encoding: gzip, deflate, br\r
Accept-Language: en-US,en;q=0.9\r
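
A quick way to compare two captures like these is to diff the header names. The following is only a rough sketch: `headerNames` and `extraHeaders` are hypothetical helper names, and the two sample captures are shortened versions of the dumps above.

```javascript
// Sketch: list header names present in one raw HTTP/1.1 capture but not in another.
// Both helper names are made up for illustration.
function headerNames(raw) {
  return raw
    .split('\n')
    .slice(1)                                  // drop the request line
    .filter((line) => line.includes(':'))      // skip blank or malformed lines
    .map((line) => line.slice(0, line.indexOf(':')).trim().toLowerCase());
}

function extraHeaders(rawA, rawB) {
  const inB = new Set(headerNames(rawB));
  return headerNames(rawA).filter((name) => !inB.has(name));
}

// Shortened samples taken from the two captures above.
const chrome103 = [
  'GET / HTTP/1.1',
  'Host: xx.xx.xx.xx:8443',
  'Connection: keep-alive',
  'Cache-Control: max-age=0',
  'sec-ch-ua-mobile: ?0',
  'DNT: 1',
  'Upgrade-Insecure-Requests: 1',
].join('\n');

const chrome101 = [
  'GET / HTTP/1.1',
  'Host: localhost:8443',
  'sec-ch-ua-mobile: ?0',
  'Upgrade-Insecure-Requests: 1',
].join('\n');

console.log(extraHeaders(chrome103, chrome101));
```

For these shortened samples it reports connection, cache-control and dnt, in the order Chrome sent them.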

Related question: I tried setting custom headers via node-libcurl in my program by passing an array to my .get() method's 'HTTPHEADER' option, like below. In the resulting request, socat shows the Connection: and Cache-Control: headers at the end, even though my array lists them first. Is this a curl bug? Is there any way to override it to get the same order as Chrome 103?

let headers = [
    'Connection: keep-alive',
    'Cache-Control: max-age=0',
    'sec-ch-ua: ".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"',
    'sec-ch-ua-mobile: ?0',
    'sec-ch-ua-platform: "Windows"',
    'Upgrade-Insecure-Requests: 1',
    'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site: none',
    'Sec-Fetch-Mode: navigate',
    'Sec-Fetch-User: ?1',
    'Sec-Fetch-Dest: document',
    'Accept-Encoding: gzip, deflate, br',
    'Accept-Language: en-US,en;q=0.9'
];
let response = await aCurly.get('https://localhost:8443/', { SSL_VERIFYPEER: false, SSL_VERIFYHOST: false, HTTPHEADER: headers });

socat result:

GET / HTTP/1.1\r
Host: localhost:8443\r
sec-ch-ua: ".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"\r
sec-ch-ua-mobile: ?0\r
sec-ch-ua-platform: "Windows"\r
Upgrade-Insecure-Requests: 1\r
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36\r
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r
Sec-Fetch-Site: none\r
Sec-Fetch-Mode: navigate\r
Sec-Fetch-User: ?1\r
Sec-Fetch-Dest: document\r
Accept-Encoding: gzip, deflate, br\r
Accept-Language: en-US,en;q=0.9\r
Connection: keep-alive\r
Cache-Control: max-age=0\r

I tried editing the curl_chrome101 script to add these 2 headers in there, and same behavior: they end up at the bottom of the headers instead of top.

lwthiker commented 1 year ago

> I see extra headers for Connection:, Cache-Control:, and also DNT

I just tested Chrome 103 with Wireshark and the SSLKEYLOGFILE env var set and browsed to www.wikipedia.org. I didn't get these headers. It might have something to do with the local socat or the use of HTTP/1.1 (Connection for example is not used in HTTP/2). Anyway I wouldn't worry about adding them unless it looks absolutely necessary for fetching the website you need.

> Also the sec-ch-ua: string seems to be a bit different

Yes it changes every release.
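
To see exactly what changed between two releases, the value can be split into brand/version pairs. This is a minimal sketch using a simple regular expression rather than a full structured-header parser; `parseSecChUa` is a hypothetical helper name, and the two strings are the ones quoted earlier in this thread.

```javascript
// Sketch: split a sec-ch-ua value into { brand, version } pairs, preserving order.
// Assumes each entry has the form "brand";v="version" with no escaped quotes.
function parseSecChUa(value) {
  const pairs = [];
  const re = /"([^"]*)";v="([^"]*)"/g;
  let m;
  while ((m = re.exec(value)) !== null) {
    pairs.push({ brand: m[1], version: m[2] });
  }
  return pairs;
}

const secChUa101 = '" Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"';
const secChUa103 = '".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"';

console.log(parseSecChUa(secChUa101).map((p) => p.brand));
console.log(parseSecChUa(secChUa103).map((p) => p.brand));
```

Between these two strings, the placeholder brand, the order of the real brands and the version numbers all differ.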

> the Connection: and Cache-Control: headers end up at the end, despite my array passing them first

You are correct, this is a bug with my implementation. The user-supplied headers are always added after the built-in list of headers that curl-impersonate uses. I'll open a separate issue for that.
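
To illustrate the effect, here is a small model of the ordering seen in the socat capture earlier in this thread. It is not the actual libcurl-impersonate code. It assumes, based only on that capture, that a user header with the same name as a built-in one takes over the built-in position, and that any other user header is appended at the end. `mergeHeaders` is a hypothetical name.

```javascript
// Model of the observed ordering only, NOT curl-impersonate's real implementation.
// Assumption: same-name user headers replace the built-in value in place,
// and all remaining user headers are appended after the built-in list.
function mergeHeaders(builtIn, userSupplied) {
  const nameOf = (h) => h.slice(0, h.indexOf(':')).trim().toLowerCase();
  const user = new Map(userSupplied.map((h) => [nameOf(h), h]));

  const merged = builtIn.map((h) => {
    const key = nameOf(h);
    if (!user.has(key)) return h;
    const replacement = user.get(key);
    user.delete(key);            // consumed, so it is not appended again
    return replacement;
  });

  // Whatever is left has no built-in counterpart and ends up last.
  return merged.concat([...user.values()]);
}

const builtIn = ['sec-ch-ua-mobile: ?0', 'Accept-Language: en-US,en;q=0.9'];
const fromUser = ['Connection: keep-alive', 'Cache-Control: max-age=0', 'sec-ch-ua-mobile: ?0'];

console.log(mergeHeaders(builtIn, fromUser));
```

Under that assumption, Connection and Cache-Control come out last even though they were passed first, which matches what socat showed.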

> I tried editing the curl_chrome101 script to add these 2 headers in there, and same behavior: they end up at the bottom of the headers instead of top.

I just tried that and the order looks fine, same as supplied in the script. Can you verify again?

lwthiker commented 1 year ago

Opened #90

A-Posthuman commented 1 year ago

Here is what I get when manually editing the curl_chrome101 script:

  1. sudo nano /usr/local/bin/curl_chrome101
  2. add in the header lines so the script looks like this:
#!/bin/bash

# Find the directory of this script
dir=${0%/*}

# The list of ciphers can be obtained by looking at the Client Hello message in
# Wireshark, then converting it using this reference
# https://wiki.mozilla.org/Security/Cipher_Suites
"$dir/curl-impersonate-chrome" \
    --ciphers TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,ECDHE-ECDSA-AES128-GCM-SHA256,ECDHE-RSA-AES128-GCM-SHA256,ECDHE-ECDSA-AES256-GCM-SHA384,ECDHE-RSA-AES256-GCM-SHA384,ECDHE-ECDSA-CHACHA20-POLY1305,ECDHE-RSA-CHACHA20-POLY1305,ECDHE-RSA-AES128-SHA,ECDHE-RSA-AES256-SHA,AES128-GCM-SHA256,AES256-GCM-SHA384,AES128-SHA,AES256-SHA \
    -H 'Connection: keep-alive' \
    -H 'Cache-Control: max-age=0' \
    -H 'sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"' \
    -H 'sec-ch-ua-mobile: ?0' \
    -H 'sec-ch-ua-platform: "Windows"' \
    -H 'Upgrade-Insecure-Requests: 1' \
    -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36' \
    -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
    -H 'Sec-Fetch-Site: none' \
    -H 'Sec-Fetch-Mode: navigate' \
    -H 'Sec-Fetch-User: ?1' \
    -H 'Sec-Fetch-Dest: document' \
    -H 'Accept-Encoding: gzip, deflate, br' \
    -H 'Accept-Language: en-US,en;q=0.9' \
    --http2 --false-start --compressed \
    --tlsv1.2 --no-npn --alps \
    --cert-compression brotli \
    "$@"
  3. run socat in one shell: socat -v openssl-listen:8443,reuseaddr,fork,cert=server.pem,verify=0 echo
  4. In another shell, connect: curl_chrome101 -k https://localhost:8443
  5. result:
> 2022/07/24 09:45:05.293892  length=689 from=0 to=688
GET / HTTP/1.1\r
Host: localhost:8443\r
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"\r
sec-ch-ua-mobile: ?0\r
sec-ch-ua-platform: "Windows"\r
Upgrade-Insecure-Requests: 1\r
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36\r
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r
Sec-Fetch-Site: none\r
Sec-Fetch-Mode: navigate\r
Sec-Fetch-User: ?1\r
Sec-Fetch-Dest: document\r
Accept-Encoding: gzip, deflate, br\r
Accept-Language: en-US,en;q=0.9\r
Connection: keep-alive\r
Cache-Control: max-age=0\r

lwthiker commented 1 year ago

I just tried exactly the same and I get the correct order. Is it possible that you also have the CURL_IMPERSONATE environment variable set? That would explain it: with the variable set, the header-ordering bug in libcurl-impersonate kicks in.
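
If you launch the wrapper script from Node, one way to rule this out is to pass a copy of the environment with CURL_IMPERSONATE removed. A minimal sketch, where `stripImpersonateEnv` is a hypothetical helper name:

```javascript
// Sketch: return a copy of an environment object without CURL_IMPERSONATE,
// plus a flag saying whether the variable was present.
function stripImpersonateEnv(env) {
  const cleaned = { ...env };
  const wasSet = Object.prototype.hasOwnProperty.call(cleaned, 'CURL_IMPERSONATE');
  delete cleaned.CURL_IMPERSONATE;
  return { cleaned, wasSet };
}

const { cleaned, wasSet } = stripImpersonateEnv(process.env);
if (wasSet) {
  console.warn('CURL_IMPERSONATE is set in this shell; header order may be affected.');
}
// `cleaned` can then be passed as the `env` option of child_process.spawn().
```

For an interactive shell, checking `echo $CURL_IMPERSONATE` and running `unset CURL_IMPERSONATE` before the test achieves the same thing.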

A-Posthuman commented 1 year ago

Ahh, yes, you're right. I had forgotten that I added that env var to my .bashrc a while back. OK, that explains it.

lwthiker commented 1 year ago

Alright, I'll close this issue. We have #90 open for the libcurl-impersonate bug with the headers order.