lwthiker / curl-impersonate

curl-impersonate: A special build of curl that can impersonate Chrome & Firefox
MIT License

Impersonate http header order fingerprint #149

Closed perklet closed 1 year ago

perklet commented 1 year ago

Besides HTTP/2 fingerprinting, the order of HTTP headers can also be used for fingerprinting.

According to Amazon Cloudfront docs:

Amazon CloudFront now supports the “Cloudfront-viewer-header-order” and "Cloudfront-viewer-header-count" headers, enabling customers to track the total number of HTTP headers sent with each request, as well as the order in which the headers were sent. Customers can use the two headers to detect and identify request patterns and compare them to the expected and legitimate patterns. This, used in conjunction with other access control rules, can help customers detect and block any attempts to spoof requests.
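
As a rough illustration of how a site operator might use these headers, the sketch below compares an observed header order with an expected browser pattern. It assumes the `CloudFront-Viewer-Header-Order` value is a colon-separated list of header names. The function names and the `EXPECTED_CHROME_ORDER` list (the headers common to both Chrome 110 captures later in this thread) are hypothetical, not part of any AWS API.

```python
# Hypothetical expected order: headers common to both captures in this thread.
EXPECTED_CHROME_ORDER = [
    "host", "connection", "sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform",
    "upgrade-insecure-requests", "user-agent", "accept", "sec-fetch-site",
    "sec-fetch-mode", "sec-fetch-user", "sec-fetch-dest", "accept-encoding",
    "accept-language",
]


def parse_header_order(value: str) -> list[str]:
    """Split a colon-separated header-order value into lowercase names (assumed format)."""
    return [name.strip().lower() for name in value.split(":") if name.strip()]


def follows_expected_order(observed: list[str], expected: list[str]) -> bool:
    """True if the observed headers that are also in `expected` keep the same relative order."""
    positions = {name: index for index, name in enumerate(expected)}
    ranks = [positions[name] for name in observed if name in positions]
    return ranks == sorted(ranks)


browser_like = parse_header_order("Host:User-Agent:Accept:Accept-Encoding")
reordered = parse_header_order("Host:Accept-Encoding:User-Agent:Accept")

print(follows_expected_order(browser_like, EXPECTED_CHROME_ORDER))  # True
print(follows_expected_order(reordered, EXPECTED_CHROME_ORDER))     # False
```

Headers that are not in the expected list are ignored here; a stricter check could also compare the header count.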

Unfortunately, the header order in curl is fixed: it does not send headers in the order they were added, and the order cannot be changed at will. This was discussed in curl/curl#3282 and has been on the TODO list for a very long time.

As you can see in curl/lib/http.c:

    Curl_dyn_addf(&req,
                  " HTTP/%s\r\n" /* HTTP version */
                  "%s" /* host */
                  "%s" /* proxyuserpwd */
                  "%s" /* userpwd */
                  "%s" /* range */
                  "%s" /* user agent */
                  "%s" /* accept */
                  "%s" /* TE: */
                  "%s" /* accept-encoding */
                  "%s" /* referer */
                  "%s" /* Proxy-Connection */
                  "%s" /* transfer-encoding */
                  "%s",/* Alt-Used */

It seems unlikely that curl will change this. Could you please consider adding this feature to curl-impersonate? Thanks.

lwthiker commented 1 year ago

Currently, with all the built-in impersonations, the headers are sent in the correct order according to my testing. Are you referring to a possible case where the HTTP headers are manually set by the user and in a different order (e.g. Accept-Encoding before User-Agent)? Can you share an example where this affects you?

perklet commented 1 year ago

I just checked the header order with a simple echo server, and the order is correct. I was looking at the "source" request headers in Chrome DevTools, which are sorted alphabetically, and that misled me. Thanks for your explanation and sorry for the bother.
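
For anyone who wants to repeat this check, here is a minimal sketch of such an echo server using only Python's standard `socket` module. It is not necessarily the server used for the captures in this comment. The names `header_names` and `serve` and the port 8888 default are assumptions for illustration.

```python
import socket


def header_names(raw_request: bytes) -> list[str]:
    """Return header field names in the order they appear on the wire."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")[1:]  # drop the request line
    return [line.split(":", 1)[0] for line in lines if ":" in line]


def serve(host: str = "127.0.0.1", port: int = 8888) -> None:
    """Print the raw head of every incoming request, then reply with an empty 200."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, addr = server.accept()
            with conn:
                data = b""
                # Read until the blank line that ends the request head.
                while b"\r\n\r\n" not in data:
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    data += chunk
                print("Connection from", addr)
                print(data.decode("latin-1"))
                conn.sendall(
                    b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
                )


# Offline check of the parser on a shortened request (no socket needed).
sample = (
    b"GET / HTTP/1.1\r\n"
    b"Host: localhost:8888\r\n"
    b"User-Agent: demo\r\n"
    b"Accept: */*\r\n\r\n"
)
wire_order = header_names(sample)
print(wire_order)          # order as sent on the wire
print(sorted(wire_order))  # an alphabetical view, like the one that misled me
```

Calling `serve()` and pointing a browser or a curl-impersonate wrapper at `http://localhost:8888` prints each request head exactly as received, so the wire order can be compared directly.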

Connection from ('127.0.0.1', 57396)
GET / HTTP/1.1
Host: localhost:8888
Connection: keep-alive
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
DNT: 1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en,zh-CN;q=0.9,zh-TW;q=0.8,zh;q=0.7
Cookie: jenkins-timestamper-offset=-28800000

Connection from ('127.0.0.1', 61000)
GET / HTTP/1.1
Host: localhost:8888
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: AAEAAQAAAAIAAAAAAAMAAAPoAAQAYAAAAAYABAAA
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9

jlcd commented 1 year ago

Thanks for reporting back, @yifeikong. I was about to start debugging this issue as well, to avoid future detection. So I guess we're good.

SmartArray commented 1 year ago

I was recording some captures today using Wireshark.

I could observe the following behaviour of the normal curl binary:

The order of the specified header fields is respected. The exception seems to be -H 'Host: example.com', which is always sent first no matter where it is specified.

curl -XPOST 'http://google.com'  \
  -H 'Host: facebook.com' \
  -H 'Accept: application/json' \
  -H 'User-Agent: yyyy'

indeed produces

POST / HTTP/1.1
Host: facebook.com
Accept: application/json
User-Agent: yyyy

If we swap Accept and User-Agent, the new order is respected as well.

curl -XPOST 'http://google.com'  \
  -H 'Host: facebook.com' \
  -H 'User-Agent: yyyy' \
  -H 'Accept: application/json'

produces:

POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json

Interestingly we can specify header fields twice:

curl -XPOST 'http://google.com'  \
  -H 'Host: facebook.com' \
  -H 'User-Agent: yyyy' \
  -H 'Accept: application/json' \
  -H 'User-Agent: xxxx'

produces:

POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json
User-Agent: xxxx

Host always first

When we try to put the Host header field in the last position, like so:

curl -XPOST 'http://google.com'  \
  -H 'User-Agent: yyyy' \
  -H 'Accept: application/json' \
  -H 'User-Agent: xxxx' \
  -H 'Host: facebook.com'

it produces the same output as the previous command:

POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json
User-Agent: xxxx
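
The behaviour seen in these captures can be summarised as a small model: Host moves to the front, and every other `-H` header keeps the position it was given, duplicates included. The sketch below only describes the four captures above. It is not curl's implementation, it does not cover headers that curl adds on its own, and `model_wire_order` is a hypothetical name.

```python
def model_wire_order(headers: list[str]) -> list[str]:
    """Model of the captures: Host first, all other headers in the order given."""

    def is_host(header: str) -> bool:
        return header.split(":", 1)[0].strip().lower() == "host"

    host = [h for h in headers if is_host(h)]
    rest = [h for h in headers if not is_host(h)]
    return host + rest


# The -H arguments from the last command above, in command-line order.
given = [
    "User-Agent: yyyy",
    "Accept: application/json",
    "User-Agent: xxxx",
    "Host: facebook.com",
]

for line in model_wire_order(given):
    print(line)
```

The printed order matches the last capture: Host, then User-Agent, Accept, and the second User-Agent.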

I cannot find any section in the official HTTP RFCs that forces Host to be the first header field, but in my career I have never seen it sent anywhere else.