Closed (perklet closed this issue 1 year ago)
Currently, with all the built-in impersonations, the headers are sent in the correct order according to my testing. Are you referring to a possible case where the HTTP headers are manually set by the user in a different order (e.g. Accept-Encoding before User-Agent)? Can you share an example where this affects you?
I just checked the header order with a simple echo server, and the order is correct. I was looking at the "source" request headers in Chrome DevTools, which are sorted alphabetically, and I was misled by that. Thanks for your explanation, and sorry for the bother.
Connection from ('127.0.0.1', 57396)
GET / HTTP/1.1
Host: localhost:8888
Connection: keep-alive
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
DNT: 1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en,zh-CN;q=0.9,zh-TW;q=0.8,zh;q=0.7
Cookie: jenkins-timestamper-offset=-28800000
Connection from ('127.0.0.1', 61000)
GET / HTTP/1.1
Host: localhost:8888
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: AAEAAQAAAAIAAAAAAAMAAAPoAAQAYAAAAAYABAAA
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
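For anyone who wants to repeat this check, here is a minimal sketch of such an echo server. It is not the script used above; the function names (`read_request_head`, `serve_once`, `serve_forever`) and port 8888 are assumptions made for illustration. It prints the request head exactly as received, so the header order on the wire is visible.

```python
import socket


def read_request_head(conn):
    """Read from the socket until the blank line that ends the header block."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:  # client closed the connection early
            break
        data += chunk
    # Keep only the request line and headers, in the order they arrived.
    return data.split(b"\r\n\r\n", 1)[0].decode("latin-1")


def serve_once(server_sock):
    """Accept one connection, print its request head, and answer with an empty 200."""
    conn, addr = server_sock.accept()
    with conn:
        print("Connection from", addr)
        head = read_request_head(conn)
        print(head)
        conn.sendall(
            b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
        )
    return head


def serve_forever(port=8888):
    """Hypothetical entry point: handle requests one at a time until interrupted."""
    with socket.create_server(("127.0.0.1", port)) as server_sock:
        while True:
            serve_once(server_sock)
```

Calling `serve_forever()` and pointing a browser or an impersonating client at `http://localhost:8888/` should print output similar to the captures above. Note that this only shows HTTP/1.1 header order; HTTP/2 requests would need a different tool.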
Thanks for reporting back @yifeikong, was about to start debugging this issue here as well, to avoid future detection. So I guess we're good.
I was recording some captures today using Wireshark.
curl binary: The order of the specified header fields is respected. It seems that -H 'Host: example.com' is always the first element no matter where it is specified.
curl -XPOST 'http://google.com' \
-H 'Host: facebook.com' \
-H 'Accept: application/json' \
-H 'User-Agent: yyyy'
indeed produces
POST / HTTP/1.1
Host: facebook.com
Accept: application/json
User-Agent: yyyy
If we switch up Accept and User-Agent, this is indeed respected.
curl -XPOST 'http://google.com' \
-H 'Host: facebook.com' \
-H 'User-Agent: yyyy' \
-H 'Accept: application/json'
produces:
POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json
Interestingly, we can specify a header field twice, and both copies are sent:
curl -XPOST 'http://google.com' \
-H 'Host: facebook.com' \
-H 'User-Agent: yyyy' \
-H 'Accept: application/json' \
-H 'User-Agent: xxxx'
POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json
User-Agent: xxxx
When we try to put the Host header field in the last position, like so:
curl -XPOST 'http://google.com' \
-H 'User-Agent: yyyy' \
-H 'Accept: application/json' \
-H 'User-Agent: xxxx' \
-H 'Host: facebook.com'
it produces the same output as the previous command:
POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json
User-Agent: xxxx
I cannot find any section in the official HTTP RFCs that forces Host to be the first header field, but throughout my career I have never seen it anywhere else.
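To compare captures like these without eyeballing them, a small helper can pull the header names out of a raw request in order. This is only a sketch: `header_order` is a hypothetical name, and the capture string is copied from the last output above.

```python
def header_order(raw_request: str) -> list[str]:
    """Return header field names in the order they appear in a raw request head."""
    lines = raw_request.strip().splitlines()
    names = []
    for line in lines[1:]:  # skip the request line, e.g. "POST / HTTP/1.1"
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, _value = line.partition(":")
        if sep:
            names.append(name.strip())
    return names


# Capture from the last curl command above (Host was passed last).
capture = """POST / HTTP/1.1
Host: facebook.com
User-Agent: yyyy
Accept: application/json
User-Agent: xxxx
"""

requested = ["User-Agent", "Accept", "User-Agent", "Host"]
sent = header_order(capture)

print(sent)  # ['Host', 'User-Agent', 'Accept', 'User-Agent']
# Host moved to the front; everything else kept the order given on the command line.
print(sent[0] == "Host")  # True
print(sent[1:] == [n for n in requested if n != "Host"])  # True
```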
Besides HTTP/2 fingerprinting, the order of HTTP headers can be used for fingerprinting as well.
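To illustrate the idea, here is a rough sketch of how a server could turn header order into a fingerprint. This is a made-up scheme for illustration only, not what any particular CDN or bot-detection product does; `order_fingerprint` and the sample header lists are assumptions.

```python
import hashlib


def order_fingerprint(header_names):
    """Hash the sequence of header names (case-insensitive), ignoring their values."""
    joined = ",".join(name.strip().lower() for name in header_names)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()[:16]


# Same set of headers, sent in two different orders.
browser_like = ["Host", "Connection", "User-Agent", "Accept", "Accept-Encoding"]
reordered = ["Host", "Connection", "Accept-Encoding", "User-Agent", "Accept"]

# A different order gives a different fingerprint, even with identical header names.
print(order_fingerprint(browser_like) == order_fingerprint(reordered))  # False

# Header-name casing does not matter in this sketch, only the order does.
upper = [name.upper() for name in browser_like]
print(order_fingerprint(browser_like) == order_fingerprint(upper))  # True
```

A client whose User-Agent claims to be Chrome but whose header order does not match what Chrome sends would stand out under a check like this, which is why the order matters for impersonation.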
According to the Amazon CloudFront docs:
Unfortunately, the order of headers is fixed in curl: it does not send headers in the order they were added, and the order cannot be changed freely. This was discussed in curl/curl#3282 and has been on the TODO list for a very long time.
As you can see in curl/lib/http.c:
It seems unlikely that curl will change this. Could you please consider adding this feature to curl-impersonate? Thanks.