Closed Overload119 closed 2 years ago
I believe it's related to OpenSSL. I can't get the net/http example to work either.
Likewise. It hangs for me with both Net::HTTP and http.rb.
FWIW, here's a user reporting something similar with Python when accessing https://sephora.fr
https://stackoverflow.com/questions/71459063/scrapy-now-timesout-on-a-website-that-used-to-work-well
They make it sound like something that changed on the remote side, and suggested it might be related to Connection: close (but that sounds like a guess).
I get the same hang behavior with both Net::HTTP and http.rb when attempting to make a request to https://sephora.fr
For me it's reproducible with the openssl CLI:
$ openssl s_client -connect www.sephora.com:443
CONNECTED(00000005)
depth=2 C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert Global Root CA
verify return:1
depth=1 C = US, O = DigiCert Inc, CN = DigiCert TLS RSA SHA256 2020 CA1
verify return:1
depth=0 C = US, ST = California, L = San Francisco, O = "Sephora USA, Inc.", CN = *.sephora.com
verify return:1
---
Certificate chain
0 s:/C=US/ST=California/L=San Francisco/O=Sephora USA, Inc./CN=*.sephora.com
i:/C=US/O=DigiCert Inc/CN=DigiCert TLS RSA SHA256 2020 CA1
1 s:/C=US/O=DigiCert Inc/CN=DigiCert TLS RSA SHA256 2020 CA1
i:/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert Global Root CA
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIGxTCCBa2gAwIBAgIQDEZ5kGHOg/rtqX+YhmjfLjANBgkqhkiG9w0BAQsFADBP
MQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMSkwJwYDVQQDEyBE
aWdpQ2VydCBUTFMgUlNBIFNIQTI1NiAyMDIwIENBMTAeFw0yMjAzMDgwMDAwMDBa
Fw0yMzAzMDgyMzU5NTlaMG4xCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9y
bmlhMRYwFAYDVQQHEw1TYW4gRnJhbmNpc2NvMRowGAYDVQQKExFTZXBob3JhIFVT
QSwgSW5jLjEWMBQGA1UEAwwNKi5zZXBob3JhLmNvbTCCASIwDQYJKoZIhvcNAQEB
BQADggEPADCCAQoCggEBAMi02zgEkla4te/hNDQ0tCmPf+54B9CCh6yXE9VS4CVV
voB1kbo41KqPEWuxzEGLuziRgP/aWiDWoZHR2v/WKI+Lut8N7xBuSRg7e74QHH1v
cozoIUa339oRZmUJW87J6lFnFZh2CMPHYKQW6cgz3tnlDDTvSpS2BMfjQ+7DrWhe
PxPxLsP4OTDZF8PQjnlJ3n4XbVDO93UFOGe3h/e9nAxLXSuYFov9CneCg+aZb5GE
V9wM9abymvgo8iEH1gVBMnpZv/ZgfQgZv7mhtN7i5MSvA4cMFSGfgYd7/Qsj4ggg
1p1+vYX59TBvCJIz2rjPWeCOimZNxgDfXjI+68Gp6VMCAwEAAaOCA3wwggN4MB8G
A1UdIwQYMBaAFLdrouqoqoSMeeq02g+YssWVdrn0MB0GA1UdDgQWBBS6Wm8V9Xhe
jvEZ8ID/d7MZjyazmjAlBgNVHREEHjAcgg0qLnNlcGhvcmEuY29tggtzZXBob3Jh
LmNvbTAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUF
BwMCMIGPBgNVHR8EgYcwgYQwQKA+oDyGOmh0dHA6Ly9jcmwzLmRpZ2ljZXJ0LmNv
bS9EaWdpQ2VydFRMU1JTQVNIQTI1NjIwMjBDQTEtMi5jcmwwQKA+oDyGOmh0dHA6
Ly9jcmw0LmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydFRMU1JTQVNIQTI1NjIwMjBDQTEt
Mi5jcmwwPgYDVR0gBDcwNTAzBgZngQwBAgIwKTAnBggrBgEFBQcCARYbaHR0cDov
L3d3dy5kaWdpY2VydC5jb20vQ1BTMH0GCCsGAQUFBwEBBHEwbzAkBggrBgEFBQcw
AYYYaHR0cDovL29jc3AuZGlnaWNlcnQuY29tMEcGCCsGAQUFBzAChjtodHRwOi8v
Y2FjZXJ0cy5kaWdpY2VydC5jb20vRGlnaUNlcnRUTFNSU0FTSEEyNTYyMDIwQ0Ex
LmNydDAMBgNVHRMBAf8EAjAAMIIBfwYKKwYBBAHWeQIEAgSCAW8EggFrAWkAdgDo
PtDaPvUGNTLnVyi8iWvJA9PL0RFr7Otp4Xd9bQa9bgAAAX9reZUsAAAEAwBHMEUC
IQCWOLNpkNQk3kleg4XmYg2Gleq/NIRxRPjH030Pdt7xFgIgMlwvaABB79cbIc7n
t3FAEMmC48+FWatC/kzds0hn2OYAdgA1zxkbv7FsV78PrUxtQsu7ticgJlHqP+Eq
76gDwzvWTAAAAX9reZVIAAAEAwBHMEUCIQDP3bHQAHsY5gYswsr18yPbYLE2gBA4
9uqys0k3j01NfAIgQkTRsFC0rn8xBK13STYpm/XxwU9j5WxO/BF07yxxBE4AdwCz
c3cH4YRQ+GOG1gWp3BEJSnktsWcMC4fc8AMOeTalmgAAAX9reZV7AAAEAwBIMEYC
IQCcTn3aluRrG9SWcaDSoS6mKDSrGvAFXy/Gaoqj1t5UTQIhAKbmHhPQot7p7cEf
u1rsap5+1mr3/lRWmyZhfQ3nEhbmMA0GCSqGSIb3DQEBCwUAA4IBAQC0XA/BOBfj
RIJ1s4EGLfUk3DmD2NV3V2IP65L7IqeKGQTsKzkSpPd9kGLSK26N1D9TAe+3BDfp
axSw/j1c/Il1MI6zWEaI15YMQVaWrbQm5wrPUfbCmlQHz0vsrOzNS14hugz4q7uH
vQPWTPmVUq5NjOpYW8Myks9/b4UIz8CUTg1QL6MwmTszMkZooFkzYAEVS5xrf05E
FFs2Q7h4yeKtg1WEDF1vNDDcI3EmU7/6a6TzEM+d7OdYMORArOSX68R7qMmR9vFi
qrL5x5/s1MocTjlfFcvIy/nNLNqVgMg7PFcXMQkO6rhOvUXV8Bvhp2BvErV4sktP
sTg7wMV1KxRO
-----END CERTIFICATE-----
subject=/C=US/ST=California/L=San Francisco/O=Sephora USA, Inc./CN=*.sephora.com
issuer=/C=US/O=DigiCert Inc/CN=DigiCert TLS RSA SHA256 2020 CA1
---
No client certificate CA names sent
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 3648 bytes and written 314 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-CHACHA20-POLY1305
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-CHACHA20-POLY1305
Session-ID: DB3AA662176396813001F451A8C4C131CDDD6B922D0EBBD1C5C8B7D80A4B12AB
Session-ID-ctx:
Master-Key: 250188875E83EB4B39D46AF40C77C547C292C5EF1F9E78965443ABA117A3778318110290DDED978CD2295E45B6037EB3
TLS session ticket lifetime hint: 83100 (seconds)
TLS session ticket:
0000 - 00 00 0d f0 20 12 3b a0-a7 ed e3 57 b4 ae f8 d6 .... .;....W....
0010 - 42 b2 61 2d a6 e2 85 6a-af e0 42 f1 19 73 3c 7d B.a-...j..B..s<}
0020 - 71 35 df 97 c4 36 44 ed-c2 77 b7 5d 94 49 24 62 q5...6D..w.].I$b
0030 - 5b 9b 2d 37 70 ee e8 02-a6 9b 03 a1 99 5a 98 95 [.-7p........Z..
0040 - 7d 43 39 1a 36 5d c4 5b-ba 61 75 e6 82 6f 54 e9 }C9.6].[.au..oT.
0050 - 04 35 86 5b fb ef 9f 57-82 75 ba f7 e5 74 04 2a .5.[...W.u...t.*
0060 - 3e fb 3a e7 11 b9 af 5f-f3 bb 63 a2 75 d4 b4 68 >.:...._..c.u..h
0070 - 56 2f 4d b2 80 cf 4e 59-df 22 a9 d0 c3 e0 bb 54 V/M...NY.".....T
0080 - 66 3d 46 d8 e6 f3 59 43-b1 66 7e 96 31 d8 87 4e f=F...YC.f~.1..N
0090 - 5c 28 04 cb f5 b6 ec 72-c9 a6 57 21 be 0b 4f 47 \(.....r..W!..OG
Start Time: 1655756843
Timeout : 7200 (sec)
Verify return code: 0 (ok)
---
GET / HTTP/1.1
Host: www.sephora.com:443
Connection: close
User-Agent: http.rb/5.1.0
...hangs indefinitely.
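For anyone who wants to repeat the s_client experiment from Ruby without http.rb in the picture, here is a minimal sketch using only the standard library. The method names `build_request` and `probe` and the 10-second timeout are my own choices for illustration, not anything from http.rb; the request text mirrors what was typed into s_client above.

```ruby
require "socket"
require "openssl"
require "timeout"

# Build the same raw request that was typed into s_client above.
# (Hypothetical helper, not part of any library.)
def build_request(host, port: 443, connection: "close", user_agent: "http.rb/5.1.0")
  [
    "GET / HTTP/1.1",
    "Host: #{host}:#{port}",
    "Connection: #{connection}",
    "User-Agent: #{user_agent}",
    "", "" # blank line terminates the header section
  ].join("\r\n")
end

# Send the request over TLS and return the first response line,
# or :timeout if nothing arrives within `timeout` seconds.
def probe(host, request, port: 443, timeout: 10)
  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, OpenSSL::SSL::SSLContext.new)
  ssl.hostname = host # send SNI, as browsers and curl do
  ssl.sync_close = true # closing the TLS socket also closes the TCP socket
  Timeout.timeout(timeout) do
    ssl.connect
    ssl.write(request)
    ssl.gets
  end
rescue Timeout::Error
  :timeout
ensure
  ssl ? ssl.close : tcp&.close
end

# Usage (performs a real network request, so it is left commented out):
#   p probe("www.sephora.com", build_request("www.sephora.com"))
```

If `probe` returns `:timeout` here too, that would point away from http.rb, since only Ruby's socket and OpenSSL bindings are involved.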
Based on that I'm going to close this as being what appears to be a problem with OpenSSL (or possibly an interaction between OpenSSL and the remote TLS stack).
Please reopen if you can provide a reproduction that narrows this down to http.rb
SSL Labs wasn't able to make an HTTP request either:
https://www.ssllabs.com/ssltest/analyze.html?d=www.sephora.com#httpRequests
Definitely seems like an issue with that site.
@tarcieri one thing makes me wonder though: the curl example works perfectly fine, Firefox opened that URL without any issues, and httpie works too.
If it can be reproduced with the openssl CLI, Python, and SSL Labs it is clearly not an http.rb issue.
I'm not sure why curl works as it's ostensibly using OpenSSL as well. Chrome and Firefox work but do not use OpenSSL.
I was able to fix it though :rofl: All we need to do is ensure we send the Connection: keep-alive header. We do that with HTTP.persistent, but we send it as Keep-Alive, which is fine for the majority of servers, but not this one… Here's a working example:
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "http"
end

# Override the Connection value http.rb sends for persistent
# connections ("Keep-Alive") with the lowercase form.
module HTTP
  class Connection
    KEEP_ALIVE = "keep-alive"
  end
end

HTTP.persistent("https://www.sephora.com") do |http|
  puts http
    .use(:auto_inflate)
    .headers({ "Accept-Encoding" => "gzip, deflate" })
    .get("https://www.sephora.com/")
end
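To make the casing point concrete: HTTP defines connection options such as keep-alive as case-insensitive tokens, so a server is expected to treat Keep-Alive and keep-alive the same. The sketch below contrasts a lenient match with a strict one. The strict variant is only my guess at how this particular backend seemed to behave, and both method names are made up for illustration.

```ruby
# Strict comparison: an assumption about how the backend in this
# thread appeared to treat the header value.
def strict_keep_alive?(value)
  value == "keep-alive"
end

# Case-insensitive token match, which is how HTTP defines
# connection options.
def lenient_keep_alive?(value)
  value.split(",").map(&:strip).any? { |token| token.casecmp?("keep-alive") }
end

["Keep-Alive", "keep-alive"].each do |value|
  puts "#{value.inspect}: strict=#{strict_keep_alive?(value)} lenient=#{lenient_keep_alive?(value)}"
end
```

Only the strict check distinguishes the two spellings, which would explain why overriding the constant made a difference.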
UPDATE: Somehow I can now make it work without any changes... Just using persistent HTTP:
HTTP.persistent("https://www.sephora.com") do |http|
  puts http.get("https://www.sephora.com/")
end
This is definitely an issue with Sephora's backend server. They are using istio-envoy, which seems to be doing lots of weird stuff. At first, I was able to make it work by ensuring that we send the Connection: keep-alive (lowercase was important) and Accept-Encoding headers; now it works without any patches and without any headers...
LOL. Here are some more details. It seems like they are doing a User-Agent-based rollout. If you pass the User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36, then you must also ensure that the Connection: keep-alive (case-sensitive), Accept, and Accept-Encoding headers are present:
module HTTP
  class Connection
    KEEP_ALIVE = "keep-alive"
  end
end

HTTP.persistent("https://www.sephora.com") do |http|
  puts http
    .use(:auto_inflate)
    .headers({
      "Accept" => "*/*",
      "Accept-Encoding" => "gzip, deflate",
      "User-Agent" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36"
    })
    .get("https://www.sephora.com/")
end
If you don't pass User-Agent, so that it defaults to http.rb/5.0.1, then neither the Accept-Encoding nor the Accept header is needed:
HTTP.persistent("https://www.sephora.com") do |http|
  puts http.get("https://www.sephora.com/")
end
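The two cases above can be summed up in a small helper. This is only a sketch that encodes what was observed in this thread at the time; `extra_headers_for` and the `BROWSER_UA` pattern are hypothetical names, and the server's actual rules are unknown and may have changed since.

```ruby
# Hypothetical helper: encodes only the observations from this thread.
# Browser-like User-Agents seemed to need Accept and Accept-Encoding;
# the default http.rb User-Agent did not.
BROWSER_UA = /\AMozilla\//

def extra_headers_for(user_agent)
  return {} if user_agent.nil? || !user_agent.match?(BROWSER_UA)

  { "Accept" => "*/*", "Accept-Encoding" => "gzip, deflate" }
end

# Usage with http.rb (not executed here):
#   ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ..."
#   http.headers(extra_headers_for(ua).merge("User-Agent" => ua))
```

Note that the lowercase Connection: keep-alive value is a separate requirement and still needs the constant override shown earlier.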
Try Sephora.
Tried this on 5.0 and 4.4:
Same thing with Curl:
Same thing with HTTP.rb: