Open Shnatsel opened 3 years ago
Of the original 38 on this list, the majority now work. These still show the behavior.
http://adorama.com -- see https://github.com/algesten/ureq/pull/454
http://extradom.pl -- curl: (18) transfer closed with outstanding read data remaining
http://leisurepro.com -- see https://github.com/algesten/ureq/pull/454
http://nemaweb.org curl: (18) transfer closed with outstanding read data remaining
http://sfpnet.fr curl: (18) transfer closed with outstanding read data remaining
http://sunnysports.com -- see https://github.com/algesten/ureq/pull/454
(I use curl --http1.1 -L --trace ./trace.txt http://sunnysports.com
)
I've identified one scenario we could handle better (broken server). See https://github.com/algesten/ureq/pull/454
I haven't checked all of these, but I just checked https://www.adorama.com/, and it's not a case of missing the \r\n
at the end of the message, it's a case of trailers, which we don't yet handle correctly:
(echo -e 'GET / HTTP/1.1\r\nHost: www.adorama.com\r\n' ;sleep 10) |openssl s_client -connect www.adorama.com:443 | xxd
...
000025e0: 203c 2f62 6f64 793e 3c2f 6874 6d6c 3e0d </body></html>.
000025f0: 0a30 0d0a 7365 7276 6572 2d74 696d 696e .0..server-timin
00002600: 673a 2072 7474 3b20 6475 723d 3233 2e34 g: rtt; dur=23.4
00002610: 3731 2c20 7265 7472 616e 733b 2064 7572 71, retrans; dur
00002620: 3d30 2c20 7472 6169 6c65 722d 7469 6d65 =0, trailer-time
00002630: 7374 616d 703b 2064 7572 3d31 3637 3036 stamp; dur=16706
00002640: 3330 3331 3231 3530 0d0a 0d0a 636c 6f73 30312150....clos
00002650: 6564 0a
I think rather than land #454 we should implement trailers and see if that fixes all of these cases.
Alright
Of the above examples, I think only http://sfpnet.fr/ shows the behavior it did when this Shnatsel ran this analysis. I think this one have garbage at the end.
I'm quite certain we did not see trailing headers in the above before.
0001e8d0: 3c2f 6874 6d6c 3e0a 0d0a 7265 6164 2052 </html>...read R
0001e8e0: 2042 4c4f 434b 0a72 6561 6420 5220 424c BLOCK.read R BL
0001e8f0: 4f43 4b0a 636c 6f73 6564 0a OCK.closed.
This probably needs fixing in 3.x now.
On some websites, e.g. http://banxetoyota.vn, ureq fails with the following error:
However, curl and Firefox work fine.
There's 38 such websites in the top million (I'm using Tranco list generated on the 3rd of February).
Archive with all occurrences: ureq-error-decoding-chunks.tar.gz
Code used for testing: https://github.com/Shnatsel/rust-http-clients-smoke-test/blob/f206362f2e81521bbefb84007cdd25242f6db590/ureq-smoke-test/src/main.rs