curl / curl

A command line tool and library for transferring data with URL syntax, supporting DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS and WSS. libcurl offers a myriad of powerful features
https://curl.se/

Server claims gzip but sends uncompressed data; curl rejects it #8928

Closed: Siemenskun closed this issue 2 years ago

Siemenskun commented 2 years ago

Looks like I'm able to reproduce #2368.

I did this

curl https://m2ch.cf/s/index.rss --compressed
curl: (61) Error while processing content unencoding: incorrect header check

There seems to be a bug somewhere in Apache and/or mod_php (I currently use apache24-2.4.53_1 + mod_php80-8.0.18). When zlib.output_compression in php.ini is set to on, some large responses occasionally carry a Content-Encoding: gzip header even though the body is not actually compressed. Once it happens, it stays reproducible until Apache is restarted (which is probably why #2368 could not be reproduced: they had just restarted their server). I have restarted my server too, but I made an intentionally broken RSS feed to reproduce it (https://m2ch.cf/s/index.rss). All I did was add a Content-Encoding: gzip header to unencoded content, so I honestly don't know why it was error 23 in this comment.
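A minimal sketch of how the mismatch described above could be confirmed, assuming the response body has been saved to a file without decoding (for example by running curl without --compressed). A gzip stream starts with the two bytes 1f 8b (RFC 1952), so a body labelled Content-Encoding: gzip that starts with anything else was not gzip-compressed. The helper name looks_like_gzip and the file name body.bin are made up for this illustration, and the check is only a heuristic: it says nothing about whether the rest of the stream is valid.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE starts with the gzip magic bytes 1f 8b.
looks_like_gzip() {
    # Dump the first two bytes as hex and strip whitespace, e.g. "1f8b".
    magic=$(od -An -tx1 -N 2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example use (body.bin is an assumed file name):
#   curl --silent --output body.bin https://m2ch.cf/s/index.rss
#   looks_like_gzip body.bin && echo "gzip" || echo "not gzip"
```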

I expected the following

If the response contains non-binary data, it should probably just be displayed. Because some RSS readers use curl internally and therefore crash, I think it would be better not to report this error.

curl/libcurl version

curl 7.83.1 (amd64-portbld-freebsd12.3) libcurl/7.83.1 OpenSSL/1.1.1l zlib/1.2.11 libpsl/0.21.1 (+libidn2/2.3.2) libssh2/1.10.0 nghttp2/1.47.0
Release-Date: 2022-05-11
Protocols: dict file ftp ftps gopher gophers http https imap imaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets

operating system

All

bagder commented 2 years ago

I just can't figure out how libcurl would magically know that the error it gets when trying to decompress is actually already uncompressed data since it has no idea of what data it transfers?

jadijadi commented 2 years ago

curl's behaviour looks fine. We have < Content-Encoding: gzip in the header, but the content cannot be gunzipped. IMO this should be reported as a bug to the server-side software (PHP/Apache/...).

Siemenskun commented 2 years ago

> how libcurl would magically know that the error it gets when trying to decompress is actually already uncompressed data since it has no idea of what data it transfers

AFAIK there's a heuristic for detecting a binary payload and suggesting not to display it in a terminal. I thought it could be reused here?

I agree it isn't a curl bug. Nevertheless, in my opinion it would be nice to have a workaround for this scenario.

bagder commented 2 years ago

> AFAIK there's a heuristic for detecting a binary payload and suggesting not to display it in a terminal. I thought it could be reused here?

Just because it isn't compressed doesn't mean it can't be binary. I don't understand how that (very basic) detection could work in this case.

> it would be nice to have a workaround for this scenario

If libcurl had such a workaround, it would need to be fairly fail-proof, and I don't know of any such method.

The best workaround, if you ask me, is for the application to try the request again without auto-decompression.
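The retry suggested above could be sketched in shell roughly as follows. The function name fetch_with_fallback is hypothetical; exit code 61 is the one curl printed in the original report. This is an application-side workaround under those assumptions, not something curl does on its own.

```shell
#!/bin/sh
# Hypothetical helper: fetch_with_fallback URL OUTFILE
fetch_with_fallback() {
    url=$1
    out=$2
    rc=0
    # First attempt: ask for compressed content and let curl decode it.
    curl --silent --show-error --compressed --output "$out" "$url" || rc=$?
    # 61 is the "Error while processing content unencoding" exit code.
    if [ "$rc" -eq 61 ]; then
        rc=0
        # Second attempt: same request without automatic decompression.
        # --output truncates the file, so partial data from the first try is dropped.
        curl --silent --show-error --output "$out" "$url" || rc=$?
    fi
    return "$rc"
}

# Example use (feed.xml is an assumed file name):
#   fetch_with_fallback https://m2ch.cf/s/index.rss feed.xml
```

The second request only helps when the body really is uncompressed. If the server sends genuinely compressed data despite the missing Accept-Encoding header, the output file will contain that compressed data as-is.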

bagder commented 2 years ago

This is curl working as expected when the server lies to the client.

xingya1822 commented 2 months ago

> how libcurl would magically know that the error it gets when trying to decompress is actually already uncompressed data since it has no idea of what data it transfers
>
> AFAIK there's a heuristic for detecting a binary payload and suggesting not to display it in a terminal. I thought it could be reused here?
>
> I agree it isn't a curl bug. Nevertheless, in my opinion it would be nice to have a workaround for this scenario.

There is a workaround: --raw
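For reference, a sketch of the --raw variant. According to the curl manual, --raw disables all internal HTTP decoding of content and transfer encodings, so the body is written exactly as received. That also means chunked transfer framing is not removed from HTTP/1.1 responses, which makes it a blunter tool than simply omitting --compressed. The wrapper name fetch_raw is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fetch_raw URL OUTFILE
fetch_raw() {
    # --raw: pass the body through without content or transfer decoding.
    curl --silent --show-error --raw --output "$2" "$1"
}

# Example use (feed.xml is an assumed file name):
#   fetch_raw https://m2ch.cf/s/index.rss feed.xml
```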