adrianroe closed this issue 5 years ago
I've tested the exact same connection with e.g. curl and all responses (including connection close with HTTPS) contain the full data...
I'll take a look to see if it is anything obvious, but any thoughts warmly appreciated!
Can't reproduce. What Erlang version is this?
The current repro is on Erlang 19 - I'll pull it to an Erlang 20 server as well.
The issue is definitely timing related - I can only reliably recreate it when I have a good internet connection (but not localhost!). The current repro is on an AWS server.
I have also narrowed the issue down to the interaction with Ranch. If, in loop/1 in gun.erl, I change
Transport:setopts(Socket, [{active, once}]),
to
Transport:setopts(Socket, [{active, true}]),
I no longer see the issue. That's obviously not a change that can just be made without thought!
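For readers unfamiliar with the {active, once} pattern discussed here, the following is a minimal sketch using plain gen_tcp on the loopback interface. It is not Gun's actual loop/1 and it does not reproduce the TLS problem; the module name active_once_demo and the payload size are made up for illustration. It only shows how a receiver re-arms the socket after every message and how data sent just before a close is normally still delivered ahead of the close notification.

```erlang
-module(active_once_demo).
-export([run/0]).

%% Hypothetical demo: a local server sends a payload and closes at once;
%% the client reads it back using {active, once}.
run() ->
    Payload = binary:copy(<<"x">>, 100000),
    {ok, LSock} = gen_tcp:listen(0, [binary, {active, false}, {reuseaddr, true}]),
    {ok, Port} = inet:port(LSock),
    spawn_link(fun() ->
        {ok, S} = gen_tcp:accept(LSock),
        ok = gen_tcp:send(S, Payload),
        ok = gen_tcp:close(S)          % close right after sending
    end),
    {ok, Sock} = gen_tcp:connect({127,0,0,1}, Port, [binary, {active, false}]),
    Received = loop(Sock, <<>>),
    ok = gen_tcp:close(LSock),
    %% true when everything sent before the close was received
    byte_size(Received) =:= byte_size(Payload).

loop(Sock, Acc) ->
    %% Re-arm the socket: exactly one message is delivered, then it goes passive.
    ok = inet:setopts(Sock, [{active, once}]),
    receive
        {tcp, Sock, Data}         -> loop(Sock, <<Acc/binary, Data/binary>>);
        {tcp_closed, Sock}        -> Acc;
        {tcp_error, Sock, Reason} -> exit(Reason)
    end.
```

With {active, true} the socket would keep delivering messages without being re-armed, which removes the window between one message arriving and the next setopts call; that is the window the change above happens to avoid.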
Would be useful to trace what the Gun process is doing. If the Transport:setopts call leads to a socket close then it means the problem is higher up the chain, perhaps in the SSL application. I think there was an issue recently about close events superseding any lingering data, I'll try to look it up later.
To trace:
dbg:start().
dbg:tracer().
dbg:tpl(gun, []).
dbg:p(all, c).
Just tried it locally (OSX - on both OTP 19 and 20) and don't see the issue. I'm sure that it's because of the slowness of my internet connection!
I'll see if I can get a local only repro (gun -> cowboy) - or I can probably get you access to a cloud server where it can be repro'd
Run the commands I gave on the node with the issue (not via remote shell) and I'll be able to have a clearer idea of the issue.
Right but it terminates the node too early, please don't run the test in an -eval, use a proper shell instead.
...and it does look like there is a related ssl issue http://erlang.org/doc/apps/ssl/notes.html
1.2 SSL 8.2.3
Fixed Bugs and Malfunctions
Packet options cannot be supported for unreliable transports, that is, packet option for DTLS over udp will not be supported.
Own Id: OTP-14664
Ensure data delivery before close if possible. This fix is related to fix in PR-1479.
Own Id: OTP-14794
My repro box is running SSL 8.1.1 - I can't install 20 on that box, but might create a new server tomorrow with latest on it...
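Since the fix depends on the ssl application version rather than only on the OTP release, it can help to confirm both on the node that shows the problem. A small sketch to paste into the Erlang shell on that node; the variable name SslVsn is arbitrary:

```erlang
%% Make sure the ssl application is loaded so its version key is readable.
ok = case application:load(ssl) of
         ok -> ok;
         {error, {already_loaded, ssl}} -> ok
     end,
{ok, SslVsn} = application:get_key(ssl, vsn),
%% Print the OTP release alongside the ssl application version.
io:format("OTP ~s, ssl ~s~n", [erlang:system_info(otp_release), SslVsn]).
```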
Yeah that's what I would guess happens. Relevant ticket is https://bugs.erlang.org/browse/ERL-420
And sorry about that, should have thought about it earlier, but a more interesting trace would be:
dbg:start().
dbg:tracer().
dbg:tpl(gun, []).
dbg:tpl(gun_http, []).
dbg:p(all, c).
trace3.txt Thanks for the help!
Sounds like the Gun version is also old. What version is this?
Head
There's a gun_http:send_data_if_alive/1 call that doesn't exist, and then it immediately calls gun:connect without going through retry_loop? Really weird.
Since I've not been able to reproduce and it's been a while, and there's been a number of related ssl bugs fixed in recent Erlang/OTP versions, please try with the most recent version and reopen if there's still an issue. Thanks!
For anyone watching this ticket - the symptoms of this ticket still persist - still caused by a (now different) issue in the Erlang SSL library. See https://bugs.erlang.org/browse/ERL-371 for details.
We have experimented with the master branch referred to in ERL-371 which does indeed seem to prevent the issue.
The combination of "connection close" and HTTPS causes data loss or {error,{closed,"The connection was lost."}}
There is a very simple repro in this gist
The gist pulls a file from a CDN that is available over HTTP and HTTPS, optionally setting the connection close header. Of the 4 combinations, all work as expected other than the combination of HTTPS and connection close... This almost always fails with the below, although it can also simply return less data than you would expect (i.e. a good response, but with the data truncated before the end)