Open mmcclimon opened 3 years ago
https://www.carmax.com appears to not ever respond for HTTP/1.1 requests, it just holds the socket open.
Part of the problem is that I think HTTP::Tiny is using IO::Socket::SSL wrong.
select() on an IO::Socket::SSL fd says "There is data on the socket", but that data could just be SSL protocol level data, and not actual data, and so a subsequent sysread() will fail.
I think this code:
while () {
$nfound = ($type eq 'read')
? select($fdset, undef, undef, $pending)
: select(undef, $fdset, undef, $pending) ;
if ($nfound == -1) {
$! == EINTR
or die(qq/select(2): '$!'\n/);
redo if !$timeout || ($pending = $timeout - (time - $initial)) > 0;
$nfound = 0;
}
last;
}
$! = 0;
return $nfound;
needs to check $self->{fd}->{pending}
if $nfound lists any bytes to ensure there's actually data to read on the SSL socket before calling sysread().
But I'm not sure how to make this whole loop not busyloop because select() will keep returning true and pending() will keep returning false...
I think I see. It's this IO::Socket::SSL paragraph:
Additionally, contrary to plain sockets the data delivered on the socket are not necessarily application payload. It might be a TLS handshake, it might just be the beginning of a TLS record or it might be TLS session tickets which are send after the TLS handshake in TLS 1.3. In such situations select will return that data are available for read since it only looks at the plain socket. A sysread on the IO::Socket::SSL socket will not return any data though since it is an abstraction which only returns application data. This causes the sysread to hang in case the socket was blocking or to return an error with EAGAIN on non-blocking sockets. Applications using select or similar should therefore set the socket to non-blocking and also expect that the sysread might temporarily fail with EAGAIN.
I think the busy loop will need to check elapsed time and exit after the timeout has expired.
@mmcclimon I think you mentioned in Slack that Carmax is on Akamai. If so, this sounds like their tarpitting strategy, possibly triggered on the user-agent. See this thread.
The issue is not restricted to TLS 1.3 but is more only more likely then. With TLS 1.2 can_read
will return true with incomplete TLS records, but a latter read on the socket will hang until the TLS record got read and decrypted completely. With TLS 1.3 then there are TLS session tickets outside of the handshake which will make the problem more likely.
A fix is to _do_timeout
but then peek
to make sure that we actually have data in the buffer. Feels kind of dirty to me but is working. And it is not fully accurate since for every retry inside the loop the timeout is applied again:
sub can_read {
@_ == 1 || @_ == 2 || die(q/Usage: $handle->can_read([timeout])/ . "\n");
my $self = shift;
if ( ref($self->{fh}) eq 'IO::Socket::SSL' ) {
while (1) {
return 1 if $self->{fh}->pending;
return 0 if ! $self->_do_timeout('read', @_);
# even if _do_timeout showed activity on the socket there might be
# no actual application data but instead a TLS session ticket or an
# yet incomplete TLS record. Therefore actually check that we have
# decrypted data to read before claiming that we can read.
my $lb = $self->{fh}->blocking(0); # otherwise peek might hang
my $n = $self->{fh}->peek(my $buf,1);
$self->{fh}->blocking($lb);
return 1 if $n;
}
}
return $self->_do_timeout('read', @_)
}
Finally, I'm found a workaround with timeout issue. I have all latest versions of all Perl modules. My Perl version is 5.28.1
No timeout:
perl -MData::Dumper -MHTTP::Tiny -e 'print Dumper(HTTP::Tiny->new(timeout=>2)->head("https://www.carmax.com"))'
Timeout is OK:
perl -MData::Dumper -MHTTP::Tiny -e 'print Dumper(HTTP::Tiny->new(timeout=>2, SSL_options=>{SSL_version=>'TLSv12'})->head("https://www.carmax.com"))'
I apologize that I don't have a better reproducer for this; I tried, but it's some weird combination of things I couldn't exactly get to happen locally.
It is possible, with some kinds of network requests, for HTTP::Tiny's timeout never to fire, and a request hangs forever. I have seen this behavior with carmax.com, which appear to tarpit HTTP/1.1 requests somehow. At time of writing, it's trivial to reproduce with the following (though of course, this might change if CarMax changes). I observed it first with HEAD requests, but GET has the same behavior.
I think this is happening because the socket connects, but doesn't actually read any data (admittedly, I am not an HTTP expert). Privately, @xdg suggested that HTTP::Tiny should use something like IO::Socket::Timeout.
Locally I'm on perl 5.34.0, HTTP::Tiny 0.076, and IO::Socket::SSL 2.071, but have also seen the problem in production on 5.28.1 / 0.070 / 2.068. Interestingly, I don't see the problem on 5.8.8 / 0.056 / 2.021 or 5.24.0 / 0.056 / 2.047 (all perl, HTTP::Tiny, and IO::Socket::SSL versions, respectively). I would test more combinations to try to find the regression, but installing the SSL modules is fiddlyplus on macOS, and these are just the ones I have readily available.
Thanks!