Open tmolitor-stud-tu opened 3 years ago
Thanks, this indeed sounds like a bug. I don't suppose you could up with a self-contained test case?
Well, the whole thing would need a h2 server behaving oike apple's push endpoint. Can a lua-http server programmatically send out goaway frames? A testcase would be such a server and a lua-http client sending some requsts, waiting for several minutes, then sending a new request which will be answered by the server with a goaway (but without closing the tcp connection, I guess).
Other idea: you tell me what debug output to put where and I'll add this to our push server for Monal (we are using a test server for the alpha builds of Monal which is ideal for such tests).
Other observation: This bug seems to be in all versions (0.1 to 0.3) of lua-http.
Well, the whole thing would need a h2 server behaving oike apple's push endpoint. Can a lua-http server programmatically send out goaway frames?
yes. The test suite has several such cases. see e.g. https://github.com/daurnimator/lua-http/blob/master/spec/h2_stream_spec.lua
A testcase would be such a server and a lua-http client sending some requsts, waiting for several minutes, then sending a new request which will be answered by the server with a goaway (but without closing the tcp connection, I guess).
no need to wait the 15 minutes. It sounds like you could:
- assert that client doesn't get stuck in loop. maybe starting a new stream should throw an error/return nil, err?
Isn't that already in the code (I think I found something like this already when I read the code)
Hi there, I wrote a small standalone demo to reproduce the issue. The client connection will spin forever when trying to receive headers from a server that's sent a GOAWAY frame.
I'm not 100% sure if this is what Apple's server does as mentioned in the notes, but I think it is? I see similar behavior when connecting to nginx - if I connect, then sleep past the default nginx timeout, I can open a stream, write headers, but then loop forever when trying to read headers.
local h2_connection = require "http.h2_connection"
local h2_error = require "http.h2_error"
local new_headers = require "http.headers".new
local cqueues = require'cqueues'
local ca = require "cqueues.auxlib"
local cs = require "cqueues.socket"
local cc = require "cqueues.condition"
local cq = cqueues.new()
local cond = cc.new()
local s, c = ca.assert(cs.pair())
s = assert(h2_connection.new(s,"server"))
c = assert(h2_connection.new(c,"client"))
local req_headers = new_headers()
req_headers:append(":method", "GET")
req_headers:append(":scheme", "http")
req_headers:append(":path", "/")
cq:wrap(function()
assert(cond:wait())
local stream = c:new_stream()
assert(stream:write_headers(req_headers,true))
-- this will block foerver
local res_headers = stream:get_headers(10)
end)
cq:wrap(function()
-- just to ensure other coroutine waits first
cqueues.sleep(0.1)
assert(s:write_goaway_frame(nil,h2_error.errors.INTERNAL_ERROR.code))
cond:signal(1)
end)
assert(cq:loop())
Hi there, sorry for the double-update - I wasn't 100% sure if that standalone demo was violating any specs or anything (I'm not super familiar with the details for http2), so I also re-created the case using nginx. I figure if I can recreate it using a web server, I'm probably not violating the server-side protocols/specs.
So, here's a minimal nginx config, it has a 1s timeout to receive client headers:
daemon off;
worker_processes 1;
events {
worker_connections 128;
}
pid nginx.pid;
error_log stderr debug;
http {
access_log off;
client_header_timeout 1s;
http2_recv_timeout 1s; # deprecated, nginx >= 1.19.7 uses client_header_timeout
server {
listen 127.0.0.1:8080 http2;
return 404;
}
}
and an example program that connects, sleeps for 2 seconds, then tries to perform the request.
local cqueues = require'cqueues'
local http_client = require'http.client'
local http_headers = require'http.headers'
local headers = http_headers.new()
headers:append(':method','GET')
headers:append(':path','/')
headers:append(':scheme','http')
local connection = assert(http_client.connect({
host = '127.0.0.1',
port = 8080,
version = 2,
}))
cqueues.sleep(2)
local stream = assert(connection:new_stream())
assert(stream:write_headers(headers,true))
local res_headers = assert(stream:get_headers(10));
for k,v in res_headers:each() do
print(k,v)
end
While the client is sleeping, nginx sends a goaway frame. write_headers
reports a success, then get_headers
loops forever.
I'm using lua-http to talk to Apples's push servers. My code can be found here: https://github.com/tmolitor-stud-tu/mod_push_appserver/blob/master/mod_push_appserver_apns/mod_push_appserver_apns.lua#L98
When the
HTTP/2
connection to Apple is idle for a long time (> ~15 minutes), sending a new request(R1)
makes Apple to respond with aGOAWAY
frame having the id of this new request. This creates an endlesscqueues
loop withcqueues
returning new timeouts of0.0 seconds
in an endless loop. Only testingconnection.recv_goaway_lowest
when attempting the next(R2)
push and closing the connection if this is set, will stop the endless loop. This could be several minutes after(R1)
. Only at this point does the call tostream:get_headers(1)
for(R1)
return with an errorread: Transport endpoint is not connected
and the loop stops.Sidenote: I had to change the cqueues integration of prosody to not reshedule an immediate cqueues run on zero second timers without running the prosody mainloop at least once, to see this happen. Using the vanilla prosody integration will result in an unresponsive server spinning forever.
Expected behaviour: Receiving a
GOAWAY
frame should make running requests waiting instream:get_headers()
,stream:get_body_as_string()
etc. having a biggerh2 stream id
than theGOAWAY
return an error instead of spinning forever.