Endless loop on GOAWAY - Githubissues

tmolitor-stud-tu commented 3 years ago

I'm using lua-http to talk to Apples's push servers. My code can be found here: https://github.com/tmolitor-stud-tu/mod_push_appserver/blob/master/mod_push_appserver_apns/mod_push_appserver_apns.lua#L98

When the HTTP/2 connection to Apple is idle for a long time (> ~15 minutes), sending a new request (R1) makes Apple to respond with a GOAWAY frame having the id of this new request. This creates an endless cqueues loop with cqueues returning new timeouts of 0.0 seconds in an endless loop. Only testing connection.recv_goaway_lowest when attempting the next (R2) push and closing the connection if this is set, will stop the endless loop. This could be several minutes after (R1). Only at this point does the call to stream:get_headers(1) for (R1) return with an error read: Transport endpoint is not connected and the loop stops.

Sidenote: I had to change the cqueues integration of prosody to not reshedule an immediate cqueues run on zero second timers without running the prosody mainloop at least once, to see this happen. Using the vanilla prosody integration will result in an unresponsive server spinning forever.

Expected behaviour: Receiving a GOAWAY frame should make running requests waiting in stream:get_headers(), stream:get_body_as_string() etc. having a bigger h2 stream id than the GOAWAY return an error instead of spinning forever.

daurnimator commented 3 years ago

Thanks, this indeed sounds like a bug. I don't suppose you could up with a self-contained test case?

tmolitor-stud-tu commented 3 years ago

Well, the whole thing would need a h2 server behaving oike apple's push endpoint. Can a lua-http server programmatically send out goaway frames? A testcase would be such a server and a lua-http client sending some requsts, waiting for several minutes, then sending a new request which will be answered by the server with a goaway (but without closing the tcp connection, I guess).

Other idea: you tell me what debug output to put where and I'll add this to our push server for Monal (we are using a test server for the alpha builds of Monal which is ideal for such tests).

Other observation: This bug seems to be in all versions (0.1 to 0.3) of lua-http.

daurnimator commented 3 years ago

Well, the whole thing would need a h2 server behaving oike apple's push endpoint. Can a lua-http server programmatically send out goaway frames?

yes. The test suite has several such cases. see e.g. https://github.com/daurnimator/lua-http/blob/master/spec/h2_stream_spec.lua

A testcase would be such a server and a lua-http client sending some requsts, waiting for several minutes, then sending a new request which will be answered by the server with a goaway (but without closing the tcp connection, I guess).

no need to wait the 15 minutes. It sounds like you could:

start a new stream from client
send a goaway from server
assert that client doesn't get stuck in loop. maybe starting a new stream should throw an error/return nil, err?

tmolitor-stud-tu commented 3 years ago

assert that client doesn't get stuck in loop. maybe starting a new stream should throw an error/return nil, err?

Isn't that already in the code (I think I found something like this already when I read the code)

jprjr commented 2 years ago

Hi there, I wrote a small standalone demo to reproduce the issue. The client connection will spin forever when trying to receive headers from a server that's sent a GOAWAY frame.

I'm not 100% sure if this is what Apple's server does as mentioned in the notes, but I think it is? I see similar behavior when connecting to nginx - if I connect, then sleep past the default nginx timeout, I can open a stream, write headers, but then loop forever when trying to read headers.

local h2_connection = require "http.h2_connection"
local h2_error = require "http.h2_error"
local new_headers = require "http.headers".new
local cqueues = require'cqueues'
local ca = require "cqueues.auxlib"
local cs = require "cqueues.socket"
local cc = require "cqueues.condition"

local cq = cqueues.new()
local cond = cc.new()

local s, c = ca.assert(cs.pair())
s = assert(h2_connection.new(s,"server"))
c = assert(h2_connection.new(c,"client"))

local req_headers = new_headers()
req_headers:append(":method", "GET")
req_headers:append(":scheme", "http")
req_headers:append(":path", "/")

cq:wrap(function()
  assert(cond:wait())
  local stream = c:new_stream()
  assert(stream:write_headers(req_headers,true))
  -- this will block foerver
  local res_headers = stream:get_headers(10)
end)

cq:wrap(function()
  -- just to ensure other coroutine waits first
  cqueues.sleep(0.1)
  assert(s:write_goaway_frame(nil,h2_error.errors.INTERNAL_ERROR.code))
  cond:signal(1)
end)

assert(cq:loop())

jprjr commented 2 years ago

Hi there, sorry for the double-update - I wasn't 100% sure if that standalone demo was violating any specs or anything (I'm not super familiar with the details for http2), so I also re-created the case using nginx. I figure if I can recreate it using a web server, I'm probably not violating the server-side protocols/specs.

So, here's a minimal nginx config, it has a 1s timeout to receive client headers:

daemon off;
worker_processes 1;

events {
    worker_connections 128;
}

pid nginx.pid;

error_log stderr debug;

http {
    access_log off;

    client_header_timeout 1s;
    http2_recv_timeout 1s; # deprecated, nginx >= 1.19.7 uses client_header_timeout

    server {
        listen 127.0.0.1:8080 http2;
        return 404;
    }
}

and an example program that connects, sleeps for 2 seconds, then tries to perform the request.

local cqueues = require'cqueues'
local http_client = require'http.client'
local http_headers = require'http.headers'

local headers = http_headers.new()
headers:append(':method','GET')
headers:append(':path','/')
headers:append(':scheme','http')

local connection = assert(http_client.connect({
  host = '127.0.0.1',
  port = 8080,
  version = 2,
}))

cqueues.sleep(2)

local stream = assert(connection:new_stream())
assert(stream:write_headers(headers,true))
local res_headers = assert(stream:get_headers(10));
for k,v in res_headers:each() do
  print(k,v)
end

While the client is sleeping, nginx sends a goaway frame. write_headers reports a success, then get_headers loops forever.

daurnimator / lua-http

Endless loop on GOAWAY #179