daurnimator / lua-http

HTTP Library for Lua. Supports HTTP(S) 1.0, 1.1 and 2.0; client and server.
https://daurnimator.github.io/lua-http/
MIT License
795 stars 81 forks source link

HTTP 2: invalid state progression ('closed' to 'closed') #145

Closed pspacek closed 5 years ago

pspacek commented 5 years ago

Hello,

I've noticed that /sometimes/ lua-http server throws this error:

error: /usr/share/lua/5.1/http/h2_stream.lua:175: invalid state progression ('closed' to 'closed') stack traceback:
    [C]: in function 'error'
    /usr/share/lua/5.1/http/h2_stream.lua:175: in function 'set_state'
    /usr/share/lua/5.1/http/h2_stream.lua:702: in function 'handler'
    /usr/share/lua/5.1/http/h2_connection.lua:219: in function 'handle_frame'
    /usr/share/lua/5.1/http/h2_connection.lua:260: in function 'step'
    /usr/share/lua/5.1/http/h2_connection.lua:342: in function 'get_next_incoming_stream'
    /usr/share/lua/5.1/http/server.lua:155: in function </usr/share/lua/5.1/http/server.lua:132>

I'm going to attach PCAP file with the traffic so you can see what caused this error.

pspacek commented 5 years ago

PCAP + TLS keys for decrypting the HTTPS traffic: closed_closed.zip

Client is Firefox 66.

You can decrypt the HTTPS traffic on port 5380 using Wireshark configuration described here: Edit→Preferences→Protocols→SSL→(Pre)-Master-Secret log filename. https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/Key_Log_Format

pspacek commented 5 years ago

Okay, so now once again and with timestamps to make it easier to decode:

15:42:31.164963 [worker.background] error: /usr/share/lua/5.1/http/h2_stream.lua:175: invalid state progression ('closed' to 'closed') stack traceback:
15:42:31.165029     [C]: in function 'error'
15:42:31.165049     /usr/share/lua/5.1/http/h2_stream.lua:175: in function 'set_state'
15:42:31.165067     /usr/share/lua/5.1/http/h2_stream.lua:702: in function 'handler'
15:42:31.165085     /usr/share/lua/5.1/http/h2_connection.lua:219: in function 'handle_frame'
15:42:31.165101     /usr/share/lua/5.1/http/h2_connection.lua:260: in function 'step'
15:42:31.165120     /usr/share/lua/5.1/http/h2_connection.lua:342: in function 'get_next_incoming_stream'
15:42:31.165135     /usr/share/lua/5.1/http/server.lua:155: in function </usr/share/lua/5.1/http/server.lua:132>

I.e. the error is reported at 15:42:31.164963. This pcap_with_latency.zip has additional delay 100 ms on each roundtrip to make it easier to correlate log lines with packets.

At first glance it seems that HTTP/2 message RST_STREAM is triggering this.

pspacek commented 5 years ago

And here is screenshot of Wireshark so you do not need to decode it yourself: wireshark_screenshot

The highlighted packet is immediately preceding error in log.

pspacek commented 5 years ago

Okay, now I'm convinced this is bug/omission in RST_STREAM frame handler triggered by race condition. The main problem is that RST_STREAM handler does not take into account network delay and/or out-of-order processing.

Network packets as seen on network are send and received in this order:

in  -> HEADERS[17]: GET /favicon.ico, WINDOW_UPDATE[17]
in  -> RST_STREAM[17]
out <- HEADERS[17]: 404 Not Found
out <- DATA[17]

But in fact cqueues/lua-http is processing frames in different order:

process input frame stream 17 frame HEADERS: state idle -> half closed (remote)
process input frame stream 17 frame WINDOW_UPDATE
output stream 17 frame HEADERS
output stream 17 frame DATA: state half closed (remote) -> closed
process input frame stream 17 frame RST_STREAM: already closed -> throw error

IMHO normal network latency can cause that legitimate RST_STREAM frame will be received after answer was already sent, so receiving RST_STREAM on closed stream should be ignored.