actix / actix-web

Actix Web is a powerful, pragmatic, and extremely fast web framework for Rust.
https://actix.rs
Apache License 2.0

Intermittent hang for handler that misses a parameter when ACK with no FIN is used after HTTP 200 OK. #2123

Open staninprague opened 3 years ago

staninprague commented 3 years ago

Expected Behavior

Actix-web should behave consistently when a handler is missing a parameter.

When I have a handler like this:

        .service(
            web::resource("/handler")
                .route(web::post().to(|| async { HttpResponse::Ok().body("{}") })),
        )

I might call it with curl like:

curl -i -H 'Content-Type: application/json' 'http://192.168.1.21:8081/handler' -d '{}'

Here the -d '{}' body corresponds to a handler parameter such as |item: web::Json<SomeTypeWithNoFields>|, which the handler above does not declare.

One would expect actix-web to either fail or serve such a request consistently, independently of the keep-alive settings.

Current Behavior

When using curl, which sends ACK, FIN after the request, everything works consistently: the request is always served properly.

When using a client like iOS NSURLSession, the first few requests (one to three) may be served well. Then at some point (possibly as early as the second request) actix-web responds to such a request with an ACK and nothing else happens. The client times out (client timeout is set to 60 seconds; actix-web keep-alive is set to 95 seconds in my case).

The difference I see between curl and the iOS NSURLSession client is that curl sends ACK, FIN while NSURLSession sends ACK only.

Fixing the handler to accept the expected parameter (json) of '{}' solves the issue.

All other handlers are served without problems while this particular handler with the missing parameter is stuck. My belief is that the client timeout then leads to a FIN, so the next request to this handler works again until the next hang.

Here are two consecutive requests from iOS NSURLSession to such a handler, which is missing the parameter for parsing the request body (handler = account_groups/get_principal_shared_groups):

33  3.380642    192.168.1.93    192.168.1.21    TCP 425 57061 → 8081 [PSH, ACK] Seq=1 Ack=1 Win=131712 Len=359 TSval=303677580 TSecr=2679189203 [TCP segment of a reassembled PDU]
34  3.381258    192.168.1.93    192.168.1.21    HTTP/JSON   70  POST /account_groups/get_principal_shared_groups HTTP/1.1 , JavaScript Object Notation (application/json)
35  3.382141    192.168.1.21    192.168.1.93    TCP 66  8081 → 57061 [ACK] Seq=1 Ack=360 Win=64896 Len=0 TSval=2679189206 TSecr=303677580
36  3.382413    192.168.1.21    192.168.1.93    TCP 66  8081 → 57061 [ACK] Seq=1 Ack=364 Win=64896 Len=0 TSval=2679189207 TSecr=303677580
37  3.395792    192.168.1.21    192.168.1.93    HTTP/JSON   408 HTTP/1.1 200 OK , JavaScript Object Notation (application/json)
38  3.395844    192.168.1.93    192.168.1.21    TCP 66  57061 → 8081 [ACK] Seq=364 Ack=343 Win=131392 Len=0 TSval=303677594 TSecr=2679189220
....
232 5.103376    192.168.1.93    192.168.1.21    TCP 425 57061 → 8081 [PSH, ACK] Seq=364 Ack=343 Win=131392 Len=359 TSval=303679295 TSecr=2679189220 [TCP segment of a reassembled PDU]
233 5.104038    192.168.1.93    192.168.1.21    HTTP/JSON   70  POST /account_groups/get_principal_shared_groups HTTP/1.1 , JavaScript Object Notation (application/json)
234 5.105231    192.168.1.21    192.168.1.93    TCP 66  8081 → 57061 [ACK] Seq=343 Ack=723 Win=64640 Len=0 TSval=2679190929 TSecr=303679295
235 5.105530    192.168.1.21    192.168.1.93    TCP 66  8081 → 57061 [ACK] Seq=343 Ack=727 Win=64640 Len=0 TSval=2679190930 TSecr=303679295
# Never returns with response

In the capture above, packets 33 to 38 show the first request being handled successfully. The same request, sent a second time shortly afterwards, gets two ACKs from actix-web but never a reply, until keep-alive ends or the client sends FIN (my guess).
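The ACK numbers in the capture also show how each request arrived on the wire. As a rough sketch (the function name `segment_lengths` is made up for this illustration, and it assumes each ACK acknowledges exactly one segment, which delayed ACKs can violate), the payload size of each segment is the difference between consecutive cumulative ACK numbers:

```rust
/// Derives per-segment payload sizes from a run of cumulative ACK numbers.
/// `first_seq` is the relative sequence number of the first payload byte.
/// Hypothetical helper for reading the capture; not part of actix-web.
fn segment_lengths(first_seq: u32, acks: &[u32]) -> Vec<u32> {
    let mut prev = first_seq;
    let mut lengths = Vec::with_capacity(acks.len());
    for &ack in acks {
        // A cumulative ACK names the next byte the receiver expects,
        // so the bytes newly acknowledged are `ack - prev`.
        lengths.push(ack.wrapping_sub(prev));
        prev = ack;
    }
    lengths
}
```

Applied to packets 35 and 36 (Seq=1, Ack=360 then Ack=364) and to packets 234 and 235 (Seq=364, Ack=723 then Ack=727), this gives 359 and 4 bytes both times. That matches Len=359 on packets 33 and 232, so both requests reached the server as a 359-byte segment followed by a separate 4-byte segment.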

Steps to Reproduce (for bugs)

  1. Set up a handler with no item to parse from the request:

pub async fn handler( //item: web::Json,

  2. Call this handler with something like '{}' in the body, using a client that sends ACK after a response rather than ACK, FIN.
  3. Observe that the actix-web handler fails to serve a response after some number of attempts (can be as soon as the second one).
  4. Uncomment the parameter and observe that the handler now serves requests consistently.
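For anyone without an iOS client at hand, the steps above could be approximated with a small standard-library client that keeps one connection open and never sends FIN between requests. This is only a sketch under assumptions: the helper names (`request_head`, `send_split`, `run_repro`) are invented here, the 5-second read timeout is arbitrary, and writing the headers and the body separately with TCP_NODELAY only tends to produce two segments like those in the capture; it is not confirmed to match what NSURLSession does.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Builds the header block of a JSON POST request (hypothetical helper).
fn request_head(host: &str, path: &str, body_len: usize) -> String {
    format!(
        "POST {} HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\n\
         Content-Length: {}\r\nConnection: keep-alive\r\n\r\n",
        path, host, body_len
    )
}

/// Writes the headers and the body as two separate writes, then returns the
/// first chunk of the response. If the server stays silent, `read` fails
/// with a timeout error (WouldBlock or TimedOut, depending on the platform).
fn send_split(stream: &mut TcpStream, host: &str, path: &str, body: &[u8]) -> io::Result<Vec<u8>> {
    stream.write_all(request_head(host, path, body.len()).as_bytes())?;
    stream.flush()?;
    stream.write_all(body)?;
    stream.flush()?;
    let mut buf = [0u8; 4096];
    let n = stream.read(&mut buf)?;
    Ok(buf[..n].to_vec())
}

/// Sends `attempts` requests over a single kept-alive connection and records
/// which ones got any response; stops at the first attempt that gets none.
fn run_repro(addr: &str, path: &str, attempts: usize) -> io::Result<Vec<bool>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_nodelay(true)?; // avoid coalescing the two writes
    stream.set_read_timeout(Some(Duration::from_secs(5)))?; // assumed timeout
    let mut answered = Vec::new();
    for _ in 0..attempts {
        match send_split(&mut stream, addr, path, b"{}") {
            Ok(bytes) if !bytes.is_empty() => answered.push(true),
            _ => {
                answered.push(false);
                break;
            }
        }
    }
    Ok(answered)
}
```

Calling `run_repro("192.168.1.21:8081", "/handler", 5)` from `main` against the server from this report would be expected to return a `false` entry if the hang reproduces, and all `true` once the handler declares the JSON parameter. Only the first chunk of each response is read, which is enough for a tiny body such as "{}".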

Context

This is based on a mistake I made while writing a handler. Still, one would expect consistent behavior from a handler: either it fails every time or it works every time. I was banging my head for a while before concluding that the cause was the missing JSON item parameter combined with sending JSON to this handler.

Your Environment

ipostelnik commented 3 years ago

We're seeing similar behavior in our system with GET requests. After a few requests the handler is no longer invoked and the connection just hangs. If the keep-alive timer expires, the connection does get terminated, so it feels like the dispatcher doesn't pick up the service call and thinks there's nothing to do. FWIW, in our PCAP we also see the PSH flag on the last successfully handled request, and the request packet had to be reassembled from 2 segments. I wonder if the TCP stack is pushing a partial or empty buffer, confusing the parser.

See PCAP screenshot below: image