facebook / proxygen

A collection of C++ HTTP libraries including an easy to use HTTP server.
Other
8.16k stars 1.5k forks source link

HTTP Parsing Issue #507

Closed SteveSelva closed 3 months ago

SteveSelva commented 3 months ago

I came up with an issue of parsing data in Proxygen. I have created a HTTPDownstreamSession after a handshake. When the Session starts, it will read for data and then parses it and calls the callback functions accordingly. This works fine when the received data is a valid HTTP Protocol. When the received data is not HTTP data, instead of throwing Parsing error, it is calling onMessageBegin() callback function of the SessionInfoCallback. Why don't you check whether the data received is a valid data before calling onMessageBegin()?

afrind commented 3 months ago

The timing of an onMessageBegin is dependent on the codec used. I believe H1 and H3 will give the callback on the first byte received, and H2 will delay until it has parsed a frame it can identify as the beginning of a message. The design goes back to the underlying H1 parser we chose (originally from node.js), which behaves this way.

Is it causing a problem?

SteveSelva commented 3 months ago

Yeah, for me the underlying codec is HTTP1xCodec. And its causing the issue.

afrind commented 3 months ago

Can you explain the actual problem? You will eventually get an error after the codec parses a bit farther.

SteveSelva commented 3 months ago

For me the problem is I am facing a nearly a deadlock like issue. I have created a HTTPDownstreamSession, there at first I received a HTTP Request. The catch here is the HTTP message ends and it waits for the response, but I need some more data, which is sent through the same socket, which is not HTTP data.

So in this case what happens is, since the codec protocol is HTTP/1.1, so only one request can be processed at a time. Here I received the non HTTP data. But without throwing an error, the HTTPDownstreamSession sends a onMessageBegin callback and waits for some more data to process.

But since already an HTTP Request is received and its waiting for response. The reads in the downstream are paused since the codec is HTTP/1.1. So no new data is read and no error is thrown.

So neither the Upstream receives a HTTP Response, nor the Downstream reads some more data and throws an error, which causes in a Timeout error closing the connection.

SteveSelva commented 3 months ago

And when I check the nodejs/http-parser, they too have removed this case where the http_parser considers the first byte read as s_pre_start_req_or_res or s_pre_start_req or s_pre_start_res.

Rather they started parsing the data. Refer nodejs/http-parser https://github.com/nodejs/http-parser/blob/ec8b5ee63f0e51191ea43bb0c6eac7bfbff3141d/http_parser.c#L281

Can you update the http_parser_code like this and update proxygen too.

afrind commented 3 months ago

I need some more data, which is sent through the same socket, which is not HTTP data.

This sounds like a strange use case for HTTP. You can use HTTP upgrade to convert an HTTP/1.1 socket into a stream of bytes accessible to your application, but only after sending a 101-Switching Protocols or 200 response with the upgrade header. If you are still waiting for the response to the first request, I don't think proxygen can look at the next bytes coming on the wire without treating them as HTTP.

SteveSelva commented 3 months ago

Is it possible to update the code from https://github.com/nodejs/http-parser ?

afrind commented 3 months ago

Perhaps, but probably not on an urgent timescale. Our implementations have diverged over time and it would take us a while to do the merge and ensure that the upgrade doesn't break anything that depends on proxygen.

SteveSelva commented 3 months ago

Thanks for the response.

May I know the reason why it takes so much time to merge this PR https://github.com/facebook/proxygen/pull/506 .

afrind commented 3 months ago

I pinged the diff internally. Looks like someone wanted to tweak it slightly but it may have gotten lost in the shuffle.

SteveSelva commented 3 months ago

Oh ok.

May I know what are the new features, you are working on?

afrind commented 3 months ago

Looks like just a slightly different spelling. It should be out today I think.

SteveSelva commented 3 months ago

I am not asking about the PR, I am asking about the new features you have planned to release.

SteveSelva commented 3 months ago

Thanks for the help.