rexyai / RestRserve

R web API framework for building high-performance microservices and app backends
https://restrserve.org
271 stars 31 forks source link

Header Content is split on Comma #187

Closed DavZim closed 2 years ago

DavZim commented 2 years ago

All header values are split on comma. This breaks for example the If-Modified-Since functionality of ETags as it uses the (ugly but sadly standard) "Tue, 15 Mar 2022 10:27:07 GMT" format.

According to this not all headers should be split on comma.

This is not present when RestRserve::Request and app$process_request(req) is used for some reasons.

Is this intended or a bug/not-yet-implemented-feature?

If confirmed, I am happy to help.

Replication

Replicate the issue with the following:

library(RestRserve)

print_headers = Middleware$new( 
  process_request = function(.req, .res) print(.req$headers),
  process_response = function(.req, .res) {},
  id = "print_headers"
)
app = Application$new()
app$append_middleware(print_headers)
app$add_get("/foo", FUN = function(.req, .res) .res$body = 42)

# Test using RestRserve::Request works
req = Request$new(
  path = "/static/example-text-file",
  method = "GET",
  headers = list("If-Modified-Since" = "Tue, 15 Mar 2022 10:27:07 GMT")
)

# the header is not split on the comma! We get one date - CORRECT!
rs = app$process_request(req)
#> $`if-modified-since`
#> [1] "Tue, 15 Mar 2022 10:27:07 GMT"

# start the server
backend = BackendRserve$new()
backend$start(app, http_port = 8080)

Now we can see the issue either by using curl:

curl http://localhost:8080/foo -H "If-Modified-Since: Tue, 15 Mar 2022 10:27:07 GMT"

which results in

#> $`if-modified-since`
#> [1] "Tue"                      "15 Mar 2022 10:27:07 GMT"

Note that providing the header as "If-Modified-Since: \"Tue, 15 ...\"" does not change the outcome.

or by using httr:

httr::GET("http://localhost:8080/foo",
          httr::add_headers("If-Modified-Since" = "Tue, 15 Mar 2022 10:27:07 GMT"))

which also results in

#> $`if-modified-since`
#> [1] "Tue"                      "15 Mar 2022 10:27:07 GMT"
dselivanov commented 2 years ago

wild wild http =) Yes, seems not all http headers should be split on comma. Would be great if you could investigate.

artemklevtsov commented 2 years ago

Is there any RFC where this is clearly described? We can skip split headers by default as made in some other frameworks. Otherwise we should support whitelist of the header names.

DavZim commented 2 years ago

I will have a look at it.

Do you know why the RestRserve::Request is handled differently than the curl request?

dselivanov commented 2 years ago

Not sure - need to see source code

artemklevtsov commented 2 years ago

Here: https://github.com/rexyai/RestRserve/blob/master/src/parse_headers.cpp#L36-L42

DavZim commented 2 years ago

A quick note on why the two (BackendRserve$new()... vs Application$process_request) handle the headers differently (for the moment I am looking at the way middleware handles the headers in process_request) from what I understand at the moment:

This makes it hard to test as I don't know of a way to test the library with BackendRserve, therefore the error is not possible to trigger in the test environment.

Regardless, I will work on making the cpp_parse_headers code dependent on header names.

dselivanov commented 1 year ago

Don't remember, need to see the source code

On Thu, 14 Apr 2022, 17:21 DavZim, @.***> wrote:

I will have a look at it.

Do you know why the RestRserve::Request is handled differently than the curl request?

— Reply to this email directly, view it on GitHub https://github.com/rexyai/RestRserve/issues/187#issuecomment-1098916224, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABHC5XLDX33TTO4XZUFHMWTVE7PSZANCNFSM5TM67SPQ . You are receiving this because you commented.Message ID: @.***>