handling of route - Githubissues

craff commented 1 year ago

I have made several improvements (?) to route handling in simple_httpd you might want to use for tiny_httpd:

[ ] I have fixed an order on route: more precise first (currently, it is reverse of the order of add_route)
[ ] I dispatch using a tree for the method and the exact prefix of the route.
[ ] if no method is specified, it means GET|POST|HEAD but not PUT and DELETE
[ ] I removed the distinction between url_encode or not, I think it is simpler to always accept url_encoded path ?

c-cube commented 1 year ago

Do you still use a GADT for typing paths? wrt the tree?

Otherwise: I don't like removing the url_encode choice. Something you may want a raw string, and sometimes not. For the most part I'm worried I'll have to modify my existing projects that use tiny_httpd…

craff commented 1 year ago

Le 13 février 2023 04:46:55 GMT-10:00, Simon Cruanes @.***> a écrit :

Do you still use a GADT for typing paths? wrt the tree?

I mostly kept your code. I just added a tree structure to handle method and exact prefix.

You can have a look at simple_httpd and steal what you like.

Otherwise: I don't like removing the url_encode choice. Something you may want a raw string, and sometimes not. For the most part I'm worried I'll have to modify my existing projects that use tiny_httpd…

I was considering an optimisation of parsing the first line of request as I did for The others. Then, it will include URL decode.

By the way, I now have an ADT for header names generated from a CSV from iana. Like this I caught the issue in the test.

Now I only alocate once the header value ( I integrate the trim in parsing). Your code allocated three times the line of each header: The full line, the header name before and after lowercase conv and the header value before and after trim. Simplifying this did improve speed a lot.

-- > Reply to this email directly or view it on GitHub:

https://github.com/c-cube/tiny_httpd/issues/60#issuecomment-1428062068

You are receiving this because you authored the thread.

Message ID: @.***>

c-cube commented 1 year ago

By the way, I now have an ADT for header names generated from a CSV from iana.

For common headers? That's pretty cool.

c-cube commented 1 year ago

Serious question: at what point is simple_httpd just a different project, anyway? It feels like you're doing a lot of work on it, it's faster, it leverages OCaml 5, etc. Maybe it's simpler to make simple_httpd an official project of yours?

craff commented 1 year ago

I decided to accept only the headers listed at iana. The are so many that I am sure one can find one with a reasonable name if you need something.

Le 13 février 2023 06:35:45 GMT-10:00, Simon Cruanes @.***> a écrit :

By the way, I now have an ADT for header names generated from a CSV from iana.

For common headers? That's pretty cool.

-- Reply to this email directly or view it on GitHub: https://github.com/c-cube/tiny_httpd/issues/60#issuecomment-1428256092 You are receiving this because you authored the thread.

Message ID: @.***>

craff commented 1 year ago

Le 13 février 2023 06:37:16 GMT-10:00, Simon Cruanes @.***> a écrit :

Serious question: at what point is simple_httpd just a different project, anyway? It feels like you're doing a lot of work on it, it's faster, it leverages OCaml 5, etc. Maybe it's simpler to make simple_httpd an official project of yours?

Yes, it is what I think I did. But I still feel like reporting bugs and suggestions for tiny_httpd, as we still share a lot of code.

c-cube commented 1 year ago

You mean, you don't accept custom headers at all? :o

Thank you for the reports. Indeed, the ones about chunk encoding, etc. are very useful. I need to work on them.

craff commented 1 year ago

Simon Cruanes writes:

You mean, you don't accept custom headers at all? :o

Yes I don't. I have never add any use for them (contrary to custom cookies), and if you accept any custom header, then you do not catch spelling mistake in the frontend code (like for the test).

In fact I want to propose some (simple) parser combinator for input buffers, as a ppx, not only for speed of parsing, but to put the header values in a record and avoid the assoc list. I will give a default parser for headers that can be extended both for the server and each route.

This way you can ignore the header you don't use, but still check they are valid and I can accept custom headers and have a clear error in the log if there is an invalid header. THat't the plan, but I will probably release simple_httpd without this feature.

Thank you for the reports. Indeed, the ones about chunk encoding, etc. are very useful. I need to work on them.

The spec on Accept-encoding and Content-encoding are a bit strange.

Accept-encoding only mention some compression algorithm and not chunked or identity. This means all clients and servers must accept chunked-encoding I guess? So the current code is probably ok (I checked that after reporting the issue) and only the test should be corrected, putting no header (and not Accept-encoding chunked which appears to be invalid).

I think chunked encoding should be reserved to the case where it is costly or impossible to compute the body length. So there is probably no way for the client to force the server to send chunked encoded body?

-- Christophe Raffalli tél: +689 87 23 11 48 web: http://raffalli.eu

c-cube commented 1 year ago

Is it compliant to reject custom headers? Or do you mean you just throw the parsing result away and move on to the next line?

I actually like the chunked encoding. If your body is a stream, it's just the cleaner way to go. And clients should be able to process either fixed size or chunked bodies anyway.

craff commented 1 year ago

Simon Cruanes writes:

Is it compliant to reject custom headers? Or do you mean you just throw the parsing result away and move on to the next line?

I did not think about that. I will do what you suggest: rejecting the line and continue + message in the log. Currently I was issuing bad request, but your idea is better.

I actually like the chunked encoding. If your body is a stream, it's just the cleaner way to go. And clients should be able to process either fixed size or chunked bodies anyway.

Yes I like it too. And at last I found a clear paragraph in the spec:

""" 7.4. Negotiating Transfer Codings

The TE field (Section 10.1.4 of [HTTP]) is used in HTTP/1.1 to indicate what transfer codings, besides chunked, the client is willing to accept in the response and whether the client is willing to preserve trailer fields in a chunked transfer coding.

A client MUST NOT send the chunked transfer coding name in TE; chunked is always acceptable for HTTP/1.1 recipients. ... """

Still they mention TE which seems to be redundant with Accept-Encoding.

I understand this as "the server can use chunked or content-length and the client as no way to choose what the servor will do contrary to compressions".

-- Christophe Raffalli tél: +689 87 23 11 48 web: http://raffalli.eu

c-cube commented 7 months ago

Hi @craff, I know it's been a while, but I've been revisiting issues :). I'd be interested in some parts of this actually!

using a tree for routing (if it keeps the same GADT)
always have urlencoded strings

for the use of combinators, I'd be interested in seeing what they look like/how they're implemented but I think it belongs in a separate discussion.

c-cube / tiny_httpd

handling of route #60