Open thierry-f-78 opened 3 months ago
Hi @thierry-f-78
I love many things in this approach, although the API still seems somewhat problematic.
I love the addition of fio_router_set_middleware
and would rename it to fio_http_router_middleware_push
(like a URL, climb the name space until we reach the action).
Also, if the router is a structure, it should end with the _s
suffix (_t
is a reserved suffix and shouldn't be used unless writing POSIX library code).
Also, I would love to make the route addition even simpler if possible, making it simple to define resource routes in a way that is similar to what they do with Ruby on Rails.
It's an idiomatic use case that I encountered a lot when designing business apps.
I am missing details as well as implementation constraints to what I believe would be the ideal approach, but following what you suggest so far it might look like this:
/** The HTTP Router type*/
typedef struct fio_http_router_s fio_http_router_s;
/** Named arguments for the fio_http_router_map() functions. */
typedef struct {
/** Default opaque user data to be set for the HTTP handle (fio_http_s). */
void *udata;
/* ---- Catch-All Callback (except authentication) ---- */
/** Default handler. Missing methods will return a 404 error. */
int (*on_http)(fio_http_s *h);
/* ---- Route Filtering ---- */
/**
* If the `accept` header doesn't list this content type, route fails.
*
* This also sets the default response content-type header.
*/
fio_buf_info_s content_type;
/* ---- CRUD Callbacks ---- */
/** GET request handler. */
int (*on_get)(fio_http_s *h);
/** POST request handler. */
int (*on_post)(fio_http_s *h);
/** PUT / PATCH request handler. */
int (*on_put)(fio_http_s *h);
/** DELETE request handler. */
int (*on_delete)(fio_http_s *h);
/* ---- Parameter Handling Callbacks ---- */
/** Parameter parsing callback. */
int (*on_parameter_true)(fio_http_s *h, fio_buf_info_s name);
/** Parameter parsing callback. */
int (*on_parameter_false)(fio_http_s *h, fio_buf_info_s name);
/** Parameter parsing callback. */
int (*on_parameter_empty)(fio_http_s *h, fio_buf_info_s name);
/** Parameter parsing callback. */
int (*on_parameter_str)(fio_http_s *h,
fio_buf_info_s name,
fio_buf_info_s value);
/** Parameter parsing callback. */
int (*on_parameter_int)(fio_http_s *h, fio_buf_info_s name, int64_t value);
/** Parameter parsing callback. */
int (*on_parameter_float)(fio_http_s *h, fio_buf_info_s name, double value);
/* ---- WebSockets / SSE for specified path ---- */
/** Authenticate EventSource (SSE) requests, return non-zero to deny.*/
int (*on_authenticate_sse)(fio_http_s *h);
/** Authenticate WebSockets Upgrade requests, return non-zero to deny.*/
int (*on_authenticate_websocket)(fio_http_s *h);
/** Called once a WebSocket / SSE connection upgrade is complete. */
void (*on_open)(fio_http_s *h);
/** Called when a WebSocket message is received. */
void (*on_message)(fio_http_s *h, fio_buf_info_s msg, uint8_t is_text);
/** Called when an EventSource event is received. */
void (*on_eventsource)(fio_http_s *h,
fio_buf_info_s id,
fio_buf_info_s event,
fio_buf_info_s data);
/** Called when an EventSource reconnect event requests an ID. */
void (*on_eventsource_reconnect)(fio_http_s *h, fio_buf_info_s id);
/** Called for WebSocket / SSE connections when outgoing buffer is empty. */
void (*on_ready)(fio_http_s *h);
/** Called for open WebSocket / SSE connections during shutting down. */
void (*on_shutdown)(fio_http_s *h);
/** Called after a WebSocket / SSE connection is closed (for cleanup). */
void (*on_close)(fio_http_s *h);
/**
* A public folder for file transfers - allows to circumvent any application
* layer logic and simply serve static files.
*
* Supports automatic `gz` pre-compressed alternatives.
*/
fio_str_info_s public_folder;
/**
* The max-age value (in seconds) for possibly caching static files from the
* public folder specified.
*
* Defaults to 0 (not sent).
*/
size_t max_age;
} fio_http_router_settings_s;
/**
* Note, if `router_or_sub_router` is NULL, a new router will be created.
*/
fio_http_router_s *fio_http_router_map(fio_http_router_s *router_or_sub_router,
const char *url_with_params,
fio_http_router_settings_s settings);
#define fio_http_router_map(router, url, ...) \
fio_http_router_map(router, url, (fio_http_router_settings_s){__VA_ARGS__})
Such an approach could set multiple related routes with one function call:
void example_use_simple(void) {
fio_http_router_s *router;
router = fio_http_router_map(NULL,
"/",
.on_http = my_404_handler,
.on_get = welcome_home);
fio_http_router_map(router,
"/users",
.on_get = users_list,
.on_post = users_new_or_update,
.on_put = users_404_handler,
.on_delete = users_404_handler);
fio_http_router_map(router,
"/users/:id",
.on_get = users_show,
.on_post = users_new_or_update,
.on_put = users_update,
.on_delete = users_remove);
/* '*' in root will catch all. */
fio_http_router_map(router, "*", .on_http = my_404_handler);
}
Or:
void example_use_with_sub_routes(void) {
fio_http_router_s *router;
fio_http_router_s *users_router;
router = fio_http_router_map(NULL,
"/",
.on_http = my_404_handler,
.on_get = welcome_home);
users_router = fio_http_router_map(router,
"/users",
.on_get = users_list,
.on_post = users_new_or_update,
.on_put = users_404_handler,
.on_delete = users_404_handler);
fio_http_router_map(users_router,
"/:id",
.on_get = users_show,
.on_post = users_new_or_update,
.on_put = users_update,
.on_delete = users_remove);
/* '*' in root will catch all. */
fio_http_router_map(router, "*", .on_http = my_404_handler);
}
And perhaps we could support "optional" parameters:
void example_use_with_optional_parameters(void) {
fio_http_router_s *router;
router = fio_http_router_map(NULL,
"/",
.on_http = my_404_handler,
.on_get = welcome_home);
fio_http_router_map(router,
"/users(:id)", /* optional :id parameter */
.on_get = users_list_or_show, /* behaves differently if `id` was supplied*/
.on_post = users_new_or_update,
.on_put = users_update,
.on_delete = users_remove);
/* '*' in root will catch all. */
fio_http_router_map(router, "*", .on_http = my_404_handler);
}
Of course, how do we link the fio_http_router_s
with the fio_http_listen
is still a mystery for me, but... :)
As for the implementation details:
Using a tree with a method at the root of that tree would break possible custom method names.
This happens and we have to support it for one of our clients, when they have special (non-standard) method names sent by their client-side app. This is what we should allow a catch-all for each path that is method name agnostic.
As for the default behavior: I would like to allow developers to define a catch-all route.
I would also like to find a solution to possible language defying routes and similar optional segments (i.e., /(:lang)
), where that segment is ignored by subsequent routes... i.e.:
void example_use_of_optional_parameter_at_start_of_url(void) {
/* set a catch all for all routes. Maybe this should*/
fio_http_router_s *router;
router = fio_http_router_map(NULL,
"/(:lang=\?\?)", /* 2 letters and optional */
.on_http = test_language_or_ignore,
.on_parameter_str = set_language);
fio_http_router_map(router,
"/", /* performed after `lang` was set */
.on_http = my_404_handler,
.on_get = welcome_home);
/* '*' in root will catch all. */
fio_http_router_map(router, "*", .on_http = my_404_handler);
}
Once an optional segment was extracted, the rest of the routers should ignore the extracted segment.
In other words, if an HTTP handler callback returns an error, the next matching route should be tested. Any consumed parameters should be ignored when continuing to the next route.
I love the addition of fio_router_set_middleware and would rename it to fio_http_router_middleware_push (like a URL, climb the name space until we reach the action).
👍
Also, if the router is a structure, it should end with the _s suffix (_t is a reserved suffix and shouldn't be used unless writing POSIX library code).
👍
Also, I would love to make the route addition even simpler if possible, making it simple to define resource routes in a way that is similar to what they do with Ruby on Rails.
CRUD is designed to reflect basic database operations: SELECT/GET, INSERT/POST, UPDATE/PUT and DELETE/DELETE. This is very useful in some applications, but greatly limits the possibilities. If you limit the use of a router to CRUD, it excludes everything that isn't CRUD, which is 3/4 of the internet.
For example:
In my opinion, a router should allow implementing CRUD, but shouldn't be limited to that, otherwise it can't be widely adopted. The router should allow the user to do what they want.
We can also note the (non-exhaustive) specified methods are: GET, POST, PUT, DELETE, HEAD, OPTIONS, TRACE (RFC 7231), PATCH (RFC 5789), PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK (RFC 4918).
And most importantly, RFC 7231 states: "HTTP methods are extensible. HTTP implementations MUST be prepared to process methods they do not recognize as being safe and idempotent methods..."
So I think the methods should not be limited and hard-coded, but left to the developer's initiative.
Moreover, most developments do not respect these standards, at least not completely. In a structured project, you always find an exception. You also find legacy code to respect.
It's preferable to handle 404 and 405 with fallback functions.
Using a tree with a method at the root of that tree would break possible custom method names.
Not necessarily, the method at the root is not hard-coded. The root can be a sorted list of methods in the form of strings. Alphabetical ordering is done at compilation and allows for a binary search to match the method (Maybe unnecessary, because for 10 entries it might be faster to traverse a list than to do a binary search). Other approaches may consist of quickly processing standard methods (GET, POST, ...) and using a fallback method for the rest.
This happens and we have to support it for one of our clients, when they have special (non-standard) method names sent by their client-side app. This is what we should allow a catch-all for each path that is method name agnostic.
👍
As for the default behavior: I would like to allow developers to define a catch-all route.
👍
Once an optional segment was extracted, the rest of the routers should ignore the extracted segment.
This is exactly the role of sub-routers. The goal is to ignore what has already been processed by the parents.
In other words, if an HTTP handler callback returns an error, the next matching route should be tested. Any consumed parameters should be ignored when continuing to the next route.
Indeed, each registered segment is evaluated in the order of registration, and if an evaluation fails, we move on to the next one. If all evaluations fail, we look for a fallback for the router.
Return of the routing function
I wonder if the routing function shouldn't return an array with the list of functions to execute rather than calling callbacks. The advantages are:
The router doesn't manage the execution of the request, it just provides what needs to be executed.
The developer implements their execution as they wish. The two main execution modes are asynchronous/non-blocking and synchronous/blocking. It is very difficult to make these two modes coexist in the same API. The non-blocking mode is rather complex in C, especially for an application server that will make numerous API calls and database accesses. The synchronous system is simpler, but less performant.
Error processing
The error reporting system should also be worked on. The idea is to carry in the function prototypes a system of argumentative error reporting. Something like this:
int a(char **err) {
if (detect-error) {
explain_error(err, "error in 'a' at char %d", line);
return -1;
}
}
int b(char **err) {
if (a(err) == -1) {
explain_error(err, "error in 'b': %s", *err);
}
}
int c(char **err) {
if (b(err) == -1) {
explain_error(err, "error in 'c': %s", *err);
}
}
int main() {
char *err = NULL;
if (c(&err) == -1) {
printf("error: %s\n", err)
}
}
The explain_error() function takes care of crafting the error messages and especially making them inherit from each other in order to make a complete message easy to understand for the developer (or the application administrator).
In the above example, the returned message is:
"error: error in 'c': error in 'b': error in 'a' at char 22\n"
Of course, this message is ugly, the idea is just to illustrate the inheritance of errors.
So I think the methods should not be limited and hard-coded, but left to the developer's initiative.
I agree, but I also think that some idiomatic patterns should be easier to implement.
The OPTIONS and PATCH methods could definitely be added as independent callbacks, but I suggest that everything that is less idiomatic can be handled in the on_http
callback that can check methods and perform additional routing.
Also, we could easily add method
to the filter options:
typedef struct {
/** ....... */
/* ---- Route Filtering ---- */
/**
* If the `accept` header doesn't list this content type, route fails.
*
* This also sets the default response content-type header.
*/
fio_buf_info_s content_type;
/**
* Filters away any requests in which the HTTP method isn't one of the methods in this comma separated buffer.
*/
fio_buf_info_s methods;
/** ....... */
} fio_http_router_settings_s;
It's preferable to handle 404 and 405 with fallback functions.
I agree, but sometimes you want to handle a 404 inside a relative path, such as within an admin console or other part of the app.
After all routes fail, 404 will be the obvious response, but other than that it would be nice to have the option to answer with 404 while still keeping a user within a specific scope.
This is exactly the role of sub-routers. The goal is to ignore what has already been processed by the parents.
Yes, but sub-routers ignore other possible behaviors, such as middleware that re-writes the path (i.e., if someone implemented a language middleware in some way).
I wonder if the routing function shouldn't return an array with the list of functions to execute rather than calling callbacks. The advantages are:
As a rule, I want to place a little trust as we can in the developer's hands. Trust in developers – especially when it comes to teams and each of them finding a different solution / approach to the same issue – is often a cause for frustration and user-related bug reports.
I believe it is our responsibility, as authors, to provide the safest and most novice friendly (user-proof) way to handle things.
This is why I think that we should run all callbacks in the router and stop the callback cain if one of them returns a non-zero value.
Callbacks could use the fio_env_set
for cleanup code (as the cleanup is called when the connection is disconnected) or use the on_finish
and on_open
callbacks for response specific cleanup.
Middleware is obviously biased towards linear code, but if we control the middleware design, we can make sure cleanup is always performed.
Error processing
That's something I have yet to think about.
In general, there's a difference between parser / library detected errors (we need to explain) and errors the user reports (they should log the error so they know what happened, we just need to know an error occurred).
My tendency is to provide error logging using FIO_LOG_XXXX
(i.e., FIO_LOG_WARNING
or FIO_LOG_ERROR
or FIO_LOG_FATAL
) and return the error without providing access to the specifics of the error. This isn't always ideal and it is context specific, but that's so far the main approach.
I believe code should simply run or simply break.
Code that breaks should be secure – i.e., breaking on external input should never lead to memory leaks or a crash while internal code errors should prevent compilation (ideal) or prevent the server from starting (crash the program during the setup stage).
Following this thread : https://github.com/boazsegev/facil.io/issues/155
After reading fio-stl, I better understand your position in general, and more particularly regarding regexps.
I therefore identify the wrappers strtol for integers and glob for wildcards. I will remove the regexps (anyway, regexes are heavy, slow and difficult to debug, but they make life easier for the end user). The other types I propose are trivial.
However, I don't see anything in fio-stl for timestamps.
Here's a proposed API.
I wonder if the router shouldn't return a list of "void *" that the user can execute as they want (blocking or non-blocking).
I wonder if the error reporting system should be rethought. Indeed, when parsing something complex, the user needs an error message that is more elaborate than "ok" / "nok".