Stiffstream / restinio

Cross-platform, efficient, customizable, and robust asynchronous HTTP(S)/WebSocket server C++ library with the right balance between performance and ease of use
Other
1.15k stars 93 forks source link

An idea about http_field_parser helper for RESTinio #57

Closed eao197 closed 4 years ago

eao197 commented 4 years ago

There are no helpers for dealing with values of HTTP-fields in RESTinio at the current moment. For example, if someone wants to handle file upload request, then he/she has to deal with the value of 'Content-Type' HTTP-field by him/herself. It means that the value of 'Content-Type' should be manually parsed, the presence of 'multipart/form-data" should be checked and the fragment with 'boundary' should be found and handled.

I think it's not a good situation because RESTinio is intended to be easy-to-use but still a rather performant library. So I have an idea about adding some simple helpers for parsing values of HTTP-fields. And I want to discuss some design decisions here.

There is just a sketch of how it can look like:

#include <restinio/all.hpp>
#include <restinio/helpers/http_field_parser.hpp> // NOTE: additional header!

void handle_request(const restinio::request_handle_t & req) {
   // We expect that Content-Type header is present.
   const auto content_type = req->header().value_of(restinio::http_field::content_type);

   // Try to ensure that it has 'multipart/form-data' value and 'boundary' fragment.
   std::string boundary; // A receiver for the value of 'boundary'
   if(!restinio::http_field_parser::try_parse_field_value(content_type,
         ';', // The separator that should separate fragments of HTTP-field values.
         restinio::http_field_parser::expect("multipart/form-data"),
         restinio::http_field_parser::name_value("boundary", boundary)))
      throw std::runtime_error("Unexpected value of 'Content-Type'!");

   ... // Now boundary variable contains the value of 'boundary' fragment.
}

This call to try_parse_field_value allows to parse the content of field in the form multipart/form-data; boundary=Some_Boundary_Value or multipart/form-data; boundary="Some Boundary Value".

There can be also a function try_parse_whole_field that allows to parse the whole HTTP-field, without stripped field header (as in the previous example):

std::string whole_field = "Content-Disposition: form-data; name=\"file-name\"; filename=\"demo.txt\"";
std::string name;
std::string filename;
if(restinio::http_field_parser::try_parse_whole_field(whole_field,
      "content-disposition", // Expected field name.
      ';', // The separator that should separate fragments of HTTP-field values.
      restinio::http_field_parser::expect("form-data"),
      restinio::http_field_parser::name_value("name", name),
      restinio::http_field_parser::name_value("filename", filename)))
{
   assert("file-name" == name);
   assert("demo.txt" == filename);
   ...
}

So the first question is: is this set of helper functions good enough as a starting point?

The second question is related to the way of acquiring a value from name_value(). At the current moment the value is stored to a variable of type std::string (reference to that variable is passed to name_value() function).

But that design has at least two drawbacks:

The first one is rewriting of the current value of the variable even in the case when try_parse_* function returns false. For example:

std::string whole_field = "Content-Disposition: form-data; name=\"file-name\"";
std::string name;
std::string filename;
if(restinio::http_field_parser::try_parse_whole_field(whole_field,
      "content-disposition", // Expected field name.
      ';', // The separator that should separate fragments of HTTP-field values.
      restinio::http_field_parser::expect("form-data"),
      restinio::http_field_parser::name_value("name", name),
      restinio::http_field_parser::name_value("filename", filename)))
{
   assert("file-name" == name);
   assert("demo.txt" == filename);
   ...
}

In that case try_parse_whole_field will return false, but name variable will hold file-name value after the return from try_parse_whole_field. It's a consequence of parser working scheme.

The second drawback is inability to work with other types of destinations. For example, someone may want to use std::vector<char> instead of std::string. Or something else. So maybe there should be a version of name_value that accepts inserter:

std::vector<char> boundary; // A receiver for the value of 'boundary'
if(!restinio::http_field_parser::try_parse_field_value(
      req->header().value_of(restinio::http_field::content_type),
      ';',
      restinio::http_field_parser::expect("multipart/form-data"),
      restinio::http_field_parser::name_value("boundary", std::back_inserter(boundary))))
...

But there is another approach for returning parsed values. It seems that try_parse_* can return a tuple where the first item will be bool and all following items will be std::string. In that case we can write code like that:

std::string whole_field = "Content-Disposition: form-data; name=\"file-name\"; filename=\"demo.txt\"";
const auto result = restinio::http_field_parser::try_parse_whole_field(whole_field,
      "content-disposition", // Expected field name.
      ';', // The separator that should separate fragments of HTTP-field values.
      restinio::http_field_parser::expect("form-data"),
      restinio::http_field_parser::name_value("name"),
      restinio::http_field_parser::name_value("filename"));
if(std::get<0>(result))
{
   assert("file-name" == std::get<1>(result));
   assert("demo.txt" == std::get<2>(result));
   ...
}

or, in the case of C++17 it can look like that:

std::string whole_field = "Content-Disposition: form-data; name=\"file-name\"; filename=\"demo.txt\"";
const auto & [parsed, name, filename] =
   restinio::http_field_parser::try_parse_whole_field(whole_field,
      "content-disposition", // Expected field name.
      ';', // The separator that should separate fragments of HTTP-field values.
      restinio::http_field_parser::expect("form-data"),
      restinio::http_field_parser::name_value("name"),
      restinio::http_field_parser::name_value("filename"));
if(parsed)
{
   assert("file-name" == name);
   assert("demo.txt" == filename);
   ...
}

Personally I would prefer a version that returns a tuple. But it will be great to hear other opinions.

So any constructive suggestions are welcome Please tell me what you think.

eao197 commented 4 years ago

A different approach is implemented in v.0.6.1.