breese / trial.protocol

Network wire protocols
11 stars 4 forks source link

Simple selective extraction of fields #16

Open vinipsmaker opened 7 years ago

vinipsmaker commented 7 years ago

Hey,

a friend of mine ( @skhaz ) showed me a very interesting JSON library for C: https://github.com/cesanta/frozen

I thought it could be useful as an inspiration design for higher-level abstractions.

breese commented 7 years ago

Interesting approach to use a printf/scanf-like API. Something similar would be fairly straightforward to create with trial.protocol.

Thanks for the reference.

vinipsmaker commented 6 years ago

Another interesting C library to parse JSONs:

But this time it doesn't serve as an inspiration to Trial.Protocol. Trial.Protocol already has integration with Boost.Serialization, so nothing missing on our side.

vinipsmaker commented 6 years ago

Another interesting C++ library. This time, the interesting bit is the format side: https://github.com/bcsanches/JsonCreator.

vinipsmaker commented 6 years ago

Now that #23 was closed, I think it'll be even easier to implement this one.

I think I'll implement this next (but it'll take a few weeks as I'll allocate some time to work on other stuff). What do you guys think about the following interface?

int x;
boost::variant<int, std::string, boost::none_t> y;
boost::optional<std::vector<int>> z;
auto h = [](const json::reader &reader) { /* a callback */ };
boost::string_view lit;

if (!json::scan(json_view,
                "foo", x,
                "bar.baz", y,
                "bar.a_list", z,
                "bar", h,
                "bar", lit)) {
  std::cerr << "missing field" << std::endl;
}

Every argument that is not boost::optional or boost::variant<boost::none_t, ...> is treated as required (i.e. json::scan will return false if this field is not found in the input json_view). If value field in json_field is null, then it is treated as if it was not there (and then only boost::optional<T> and boost::variant<boost::none_t, ...> can accept the (absence of this) value).

You can pass functors receiving const json::reader& if you want a callback called when this token is reached.

If you want a literal of some value, just pass a string_view.

The same field can be extracted into two or more targets (then you can, for instance, get a list of strings and the literal associated with this list at once).

More complex types like std::vector<boost::variant<boost::none_t, int>> are also supported.

My greatest concern is related to JSON objects like { "foo.bar": 42 } (i.e. if the user wants to read a field that has a dot). For such cases, I was thinking about using a tuple as an alternative path spec:

json::scan(json_view,
           "bar.foo", x,
           std::make_tuple("bar", "foo"), y,
           std::make_tuple("foo.bar"), z);

I like this idea because the common case (no . in fields) is less verbose and no escape syntax is required for the user to learn (but I'd probably write a json::make_path so it'd be easier for me to delve into the required TMP to make it work).

Oh, of course, trial::dynamic::variable is also supported, but I'll wait for the tree parsing feature #21 (and, as I said, I will take time to work on other stuff, so really plenty of time before I start this and no rush to merge branches).

Also, one last thing: I think I'd write a partial::scan and implement scan in terms of partial::scan.

I'll resist the temptation to propose the name magic_scan.

vinipsmaker commented 6 years ago

Just an update on my previous idea.

Therefore, I thought that this feature should be turned down. I have an update to my suggestion.

Add two functions:

partial::read_to can be easily implemented in terms of the latter.

If user wants to read some field to multiple targets at once, he'll have to opt in explicitly like this:

boost::optional<int> x;
boost::optional<std::string> y;
json::scan(json_view,
           "foo.bar", [&] (json::reader &reader) {
               return partial::read_to_many(reader, x, y);
           });

Also, callback now receives non-const reader reference. json::scan will store a pointer to current token before calling the callback. Then the callback can choose to optionally consume the token itself or return the control w/o consuming token to let json::scan itself consume it.

One more feature:

std::string current_key;
json::scan(json_view,
           "foo.bar", x,
           std::make_tuple("foo", current_key), [&](json::reader&) {
               return !current_key.starts_with("_");
           });

It should be self-explanatory. Wildcard match on fields not matched by other keys on nesting level of the path spec. std::string& should be allowed only on the tail of the path spec.

Also, this feature (callback and matching on any nesting level or specific key I want) means I no longer see a use for #26.

I'll open separate issues to track partial::read_to_many(json::reader&, T...) and partial::read_to(json::reader&, T). I can use them to ease implementation, then I'll implement them first.

breese commented 6 years ago

I still haven't given much thought to this feature, but here is some arbitrary feedback...

Why should the callback signal an error with a boolean return value rather than by throwing an exception?

While I do agree that we could avoid the dotted-string syntax, we should be careful not to create a solution that could be confused with JSONPath or JSON Pointer. These are better handled with recursive iterators on dynamic::variable.

How does read_to() differ from reader::value()?

vinipsmaker commented 6 years ago

Why should the callback signal an error with a boolean return value rather than by throwing an exception?

Using exceptions here does make sense. I'll of course change json::scan to use exceptions instead a boolean return as well. Good point.

How does read_to() differ from reader::value()?

read_to() reads a JSON value, not a JSON token:

screenshot-2018-3-24 json

The other difference is just cosmetic (result passed as a by-ref arg instead of return value).

breese commented 6 years ago

So how does read_to() differ from json::iarchive?

vinipsmaker commented 6 years ago

So how does read_to() differ from json::iarchive?

I'm not familiar with Boost.Serialization to be honest. I went to take a more serious look now that you mentioned. It seems to me that json::iarchive should do.

Sorry about the late answer.

vinipsmaker commented 6 years ago

I'm gonna focus on finishing Boost.Http parser now (expose chunk extensions and then add parsers combinators), so it'll be some time until I contribute to this project again.