boostorg / serialization

Boost.org serialization module
http://boost.org/libs/serialization
120 stars 139 forks source link

[Feature request] Automatic aggregate serialization #238

Open denzor200 opened 3 years ago

denzor200 commented 3 years ago

Hi. I tried to create simple method serialize_aggregate using boost.pfr library (new reflection library from Antony Polukhin)

template<class Archive, class Aggregate>
void serialize_aggregate(Archive& ar, Aggregate& ag)
{
    boost::pfr::for_each_field(ag, [&ar](auto&& field) {
        ar & field;
    });
}

This method allows me to describe serializable entities more compact:

struct gps_position
{
    int degrees;
    int minutes;
    float seconds;
    template<class Archive>
    void serialize(Archive& ar, const unsigned int /* file_version */) {
        serialize_aggregate(ar, *this);
    }
};
std::ostream& operator<<(std::ostream& os, const gps_position& gp)
{
    return os << ' ' << gp.degrees << (unsigned char)186 << gp.minutes << '\'' << gp.seconds << '"';
}

Why not support for this method on the library side?

denzor200 commented 3 years ago

unfortunately, i have no idea how can a handle version arg with this new style((

correaa commented 3 years ago

Great idea. One problem is that serialization for certain Archives in Boost.Serialization (e.g. XML) requires an extra level of instrospection to get the field name or, more simply, at least have a way to generate names automatically if not provided (Automatic tag generation in my opinion is something that the library should provide by default.)

Second, I wouldn't bother with versioning aggregates. My perspective is that aggregates shouldn't be subject to versioning because in that case you have to start defaulting values which is contrary to the idea of an aggregate. (That is the price of using aggregates).

denzor200 commented 3 years ago

One problem is that serialization for certain Archives in Boost.Serialization (e.g. XML) requires an extra level of instrospection to get the field name or, more simply, at least have a way to generate names automatically if not provided (Automatic tag generation in my opinion is something that the library should provide by default.)

struct gps_position
{
    field<int, name="degrees"> degrees;
    field<int, name="minutes"> minutes;
    field<float, name="seconds"> seconds;
    template<class Archive>
    void serialize(Archive& ar, const unsigned int /* file_version */) {
        serialize_named_aggregate(ar, *this); ///< TODO: implement serialize_named_aggregate
    }
};
std::ostream& operator<<(std::ostream& os, const gps_position& gp)
{
    return os << ' ' << gp.degrees << (unsigned char)186 << gp.minutes << '\'' << gp.seconds << '"';
}

We need to:

  1. field<T, A> template class, this class is just a wrapper to T with std::reference_wrapper-like(??) iface.
  2. name object with overloaded operator=(std::string_view), and this overloaded operator must be constexpr
robertramey commented 3 years ago

After taking a short look at pfr documentation, Wouldn't something like this be as simple as possible?

template<class Archive, class Aggregate>

void boost::serialization::serialize(Archive& ar, Aggregate& ag){ boost::pfr::for_each_field(ag, [&ar](auto& field) { ar & field; } ');`

PS - how did you get the well formatted code into the GitHub comment?

denzor200 commented 3 years ago

Wouldn't something like this be as simple as possible?

Yes, rvalue reference in my example is redundant. Lvalue enough.

how did you get the well formatted code into the GitHub comment?

  1. Paste your code
  2. Select your pasted code
  3. Press "Insert code <ctrl+e>"
  4. Look at the well formatted code in the "Preview" section
denzor200 commented 3 years ago

Finally, i successfully implemented field<T, A> and name for C++17. Interface for field<T,A> is not ideal, but this short example shows that structure with named fields are possible: http://godbolt.org/z/qz3TPo8Pb

denzor200 commented 3 years ago
template<class Archive, class Aggregate>
void serialize_named_aggregate(Archive& ar, Aggregate& ag)
{
    boost::pfr::for_each_field(ag, [&ar](auto& field) {
        using TField = std::decay_t<decltype(field)>;
        // The field_t<T,A>::name() method guarantees us that the returned std::string_view always refers to a zero-terminated string.
        ar& boost::serialization::make_nvp(TField::name().data(), field.data());
    });
}
denzor200 commented 3 years ago
#ifdef _STR
#  error _STR already defined
#endif
#define _STR(S) BOOST_METAPARSE_STRING(S){}

struct gps_position
{
    field<int, (name = _STR("degrees"))> degrees;
    field<int, (name = _STR("minutes"))> minutes;
    field<float, (name = _STR("seconds"))> seconds;

    template<class Archive>
    void serialize(Archive& ar, const unsigned int /* file_version */) {
        serialize_named_aggregate(ar, *this);
    }
};

// TODO: use BOOST_PFR_FUNCTIONS_FOR
std::ostream& operator<<(std::ostream& os, const gps_position& gp)
{
    return os << ' ' << gp.degrees.data() << (unsigned char)186 << gp.minutes.data() << '\'' << gp.seconds.data() << '"';
}
robertramey commented 3 years ago

Looking at this a little more, I've concluded that it's kind of tricky to do right. In any case, I think it would be a much better addition to the PFR library than the serialization library.

correaa commented 3 years ago

Looking at this a little more, I've concluded that it's kind of tricky to do right. In any case, I think it would be a much better addition to the PFR library than the serialization library.

What exactly do you think it is hard to get right? i) Serializing aggregates in general or ii) generating the named-value pairs for specific archives that need them.

I think that serializing aggregates is well defined an this approach should work, although will make Boost Serialization depend on Boost PFR (which is c++14 only I think, maybe in a couple of versions of the languages we will not even depend PFR to do this.)

I think the problem of naming fields is more serious. In any case, using PFR or not (in some way), a key enabling feature in Boost Serialization would be to relax the requirement that XML archives (and therefore generic archives in general) use named-value pairs and that the tag names are generated automatically when needed.

I brought up this issue a couple of times before. The need for named-value pairs right now generates an spurious and early dependency on Boost (or Boost Serialization) even before one decides to use Boost Serialization at all with a class). Note that one could write serialization code generically without including any Boost library. However when one needs to make this generic code work (in a possible future) with XML-like archives, one needs to depend on including a Boost header (nvp).

There are two solutions to this problem 1) make named value pairs optional (tags with names are nice to have but not a requirement). 2) make XML-like archives take a generic named value pair-like class, rather than an specific hard coded type. Solution 1) would make this class field unnecessary and can work with any version of C++. Solution 2) would need some sfinae-magic that is very hard to get right in C++98 but trivial in C++11 as we discussed earlier. 1) and 2) are not exclusive.