getml / reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection. Supports JSON, BSON, CBOR, flexbuffers, msgpack, TOML, XML, YAML / msgpack.org[C++20]
https://getml.github.io/reflect-cpp/
MIT License
1.03k stars 88 forks source link

How come `to_ptr_named_tuple()` forces everything into a pointer upon writing? #218

Open dmikushin opened 5 days ago

dmikushin commented 5 days ago

Dear developers, I'm completely incompetent in modern C++. But this aspect is a strange design to me:

Suppose I'm writing a struct A { struct B b; }; into JSON. What reflect-cpp does is basically turning it into ('B', struct B* b) named tuple, using to_ptr_named_tuple(). Further on, the tuple of pointers ends up being handled by Parser_ptr.hpp, which makes back a value from the pointer:

  template <class P>
  static void write(const W& _w, const T* _ptr, const P& _parent) noexcept {
    if (!_ptr) {
      ParentType::add_null(_w, _parent);
      return;
    }
    Parser<R, W, std::remove_cvref_t<T>, ProcessorsType>::write(_w, *_ptr,
                                                                _parent);
  }

The comment says that plain pointers are bad, but the reflect-cpp itself artificially creates the use of plain pointers, and then eventually strips them off. This feels absurd, it's either something not right with my understanding or with the library design. Kindly, please help me to clarify the purpose of the to_ptr_named_tuple()-based technique. And an extra hard question: how to get rid of it? Thanks!

dmikushin commented 5 days ago

I'm investigating my concern further by trying to replace the named tuple of pointers by just a named tuple:

--- a/include/struct2json/parsing/Parser_default.hpp
+++ b/include/struct2json/parsing/Parser_default.hpp
@@ -17,7 +17,7 @@
 #include "../internal/is_literal.hpp"
 #include "../internal/is_validator.hpp"
 #include "../internal/processed_t.hpp"
-#include "../internal/to_ptr_named_tuple.hpp"
+#include "../to_named_tuple.hpp"
 #include "../to_view.hpp"
 #include "../type_name_t.hpp"
 #include "AreReaderAndWriter.hpp"
@@ -92,11 +92,11 @@ struct Parser {
         Parser<R, W, ReflectionType, ProcessorsType>::write(_w, r, _parent);
       }
     } else if constexpr (std::is_class_v<T> && std::is_aggregate_v<T>) {
-      const auto ptr_named_tuple = ProcessorsType::template process<T>(
-          internal::to_ptr_named_tuple(_var));
-      using PtrNamedTupleType = std::remove_cvref_t<decltype(ptr_named_tuple)>;
+      const auto named_tuple = ProcessorsType::template process<T>(
+          to_named_tuple(_var));
+      using PtrNamedTupleType = std::remove_cvref_t<decltype(named_tuple)>;
       Parser<R, W, PtrNamedTupleType, ProcessorsType>::write(
-          _w, ptr_named_tuple, _parent);
+          _w, named_tuple, _parent);
     } else if constexpr (std::is_enum_v<T>) {
       using StringConverter = internal::enums::StringConverter<T>;
       const auto str = StringConverter::enum_to_string(_var);

So far, I don't see anything wrong in my results.

liuzicheng1987 commented 4 days ago

Well, this function is for internal use only, end users shouldn't use it. The ptr stands for pointer.

It is necessary, because we need a structure we can actually traverse at compile time (like a named tuple), but at the same time creating deep copies of everything is not only very inefficient, but it is also impossible for some types, such as unique_ptr.

Using pointers is fine as long as we can guarantee that the thing being pointed to exists at least as long as the pointer.

When you say there is a comment that pointers are bad, what are you referring to? Where do we say this?

dmikushin commented 4 days ago

When you say there is a comment that pointers are bad, what are you referring to? Where do we say this?

   static Result<T*> read(const R&, const InputVarType&) noexcept {
     static_assert(always_false_v<T>,
                   "Reading into raw pointers is dangerous and "
                   "therefore unsupported. "
                   "Please consider using std::unique_ptr, rfl::Box, "
                   "std::shared_ptr, "
                   "rfl::Ref or std::optional instead.");
     return Error("Unsupported.");
   }

This comment about ptr Reader strongly discourages the use of pointers, however the ptr Writer is used internally.

dmikushin commented 4 days ago

It is necessary, because we need a structure we can actually traverse at compile time (like a named tuple), but at the same time creating deep copies of everything is not only very inefficient, but it is also impossible for some types, such as unique_ptr.

Makes sense, so the purpose of internal::to_ptr_named_tuple is to make named tuples without copying the data. And my patch above not only creates copies, but also should fail for std::unique_ptr.

How about making a reflect-cpp's specific internal smart pointer to definitively distinguish between the internal use of pointers and user-defined pointers? I need to free-up Parser_ptr.hpp from internal usage, because I must support C structs with plain pointers, effectively going against the comment above :)

liuzicheng1987 commented 4 days ago

When you say there is a comment that pointers are bad, what are you referring to? Where do we say this?


   static Result<T*> read(const R&, const InputVarType&) noexcept {

     static_assert(always_false_v<T>,

                   "Reading into raw pointers is dangerous and "

                   "therefore unsupported. "

                   "Please consider using std::unique_ptr, rfl::Box, "

                   "std::shared_ptr, "

                   "rfl::Ref or std::optional instead.");

     return Error("Unsupported.");

   }

This comment about ptr Reader strongly discourages the use of pointers, however the ptr Writer is used internally.

Yes, reading into pointers is strongly discouraged, because it is not clear what you expect the library to do.

Do you, the user, expect the library to allocate resources to the raw pointer that you would then have to manually delete? Or have you already allocated something and you expect the library to use that?

It is not at all clear, and both solutions are bad. The problem is that it is not clear who is responsible for handling the pointers and the resources they point to (the library or the user).

However, writing from pointers is a different story. Here it is very clear that the user is responsible for handling the pointers and the resources.

Moreover, when the library uses pointers internally, I don't really see a problem either, because it is clear that it is the library's job to manage them.

dmikushin commented 8 hours ago

Thank you for this discussion! I completely agree with you. The thing is, I'm trying to apply your technology to areas of application different from C++. You'll agree, a lot of important C code needs a similar library (and the C standard itself won't ever allow for the same functionality). Moreover, in CUDA cores, C-style pointers are mainly used by established practice, even though the CUDA compiler is C++ compatible. So, no matter how much we'd like to stay in the beautiful crystal palace of C++, to reach important applications we need to step out at least onto its porch.

liuzicheng1987 commented 5 hours ago

@dmikushin , so what you are asking for is support for C-style pointers as well?

dmikushin commented 4 hours ago

Yes. I tried to implement it, but it was blocked by the internal use of pointers. Hence is my question here.