WhiZTiM / UbjsonCpp

A high performance C++14 library for effortlessly reading and writing UBJSON

Add support for strongly typed arrays. #9

Open tnovotny opened 6 years ago

tnovotny commented 6 years ago

Support for homogeneous arrays is missing; see StreamWriter<StreamType>::append_array:

    //! \todo TODO... detect homogeneous arrays and treat accordingly
    //! WORK in progress

I think you made this difficult because you got the binary type wrong, i.e. there is not one binary type; rather, each strongly typed array is its own binary type. One should be able to write something like:

    std::vector<int> ints = foo();
    ubjson::Value value;
    value["ints"] = ints;

and that should serialize into

    [{][i][4][ints][[][$][I][#][i][5][1][2][3][4][5]

Note the [$][I]. The type of the array should be kept as passed in so that it is received that way.
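To make the byte layout above concrete, here is a minimal sketch of writing the array part of that example. It follows the UBJSON optimized-container layout: `[`, `$`, the element type, `#`, the count, then the raw payload with no per-element markers and no closing `]`. The helpers `append_int16_be` and `write_int16_array` are hypothetical names and are not part of UbjsonCpp. The sketch uses the `I` (int16) marker from the example; a 32-bit `int` would typically map to `l` instead. It also assumes at most 127 elements so that the count fits in an `i` (int8).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not UbjsonCpp API): append a 16-bit signed integer
// in big-endian byte order, which is the order UBJSON uses for numbers.
inline void append_int16_be(std::string& out, std::int16_t v) {
    const auto u = static_cast<std::uint16_t>(v);
    out.push_back(static_cast<char>((u >> 8) & 0xFF));  // high byte first
    out.push_back(static_cast<char>(u & 0xFF));         // then low byte
}

// Hypothetical helper (not UbjsonCpp API): serialize a vector as a strongly
// typed UBJSON array of int16. Layout: [ $ I # i <count> <payload...>
// The count makes a closing ']' unnecessary.
inline std::string write_int16_array(const std::vector<std::int16_t>& values) {
    // Assumption of this sketch: the count must fit in an int8 ('i').
    assert(values.size() <= 127);

    std::string out;
    out += "[$I#i";
    out.push_back(static_cast<char>(values.size()));
    for (std::int16_t v : values) {
        append_int16_be(out, v);  // payload only, no per-element type marker
    }
    return out;
}
```

For five elements this produces 6 header bytes followed by 10 payload bytes, compared with 3 bytes per element (marker plus value) when each item carries its own `I` marker.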

WhiZTiM commented 6 years ago

You are right. In other words:

    std::vector<unsigned char> data = getBytes();
    ubjson::Value value;
    value["data"] = data;

should be the "way" to write binary types. Good! My problem is reading it. The library currently stores arrays in this form:

    //! A pattern type alias, i.e (std::unique_ptr<Value>)
    using Uptr = std::unique_ptr<Value>;

    //! An alias used to internally represent \ref Type "Array" types
    using ArrayType = std::vector<Uptr>;

few lines after this

This is the "holy grail" of performance suckers. :-( Prior to C++17, simply using std::vector<Value> would have been undefined behavior, since Value is still an incomplete type where the alias is declared. With such indirection in place, an array of small items has serious performance implications. However, digging into the common implementations of loosely typed languages (JavaScript and Python), where objects of various types can be mixed in a single array, shows that some performance cost is unavoidable.

Reading a strongly typed UBJSON array into C++ wouldn't be much of a performance concern. For our Value class, however, it would incur the extra cost of more "types" to check at run time. That is something we can probably live with, and it's much better than the overhead of allocating each array item with new.
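One way to picture the trade-off described here is a value type that keeps the existing pointer-per-element storage for mixed arrays and adds contiguous storage for typed ones. The following is only a sketch under stated assumptions, not the library's design: it uses C++17 `std::variant` (UbjsonCpp targets C++14), and the names `Value`, `MixedArray`, `Int16Array`, `ByteArray`, `array_size`, and `int16_at` are hypothetical stand-ins for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <variant>
#include <vector>

struct Value;  // forward declaration; incomplete until the definition below

// Hypothetical storage alternatives (not UbjsonCpp's actual types).
using MixedArray = std::vector<std::unique_ptr<Value>>;  // one allocation per item
using Int16Array = std::vector<std::int16_t>;            // contiguous, typed
using ByteArray  = std::vector<std::uint8_t>;            // contiguous, typed

struct Value {
    // Each typed array is one more alternative to check at run time.
    std::variant<std::monostate, MixedArray, Int16Array, ByteArray> data;
};

// Element count for any array alternative; 0 for an empty (monostate) value.
inline std::size_t array_size(const Value& v) {
    return std::visit(
        [](const auto& alt) -> std::size_t {
            using T = std::decay_t<decltype(alt)>;
            if constexpr (std::is_same_v<T, std::monostate>) {
                return 0;
            } else {
                return alt.size();
            }
        },
        v.data);
}

// Read one element of a typed int16 array directly from contiguous storage.
// Precondition: v holds an Int16Array and i is in range.
inline std::int16_t int16_at(const Value& v, std::size_t i) {
    const auto* arr = std::get_if<Int16Array>(&v.data);
    assert(arr != nullptr && i < arr->size());
    return (*arr)[i];
}
```

The run-time cost shows up as the extra alternatives that `std::visit` or `std::get_if` must dispatch over, while the typed alternatives avoid a heap allocation per element.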

I will need to significantly improve the library to store strongly typed arrays efficiently. Thank you for bringing this issue up.