stephenberry / glaze

Extremely fast, in memory, JSON and interface library for modern C++
MIT License

Raw binary support #649

Closed: stephenberry closed this issue 10 months ago

stephenberry commented 10 months ago

Glaze supports BEVE for JSON-like binary that can be used for storage and inspection. However, I also want to add support for raw binary without tags. This way we can implement the fastest messaging possible across compilers with a matching ABI. I am all for breaking the C++ ABI with new compiler versions (though this rarely happens), but for the same compiler version in performance-critical applications, raw binary is the fastest we can get.
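
As a rough illustration of what "raw binary without tags" means here, the sketch below copies the bytes of a trivially copyable struct straight into a buffer and back. It is not Glaze code: `Sample`, `write_raw`, and `read_raw` are hypothetical names invented for this example, and the round trip is only valid when writer and reader share the same object layout (the matching-ABI condition above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical message type, used only for illustration.
struct Sample {
   std::uint32_t id;
   double value;
};

// Copy the object's bytes as-is: no tags, no keys, no length prefix.
template <class T>
std::vector<std::byte> write_raw(const T& value)
{
   static_assert(std::is_trivially_copyable_v<T>, "raw copy requires a trivially copyable type");
   std::vector<std::byte> buffer(sizeof(T));
   std::memcpy(buffer.data(), &value, sizeof(T));
   return buffer;
}

// Rebuild the object from the bytes. This assumes the writer used the
// same layout (padding, alignment, endianness) as the reader.
template <class T>
T read_raw(const std::vector<std::byte>& buffer)
{
   static_assert(std::is_trivially_copyable_v<T>, "raw copy requires a trivially copyable type");
   assert(buffer.size() == sizeof(T));
   T value;
   std::memcpy(&value, buffer.data(), sizeof(T));
   return value;
}
```

Note that the buffer includes any padding bytes in the struct, which is one reason this approach is tied to a single ABI.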

Currently Glaze uses the names read_binary and write_binary for BEVE. I think these should change to read_beve and write_beve, with the naming updated elsewhere to match. The old names should be deprecated, and the new names should be in use for a significant amount of time before we introduce raw binary support.

Raw binary should then use the names read_binary and write_binary, but only after the old BEVE usage has been deprecated for a significant amount of time.

stephenberry commented 10 months ago

After giving this some more thought, I've realized that the only thing we would really want to be different from a normal BEVE implementation is to drop tags (keys) for structs. If we use BEVE arrays for structs, then we reduce memory consumption and have an excellent binary format that would match most raw implementations, and we don't need another specification. We can still be 100% BEVE conformant and just not write out the keys.

All we need is a compile time option in glz::opts such as write_structs_as_arrays. This can also be applied to JSON. But, now binary will work across compilers and operating systems, and has all the benefits of using BEVE without paying for tags.
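
To make the keys-versus-array difference concrete, here is a small hand-written sketch using JSON text, since the same idea applies there. It does not call Glaze: `Point`, `write_keyed`, and `write_as_array` are hypothetical helpers that only show the two shapes of output.

```cpp
#include <string>

// Hypothetical struct, used only for illustration.
struct Point {
   int x;
   int y;
};

// Object form: every value is preceded by its key.
std::string write_keyed(const Point& p)
{
   return "{\"x\":" + std::to_string(p.x) + ",\"y\":" + std::to_string(p.y) + "}";
}

// Array form: values only, in declaration order; the reader must know the order.
std::string write_as_array(const Point& p)
{
   return "[" + std::to_string(p.x) + "," + std::to_string(p.y) + "]";
}
```

The array form drops the cost of the keys, but both sides must then agree on the member order of the struct.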

kalradivyanshu commented 10 months ago

That sounds great! How far along is this implementation? If you can guide me, I can work on a PR.

stephenberry commented 10 months ago

@kalradivyanshu, thanks for offering to make a PR. In glz::opts a boolean write_structs_as_arrays needs to be added. We then need structs that satisfy the concepts reflectable or glaze_object_t to behave similarly to how std::tuple is serialized. A minor challenge is that the concepts are not associated with glz::opts, so I'm not sure of the best way to avoid code duplication. For reflectable types we can call to_tuple and then serialize this tuple, which should be really straightforward. For glaze_object_t we'll probably need a bit more code. If you want to try to accomplish this and submit a pull request, go for it. I should also be able to work on this soon, and it shouldn't be too much of an effort.
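
The "convert to a tuple, then serialize the tuple" idea can be sketched without Glaze as follows. Everything here is an assumption made for illustration: `Header` is a made-up struct, `to_tuple_2` is a hand-written stand-in for a reflection-based to_tuple that only handles two-member aggregates, and the output is plain concatenated bytes rather than real BEVE (which adds its own array header).

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <type_traits>
#include <vector>

// Hypothetical struct, used only for illustration.
struct Header {
   std::uint16_t kind;
   std::uint32_t length;
};

// Stand-in for a reflection-based to_tuple: binds the two members of an
// aggregate and returns a tuple of references to them.
template <class T>
auto to_tuple_2(T& value)
{
   auto& [a, b] = value;
   return std::tie(a, b);
}

// Append the bytes of one arithmetic value, with no key or tag.
template <class T>
void append_bytes(std::vector<std::byte>& out, const T& v)
{
   static_assert(std::is_arithmetic_v<T>, "sketch only handles arithmetic members");
   const auto* p = reinterpret_cast<const std::byte*>(&v);
   out.insert(out.end(), p, p + sizeof(T));
}

// Serialize a tuple element by element, in order.
template <class... Ts>
void write_elements(std::vector<std::byte>& out, const std::tuple<Ts...>& t)
{
   std::apply([&](const auto&... e) { (append_bytes(out, e), ...); }, t);
}

// Struct path: view the struct as a tuple, then reuse the tuple writer.
template <class T>
std::vector<std::byte> write_struct_as_array(const T& value)
{
   std::vector<std::byte> out;
   write_elements(out, to_tuple_2(value));
   return out;
}
```

Because the struct path reuses the tuple writer, there is only one serialization routine to maintain, which is the code-duplication concern raised above.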

stephenberry commented 10 months ago

I'm working on this now, so I'll let you know if I can use some help, and I'll push a branch soon.

stephenberry commented 10 months ago

I'm trying to figure out what to name these functions for writing/reading untagged binary. read_binary_untagged? read_binary_flat?

stephenberry commented 10 months ago

I'll add the note that Glaze already has a glz::array, so using the structs_as_arrays option with glz::object is unnecessary, but I think I'll support it anyway. glz::array should compile a bit faster. But supporting this option for binary allows JSON to include keys while binary is written out without keys.

kalradivyanshu commented 10 months ago

Yeah, I understood like 40% of it, and that's being generous lol. I am fairly new to advanced C++. Right now I have written what you are writing for my work, but using https://github.com/veselink1/refl-cpp (we do a lot of network things and are bound by Ethernet's 1500-byte limit, so tags are a no-no). Excited to delete all that macro code and move to Glaze entirely!

Do let me know if you need any specific help.

> I'm trying to figure out what to name these functions for writing/reading untagged binary.

read_binary_untagged and write_binary_untagged are a lot better than flat, since flat in my mind means removing depth. I know that's how the binary data will be saved, but I think untagged is the better name.

stephenberry commented 10 months ago

I've merged in #671, which adds the option structs_as_arrays and adds the helper functions read_binary_untagged and write_binary_untagged. Unit tests have been included as well.

@kalradivyanshu, let me know if you run into any issues with this new feature.

kalradivyanshu commented 10 months ago

Woah! You move fast, will check it out!

kalradivyanshu commented 10 months ago

Hey @stephenberry, I added a few thoughts at https://github.com/stephenberry/glaze/issues/687, thanks!