Closed DUOLabs333 closed 4 months ago
If you're writing your strings in C/C++ or almost any text editor, then you probably have valid UTF-8, unless you are adding invisible control characters.
JSON strictly requires UTF-8, so Glaze will reject illegal strings, such as strings that contain null or other control characters in the middle of them. These can be written as escaped unicode \u, but this is typically dangerous and error-prone in C++ and other languages because it can result in hidden null characters in types like std::string and will break a lot of C string algorithms like strnlen.
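To illustrate the hidden-null problem described above, here is a minimal sketch using only the standard library. The helper names c_visible_length and make_string_with_hidden_null are made up for this example, and std::strlen stands in for strnlen (which is POSIX rather than standard C++); none of this is Glaze code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by C string functions: the scan stops at the first null byte.
std::size_t c_visible_length(const std::string& s) {
    const std::size_t n = std::strlen(s.c_str());
    assert(n <= s.size());  // a C-style scan can never see more than the string holds
    return n;
}

// Hypothetical helper: builds a std::string with a null byte in the middle,
// which is what decoding an escaped \u0000 inside a JSON string would produce.
std::string make_string_with_hidden_null() {
    // The (pointer, length) constructor keeps the bytes after the embedded null.
    return std::string("abc\0def", 7);
}
```

The std::string reports all 7 bytes through size(), while any C-style scan of the same buffer stops after "abc", so the two views of the string silently disagree.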
We are planning to add a compile-time option to automatically unicode-escape invalid UTF-8. The open issue is #812. However, this is not recommended for general use.
What is your use case for non UTF-8 strings? Are you expecting invisible control characters in your strings?
In summary, to preserve performance, Glaze does not unicode-escape invalid UTF-8 when writing, but it does ensure that such strings, once written, will trigger a read error in any conforming JSON parser. If any JSON library is able to parse what you are writing, then you know you're good to go.
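If you would rather check a string yourself before writing it, something like the following could work. This is a rough sketch and not how Glaze validates anything internally: is_valid_utf8 and has_raw_control_char are hypothetical helpers written for this example. The control-character check reflects the JSON rule (RFC 8259) that characters below U+0020 must be escaped inside strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true if `s` is well-formed UTF-8. Rejects stray or missing
// continuation bytes, overlong encodings, surrogates, and values above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const std::uint8_t b0 = static_cast<std::uint8_t>(s[i]);
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;              // stray continuation or invalid lead byte
        if (i + len > n) return false;  // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const std::uint8_t b = static_cast<std::uint8_t>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;        // overlong 2-byte form
        if (len == 3 && cp < 0x800) return false;       // overlong 3-byte form
        if (len == 4 && cp < 0x10000) return false;     // overlong 4-byte form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                // beyond Unicode range
        i += len;
    }
    return true;
}

// Returns true if `s` contains a raw byte below 0x20 (including null),
// which JSON requires to be escaped inside a string.
bool has_raw_control_char(const std::string& s) {
    for (unsigned char c : s) {
        if (c < 0x20) return true;
    }
    return false;
}
```

A string that passes is_valid_utf8 and fails has_raw_control_char should be plain text; anything else is a hint that the data is really binary.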
I'm writing a Vulkan driver in C++, and some commands/structs allow using a void pointer to hold arbitrary data. Since I'm sending the data over a network, I need to be able to serialize it.
However, now that I think about it, I should probably use std::vector.
Absolutely, arbitrary data like this is best in a std::vector<uint8_t> or std::vector<std::byte>.
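As a rough sketch of that approach, the data behind a void pointer can be copied into an owning byte vector before it is serialized. The names to_bytes, from_bytes, and ExamplePayload are made up for this example and are not part of Glaze or Vulkan; the sketch also assumes the payload is trivially copyable and that the size is known from context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Copy `size` bytes from an opaque pointer into an owning byte buffer.
std::vector<std::uint8_t> to_bytes(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);  // a null pointer is only valid for empty data
    std::vector<std::uint8_t> out(size);
    if (size != 0) {
        std::memcpy(out.data(), data, size);
    }
    return out;
}

// Copy a byte buffer back into a trivially copyable object.
// Returns false if the buffer size does not match the object size.
template <class T>
bool from_bytes(const std::vector<std::uint8_t>& bytes, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be rebuilt from raw bytes");
    if (bytes.size() != sizeof(T)) return false;
    std::memcpy(&out, bytes.data(), sizeof(T));
    return true;
}

// Hypothetical payload standing in for whatever the void pointer refers to.
struct ExamplePayload {
    std::uint32_t id;
    float value;
};
```

Unlike a std::string, the vector makes no claim that its contents are text, so null bytes and invalid UTF-8 are not a problem. Keep in mind that raw bytes only round-trip correctly if both ends of the connection agree on layout and endianness, and that any pointers inside the payload are meaningless on the other side.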
I'll note that the same goes if you use the binary format BEVE with Glaze.
If I have a string made up of raw chars, with no effort made to escape them, will Glaze be able to serialize/parse them?