ssokolow / nodo

Pre-emptively created repository so the design can be discussed on the issue tracker before commits are made (repo name may change)
Apache License 2.0

Unnecessary '\0' byte check in src/types.rs:73 #2

Closed: NobodyXu closed this issue 2 years ago

NobodyXu commented 2 years ago

A Rust String does not include the '\0' byte.

Here's an example of trying to create a String, which silently removes the '\0' byte.

Also, on Linux '\0' is treated as the end-of-string marker, so it could only be present at the end of a string; it cannot be present in the middle of a str.

ssokolow commented 2 years ago

It's the {} in the println! that's eliding the \0. If you check it with {:?} you'll see "123\u{0}234".
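A minimal sketch to check this locally, using only the standard library. The helper name render is made up for illustration. As far as I can tell, {} passes the NUL through unchanged and the terminal simply draws nothing for it, while {:?} swaps it for a visible backslash escape. The exact escape text may differ between Rust versions, so the sketch only checks that an escape is present.

```rust
/// Hypothetical helper: returns the `{}` (Display) and `{:?}` (Debug)
/// renderings of a string so they can be compared.
fn render(s: &str) -> (String, String) {
    (format!("{}", s), format!("{:?}", s))
}

fn main() {
    let s = String::from("123\0234");
    // The NUL is stored like any other character: 3 digits + NUL + 3 digits.
    assert_eq!(s.len(), 7);
    assert_eq!(s.as_bytes()[3], 0);

    let (display, debug) = render(&s);
    // Display keeps the NUL in its output; it is just not visibly rendered.
    assert!(display.contains('\0'));
    // Debug replaces the NUL with a printable backslash escape.
    assert!(!debug.contains('\0'));
    assert!(debug.contains('\\'));

    println!("{}", display);
    println!("{}", debug);
}
```

Checking s.len() or s.as_bytes() is the more reliable way to confirm what a String holds, since it does not depend on how the terminal draws control characters.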

Rust's String and the JSON spec both being capable of representing \0 was actually a key detail in how I implemented code to round-trip Path and OsString in JSON without causing serde_json to error out on POSIX paths that contain non-UTF8-able bytes.

(TL;DR: I use \0 as an escape character. Since it's the one byte that JSON and String can represent, which will never occur in a valid Path or PathBuf, it allows me to ensure that UTF8-able paths will pass through the escaper without getting altered, meaning that it's backwards compatible with anything that could be serialized before.)
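To illustrate the idea, here is a rough sketch of such an escaper. It is not the code from src/types.rs: the function names escape and unescape and the "\0 plus two hex digits" encoding are assumptions chosen for the example. It works on raw bytes (on Unix these could come from std::os::unix::ffi::OsStrExt::as_bytes) and assumes the input contains no NUL of its own, which holds for POSIX paths.

```rust
/// Hypothetical escaper: valid UTF-8 passes through untouched; any other
/// byte becomes '\0' followed by two hex digits.
/// Assumes `bytes` contains no NUL byte of its own.
fn escape(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut rest = bytes;
    while !rest.is_empty() {
        match std::str::from_utf8(rest) {
            Ok(valid) => {
                out.push_str(valid);
                break;
            }
            Err(err) => {
                let (valid, invalid) = rest.split_at(err.valid_up_to());
                // The prefix was just validated, so this cannot fail.
                out.push_str(std::str::from_utf8(valid).unwrap());
                // Escape one offending byte, then re-validate the remainder.
                out.push_str(&format!("\0{:02x}", invalid[0]));
                rest = &invalid[1..];
            }
        }
    }
    out
}

/// Inverse of `escape`; returns `None` on a malformed escape sequence.
fn unescape(s: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(s.len());
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c == '\0' {
            // An escape: the next two characters are the byte's hex digits.
            let hi = chars.next()?.to_digit(16)?;
            let lo = chars.next()?.to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
        } else {
            let mut buf = [0u8; 4];
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        }
    }
    Some(out)
}

fn main() {
    // A UTF-8 path comes out unchanged, so previously serialized data stays valid.
    assert_eq!(escape("docs/naïve.txt".as_bytes()), "docs/naïve.txt");

    // A Latin-1 byte (0xE9) is not valid UTF-8 here, so it gets escaped.
    let raw = b"caf\xe9.txt";
    let escaped = escape(raw);
    assert_eq!(escaped, "caf\0e9.txt");
    assert_eq!(unescape(&escaped).as_deref(), Some(&raw[..]));
}
```

Because the escaped form is an ordinary String, it can be handed to a JSON serializer as-is, and the decoder only has to look for \0 to find the bytes that need restoring.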

In fact, in my tests with Rust, JavaScript in Firefox, JavaScript in Chrome, Node.js, Ruby, PHP, Python, and some other languages I'm forgetting (probably Lua and Perl?), PHP was the only one that didn't round-trip a null byte in a JSON string perfectly fine when parsing and serializing JSON. (PHP just blindly assumed it was the string terminator the way C would.)

NobodyXu commented 2 years ago

Thanks, I didn't know fmt::Display for String omits the \0.