Closed sammhicks closed 3 months ago
Pending a doc update here, this all looks good to me! Sorry for the sporadic reviews, I've been quite busy recently.
Any chance you could publish a release so that I can integrate JSON deserialization into picoserve
?
Indeed, I'll publish a release as soon as https://github.com/rust-embedded-community/serde-json-core/pull/91 gets reviewed. I also realized that this only handles de-serializing escaped strings, but we still don't have any support for serializing them with escaping. I'll add an issue for this. Thanks for the work on this!
I'm unsure about the rationale behind this (very big) change. Why is this approach preferrable over the one in serde_json
, where it supports zero-copy no-escape deserialization of &str
and escape-capable deserialization of owned String
? Why not let a target heapless::String
be the "provided serialization buffer". It certainly can't be better in terms of lifetimes as they need to unify anyway.
Why is this approach preferrable over the one in
serde_json
, where it supports zero-copy no-escape deserialization of&str
This can be achieved using EscapedStr
, or if the source has no escape sequences. This is similar to serde_json
, which only allows borrowing from the input of there are no escape sequences.
escape-capable deserialization of owned
String
?
Deserializing a struct with a alloc::string::String
or heapless::String
will involve their visitors being called with visit_str
if the string is escaped or visit_borrowed_str
, if it's not, which will cause the String to take ownership.
Why not let a target
heapless::String
be the "provided serialization buffer"
This would involve Deserializer
having an additional generic parameter, which would be a breaking change.
This can be achieved using EscapedStr, or if the source has no escape sequences. This is similar to serde_json, which only allows borrowing from the input of there are no escape sequences.
Why is there a need for deviation from the serde_json
behavior?
Isn't this case (EscapedStr
) where you absolutely need a &str
(as opposed to a heapless/alloc String<N>
) and need to be escape-capable a bit exotic and much beyond what serde_json
supports? (1) You end up having to maintain the scratch buffer along with the value. (2) The amount of copying/scanning looks to be the same as with a transient scratch buffer (3) You need to reason about the buffer size if there are multiple escaped strings borrowing from it.
if it's not, which will cause the String to take ownership.
Sure. That's inherent. String
(heapless
or std
) always ends up owning its data. They only implement visit_str
by copying and don's special-case visit_borrowed_str
.
This would involve Deserializer having an additional generic parameter, which would be a breaking change.
That's not what I mean. I'm proposing (1) making your change look a bit more like the scratch
Vec
in serde_json
(just support transient de-escaping: same purpose, same functionality), (2) fixing #74 by behaving like serde_json
, having deserialize_str
scan for escape sequences first and then dispatch to visit_borrowed_str
if it can be borrowed, else try un-escape using the scratch buffer and call visit_str
, else error out the same way serde_json
does.
I agree that a scratch buffer is needed (and it should be optional and borrowed, not owned, just as you implement it). I just don't get the other deviations from the serde_json
behavior. And I don't get EscapedStr
.
serde_json
is designed for systems where it's reasonable to require applications to allocate heap memory to use functionality, serde_json_core
is designed for systems where, for example, there may be enough memory to store an incoming payload, but not enough to also store buffers for storing strings.
Suppose that a user wishes to deserialize a structure containing a list of strings, some of which may include escape sequences when JSON encoded. Without EscapedStr
, the user must use something along the lines of heapless::Vec<heapless::String<32>,16>
, most of which is unused space, unless the list happens to be of length almost equal to 16, and all strings happen to be of length almost equal to 32. This structure must also exist at the same time as the buffer which the JSON encoded text was stored, and thus this code is not particularly space-efficient. If however, the data type becomes heapless::Vec<EscapedStr>
, and as EscapedStr
is a lot smaller than heapless::String<32>
, the code is much more space-efficient.
Right. This is an allocator!
The real-world cases where the maximum sum of lengths is known to be significantly less than the sum of maximum lengths such that this makes a difference are exceedingly exotic.
It doesn't justify the practical disadvantages and unergonomics of having to keep the allocator arena alive and having to reason about its size.
Here's a partial implementation of the design suggested in https://github.com/rust-embedded-community/serde-json-core/issues/79#issuecomment-2053622260_
An example usage of the new design can be seen at https://github.com/sammhicks/picoserve/tree/json-parse-demo