Closed Aegonek closed 2 years ago
Serde deserializer by default forwards visit_u8/16/32
to visit_u64
. So weekday's (and month's) visitors visit_u8
never gets called. One workaround could be changing visitor type to u64, but I'm wondering if there is more elegant way to override default forward, since we still serialized u8 value.
Here you can lookup what Deserialize and Visitor traits are about: https://serde.rs/impl-deserialize.html
Hey, I had the issue figured out last week but no time to write. Using the example of Weekday to explain, but the implementation is faulty for other types too. Example code at the end. Marked interesting places with ^1 and ^2.
Issues:
deserialize_u8
is only a hint about a type for a deserializer. Deserializer is free to ignore the hint and deserialize value as different type. JSON deserializer specifically decodes all numbers as i64. Proposed solution: none, none-issue, just don't rely on this behavior.
visit_u8
is only invoked if deserializer handles u8 specifically, which is kinda error-prone. Default behaviour for visit_num
methods is:
Proposed solution: remove visit_u8
implementation, add visit_i64
and visit_u64
implementation instead for all types that implement Deserialize.
serde_test
only checks if we can get, for example, Monday
for Token::U8(1)
. This fails because we may not necessarily have Token::U8
after deserializing for reasons above.Proposed solution: add a test for every type that serializes and deserializes value from JSON string. Something like:
#[test]
fn weekday_json() -> Result<(), Box<dyn error::Error>> {
use Weekday::*;
let mut buf: Vec<u8> = Vec::new();
let mut ser = serde_json::Serializer::new(&mut buf);
let ser = (&mut ser).compact();
Monday.serialize(ser)?;
assert_eq!(String::from_utf8(buf)?, "1");
let mut de = serde_json::Deserializer::from_str("1");
let de = (&mut de).compact();
let monday = Weekday::deserialize(de)?;
assert_eq!(monday, Monday);
Ok(())
}
I could probably generate tests like this by macro for less typing.
serde
format we can deserialize value from string and not from number. After fixing the issue with deserialization we will be able to deserialize value from both string and a number regardless of chosen serde
format. I think the expected behaviour is that we can, for example, deserialize Monday only from 1
for features = ["serde"]
and only from "Monday"
for features = ["serde-well-known"]
? This will, however break user code if someone was relying on deserializing Weekday
from string when using features = ["serde"]
. I think that it's quite easy error to make.Proposed solution: allow deserializing from both formats on both configurations. I don't think it hurts anyone.
@jhpratt - let me know what you think about it, I think I'll do some other stuff when waiting for answer.
impl<'a> Deserialize<'a> for Weekday {
fn deserialize<D: Deserializer<'a>>(deserializer: D) -> Result<Self, D::Error> {
if cfg!(feature = "serde-human-readable") && deserializer.is_human_readable() {
deserializer.deserialize_any(Visitor::<Self>(PhantomData))
} else {
// ^1
deserializer.deserialize_u8(Visitor::<Self>(PhantomData))
}
}
impl<'a> de::Visitor<'a> for Visitor<Weekday> {
type Value = Weekday;
fn expecting(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
formatter.write_str("a `Weekday`")
}
fn visit_str<E: de::Error>(self, value: &str) -> Result<Weekday, E> {
match value {
"Monday" => Ok(Weekday::Monday),
"Tuesday" => Ok(Weekday::Tuesday),
"Wednesday" => Ok(Weekday::Wednesday),
"Thursday" => Ok(Weekday::Thursday),
"Friday" => Ok(Weekday::Friday),
"Saturday" => Ok(Weekday::Saturday),
"Sunday" => Ok(Weekday::Sunday),
_ => Err(E::invalid_value(de::Unexpected::Str(value), &"a `Weekday`")),
}
}
// ^2
fn visit_u8<E: de::Error>(self, value: u8) -> Result<Weekday, E> {
match value {
1 => Ok(Weekday::Monday),
2 => Ok(Weekday::Tuesday),
3 => Ok(Weekday::Wednesday),
4 => Ok(Weekday::Thursday),
5 => Ok(Weekday::Friday),
6 => Ok(Weekday::Saturday),
7 => Ok(Weekday::Sunday),
_ => Err(E::invalid_value(
de::Unexpected::Unsigned(value.into()),
&"a value in the range 1..=7",
)),
}
}
}
Proposed solution: remove
visit_u8
implementation, addvisit_i64
andvisit_u64
implementation instead for all types that implementDeserialize
.
:+1:
add a test for every type that serializes and deserializes value from JSON string
:+1: — definitely use a macro to make things simpler.
After fixing the issue with deserialization we will be able to deserialize value from both string and a number regardless of chosen
serde
format
If you can get this to work, that would be wonderful. I've tried this before for various types to no avail.
I opened a pull request for my changes: https://github.com/time-rs/time/pull/506/files.
I think all we need to be able to deserialize value regardless of chosen serde
format is to use deserialize_any
method regardless of feature flag - but it would probably have some performance penalty. For now, you can deserialize value from any supported format on serde-well-known
and from compact format only on serde
- I think that's reasonable.
I think all we need to be able to deserialize value regardless of chosen serde format is to use deserialize_any method regardless of feature flag
This only works for self-describing formats.
Trying to deserialize a Weekday from a number (without enabling
serde-human-readable
feature) returns an errorError("invalid type: integer '1', expected a 'Weekday'", ...)
I've created a minimal binary crate for testing, using stable channel. Dependency section:
Code to reproduce the issue:
I did some debugging and it seems that this code invokes
deserializer.deserialize_u8(Visitor::<Self>(PhantomData))
as expected, but deserialization still fails. I don't know Serde internals too well, so I don't know why. I could look for a solution later this week. You can assign me to this issue, please.