Closed CardboardTurkey closed 5 months ago
I have written a regression test and can confirm this bug. I shall now investigate.
Okay, this comes down to the fact that when the derive for the fields on inner are handled, the generated deserializer uses an internal implementation detail of serde called Content
which is also used for untagged enums and some other things. This approach guarantees that the deserializer in marked_yaml isn't given the type hints it needs to work out what's going on since the default for marked_yaml is not not ascribe any semantics to the values of scalars (they are always initially strings). This kind of thing also guarantees Spanned<T>
won't work inside flattened structs as well by the looks of things.
I'm not sure that there's a way around this; but I shall investigate more.
This is definitely wound up in the magical internals of serde - sadly it looks like flatten and untagged enums (I previously only knew about the latter) are entirely reliant on the data format being fully self-describing which marked-yaml is, but only in the sense of every value is a string.
I'd need to put some thought into adding opportunistic deserialisation. Looking at how the number types are handled for deserialisation inside serde, I think that if we could have opportunistic deserialisation (ie. when hitting _any for a scalar, try it as a boolean or a number first) then perhaps we could get further; though you'd still not be able to use Spanned<T>
inside flattened or untagged things.
Would that be sufficient? If so, I can try and add such a capability.
No unfortunately my particular use case requires flattened spanned fields
No unfortunately my particular use case requires flattened spanned fields
I'll have a think - it might be that your best bet is to integrate the relevant fields into the Outer struct and then write a trait to deal with multiple structs containing the same "flattened" fields
Right, basically the answer is "nope, not going to happen, sorry" - this is a fundamental disconnect/gap in the serde data model which probably would require a fancier code generation than standard derive proc macros in order to resolve.
I don't see this as possible within the current constraints. As such I reiterate - manually flatten the fields and add a trait if you need it. Sorry.
For reference: manual flattening with serde_yaml::Value
as intermediate storage instead of serde::private::de::content::Content
(since marked_yaml::Node
won’t deserialize) causes the same issue. I’ll need to investigate this issue some more to see whether it can be worked around.
@palant If you can come up with a way to do so then that'd be amazing; unfortunately I can't really find a way to annotate either of the intermediate formats with the right amount of location data to "maybe" be consumed. It'd be easier if a deserializer could specify what intermediate format to use; then we could write support to just use the nodes as the intermediate format.
@kinnison I think I see the issue. This library generally treats scalars as strings. There are methods like as_u32
but that’s merely syntactic sugar for “take that string and try to parse it as an integer.”
Serde on the other hand expects data types to be represented differently. So when serde-yaml encounters 1234
then that’s a number, and a number cannot be a string. If you want a string then you have to write "1234"
.
When Serde stores data in Content
it assumes that each value has only one type. This is inherently incompatible with how marked-yaml treats strings, numbers and booleans as interchangeable. Properly supporting the Serde interface is likely impossible without major changes to type handling in this library.
Yeah, it'd be easy if we could convince serde to treat the "intermediate" format as Node rather than one of the serde data format; but it is what it is. Oh well. Part of the point of marked-yaml is that it does NOT try and guess what type a value should be. Thanks for looking into this.
Suppose I have a struct like:
I should be able to deserialize this yaml into
Outer
:However, if I try to do so with
marked_yaml::from_yaml
I get the error:It seems that no matter what type I make
Inner::num
,marked_yaml
will always cast thenum
value to a string before complaining about type mismatch.This seems to be an issue with the flattening because
can (of course) deserialize the yaml fine.