serde-rs / json

Strongly typed JSON library for Rust

Internally tagged enums duplicate tag when struct also has it assigned #1147

Open kossnocorp opened 4 months ago

kossnocorp commented 4 months ago

Update: I've found a StackOverflow answer that helped me come up with a solution. While it works, I think my proposal is still relevant. It would be great to automatically detect that structs are tagged, or at least to update the documentation. I'm willing to do either. Cheers!


If an enum and the structs it wraps all have tags assigned, serde_json ends up serializing the tag twice:

{"type":"post","type":"post","text":"post text"}

Here's the code:

extern crate serde;
extern crate serde_json;

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(tag = "type")]
enum Content {
    #[serde(rename = "post")]
    Post(Post),
    #[serde(rename = "comment")]
    Comment(Comment)
}

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(tag = "type", rename = "post")]
struct Post {
    text: String
}

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(tag = "type", rename = "comment")]
struct Comment {
    text: String
}

fn main() {
    let post = Post { text: "post text".to_owned() };
    println!("Serialized post:             {}", serde_json::to_string(&post).unwrap());

    let post = Content::Post(post);
    println!("Serialized post via enum:    {}", serde_json::to_string(&post).unwrap());

    let post_str = r#"{"type": "post", "text": "post text"}"#;
    let post : Content = serde_json::from_str(post_str).expect("Failed to parse post");
    println!("Parsed post:                 {:?}", post);

    println!("---");

    let comment = Comment { text: "comment text".to_owned() };
    println!("Serialized comment:          {}", serde_json::to_string(&comment).unwrap());

    let comment = Content::Comment(comment);
    println!("Serialized comment via enum: {}", serde_json::to_string(&comment).unwrap());

    let comment_str = r#"{"type": "comment", "text": "comment text"}"#;
    let comment : Content = serde_json::from_str(comment_str).expect("Failed to parse comment");
    println!("Parsed comment:              {:?}", comment);
}

Output:

Serialized post:             {"type":"post","text":"post text"}
Serialized post via enum:    {"type":"post","type":"post","text":"post text"}
Parsed post:                 Post(Post { text: "post text" })
---
Serialized comment:          {"type":"comment","text":"comment text"}
Serialized comment via enum: {"type":"comment","type":"comment","text":"comment text"}
Parsed comment:              Comment(Comment { text: "comment text" })

Playground link

I expected serde_json (or serde?) to recognize that the structs already have tags assigned and to use them, allowing code like this:

#[derive(Serialize, Deserialize, Debug, Clone)]
enum Content {
    Post(Post),
    Comment(Comment)
}

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(tag = "type", rename = "post")]
struct Post {
    text: String
}

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(tag = "type", rename = "comment")]
struct Comment {
    text: String
}

However, it fails with:

thread 'main' panicked at src/main.rs:32:57:
Failed to parse post: Error("unknown variant `type`, expected `Post` or `Comment`", line: 1, column: 7)

I need to assign tags to the structs because each one can be parsed or serialized either as a standalone structure or as part of an enum, so I must ensure type is always present. This is what I get when I don't add the tags:

Serialized post:             {"text":"post text"}
Serialized post via enum:    {"type":"post","text":"post text"}
Parsed post:                 Post(Post { text: "post text" })
---
Serialized comment:          {"text":"comment text"}
Serialized comment via enum: {"type":"comment","text":"comment text"}
Parsed comment:              Comment(Comment { text: "comment text" })

You can see that the "Serialized post" and "Serialized comment" lines lack the type field.

Is there a way to work around it?

Would a contribution with a fix be welcome?

I haven't looked at the code, so maybe it's impossible, but I think using the structs' tags when no tag is assigned to the enum would be a nice addition.

Alternatively, deduplicating the type field could be a solution, albeit a less principled one.
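To make the deduplication idea concrete, here is a minimal sketch using only the standard library. It is not how serde or serde_json work internally; it assumes the fields are available as an ordered list of key/value string pairs, and the names dedup_fields and render_object are hypothetical helpers made up for illustration.

```rust
use std::collections::HashSet;

/// Keeps the first occurrence of every key and drops later repeats,
/// preserving the original field order. (Hypothetical helper, not a serde API.)
fn dedup_fields(fields: Vec<(String, String)>) -> Vec<(String, String)> {
    let mut seen = HashSet::new();
    fields
        .into_iter()
        // `insert` returns false when the key was already present.
        .filter(|(key, _)| seen.insert(key.clone()))
        .collect()
}

/// Renders the fields as a flat JSON object. Values are assumed to be plain
/// strings that need no escaping (illustration only).
fn render_object(fields: &[(String, String)]) -> String {
    let body: Vec<String> = fields
        .iter()
        .map(|(key, value)| format!("\"{}\":\"{}\"", key, value))
        .collect();
    format!("{{{}}}", body.join(","))
}
```

Given the fields type=post, type=post, text=post text, in that order, dedup_fields followed by render_object yields {"type":"post","text":"post text"}. The sketch also shows why this approach is questionable: if the two type values ever differed, the second would be dropped silently instead of being reported as a conflict.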

adamchalmers commented 3 weeks ago

Here's another playground showing the issue, with comments explaining it.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1175677900a32540ade949251a701b31

Note that serde_json serializes JSON that it cannot deserialize. In my opinion, "to_string and from_str should always be compatible" is an important property of any derived Serialize/Deserialize impl.
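The round-trip property can be written down as a small generic check. The sketch below uses only the standard library and takes the serializer and deserializer as closures; check_round_trip is a hypothetical helper name, not part of serde or serde_json.

```rust
use std::fmt::Debug;

/// Serializes `value`, parses the text back, and reports whether the parsed
/// value equals the original. (Hypothetical helper for illustration.)
fn check_round_trip<T, E>(
    value: &T,
    serialize: impl Fn(&T) -> String,
    deserialize: impl Fn(&str) -> Result<T, E>,
) -> Result<(), String>
where
    T: PartialEq + Debug,
    E: Debug,
{
    let text = serialize(value);
    match deserialize(&text) {
        // The parsed value matches the original: the round trip holds.
        Ok(parsed) if parsed == *value => Ok(()),
        // The text parsed, but into a different value.
        Ok(parsed) => Err(format!("{:?} came back as {:?}", value, parsed)),
        // The serializer produced text that the deserializer rejects.
        Err(err) => Err(format!("{} does not parse: {:?}", text, err)),
    }
}
```

With serde_json one would presumably pass closures wrapping serde_json::to_string and serde_json::from_str, and the types under test would additionally need to derive PartialEq, which the structs in this issue do not. Going by the behavior reported above, the doubly tagged Content values would be expected to end up in the last branch.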