SergioBenitez / Figment

A hierarchical configuration library so con-free, it's unreal.
Apache License 2.0
629 stars 38 forks source link

serde(flatten) breaks metadata propagation in errors #80

Closed zegelin closed 1 year ago

zegelin commented 1 year ago

When using Figment with nested structs, some of which have #[serde(flatten)] attached, any error that occurs while loading the config for the nested flattened keys wont have any context info attached.

In the following examples the test.toml config is invalid -- a string is specified where a u32 is expected. In the first example flatten isn't used. In the second, the intermediate enum is flattened on the root struct.

use figment::{Figment, providers::{Toml, Format}};
use serde::Deserialize;

fn main() {
    // example without flatten
    let err = figment::Jail::try_with(|jail| {
        #[derive(Clone, Deserialize, Debug)]
        struct TcpConfig {
            timeout: u32,
        }

        #[derive(Clone, Deserialize, Debug)]
        #[serde(rename_all = "lowercase")]
        enum PortConfig {
            Tcp(TcpConfig)
        }

        #[derive(Clone, Deserialize, Debug)]
        struct Config {
            port: PortConfig,
        }

        jail.create_file("test.toml", r#"
            port.tcp.timeout = "not a valid u32"
        "#)?;

        Figment::from(Toml::file("test.toml"))
            .extract::<Config>().map(|c| ())
    }).err().expect("should fail");

    println!("{err}");

    // example with flatten
    let err = figment::Jail::try_with(|jail| {
        #[derive(Clone, Deserialize, Debug)]
        struct TcpConfig {
            timeout: u32,
        }

        #[derive(Clone, Deserialize, Debug)]
        #[serde(rename_all = "lowercase")]
        enum PortConfig {
            Tcp(TcpConfig)
        }

        #[derive(Clone, Deserialize, Debug)]
        struct Config {
            #[serde(flatten)]
            port: PortConfig,
        }

        jail.create_file("test.toml", r#"
            tcp.timeout = "not a valid u32"
        "#)?;

        Figment::from(Toml::file("test.toml"))
            .extract::<Config>().map(|c| ())
    }).err().expect("should fail");

    println!("{err}");
}

When run the output is:

invalid type: found string "not a valid u32", expected u32 for key "default.port.tcp.timeout" in /tmp/.tmpzvTfAI/test.toml TOML file
invalid type: found string "not a valid u32", expected u32

The first error message is far more useful as it explains which key is invalid and which source file.

SergioBenitez commented 1 year ago

The way that flatten works in serde is rather troubling and results in lots of issues. In short, when #[flatten] is applied to any field of a struct T, serde does the following in T's derived deserialization:

  1. It collects every field that doesn't match a non-flattened field of the struct into what is effectively a hash map of effectively untyped values. Thus, deserialization for all internal values always succeeds.
  2. It passes that map into a new deserializer, which is then passed to the deserialize implementations for each flattened field type, which performs the true deserialization on that map.
  3. Because the internal keys and values have already "successfully deserialized", any errors from the true deserialization are propagated as if they occurred on the top-level value, i.e, the base value.

This means that figment sees that an error occurred for the entire structure, not any individual key/value pair, and so it has no idea what to assign the error to.

To concretize this, let's take a simplified version of your example:

#[derive(Clone, Deserialize, Debug)]
struct TcpConfig {
    timeout: u32,
}

#[derive(Clone, Deserialize, Debug)]
struct Config {
    #[serde(flatten)]
    port: TcpConfig,
}

For the TOML file timeout = "hi", we have a key of timeout assigned to a string value of "hi". Normally, because timeout is a u32, the deserializer would fail while deserializing the string "hi", giving us the error we want.

In the case of flattened structs, however, the internal "flattening" deserializer accepts everything. Thus the string "hi" deserializes just fine, and no error occurs on that value. Instead, deserialization fails later, after all values have deserialized properly and the only thing remaining is the overall structure, i.e, the complete figment. Here the Config deserializer raises the original error, but the figment deserializer has no idea what to apply it to.

This might be a bit confusing - I apologize. The details are rather gnarly. The short of it is: flattening breaks what it means to deserialize a structure by buffering values and lying that deserialization succeeded, depriving the true deserializer of the information about when errors occur. This means we can't emit a proper error during flattening.

Perhaps there's some work-around, but I'll have to think about it more.

SergioBenitez commented 1 year ago

See also https://github.com/serde-rs/serde/issues/2186#issue-1163413259.

zegelin commented 1 year ago

Thank you for such a detailed reply! I didn't think to look into the details of how flatten works on the serde side of things. Your explanation of whats happening under the hood and why that becomes problematic for figment to provide details makes perfect sense.

I was only using flatten try and "clean up" my config file by removing seemingly redundant keys (ie, have tcp or serial on the root rather than port.tcp or port.serial) and to abstract a few common attributes (such as timeout) across various port types (tcp, serial, etc) into a common struct. For the sake of anyone using my program I'll choose to forgo flatten in favor of better error messages!

Perhaps the figment docs could have a mention of this "gotcha" under the "tips" section?

SergioBenitez commented 1 year ago

Perhaps the figment docs could have a mention of this "gotcha" under the "tips" section?

Good idea!