rust-lang / mdBook

Create book from markdown files. Like Gitbook but implemented in Rust
https://rust-lang.github.io/mdBook/
Mozilla Public License 2.0
17.3k stars 1.59k forks source link

Warn on unknown configuration settings #1595

Open ehuss opened 3 years ago

ehuss commented 3 years ago

When there are unknown settings in book.toml, mdbook just ignores them:

[book]
title = "foo"
abc = 123

This has occasionally caused problems and confusion for people when they make typos or mistakes. I think it would be good for unknown options to issue a warning. serde_ignored provides a facility for catching these unknown fields.

Some care will need to be taken to consider things like plugins which can add their own sections to the config.

joshrotenberg commented 3 years ago

I'm looking into this a bit.

serde_ignored works great when the top level struct (and any children) just derive Deserialize, but in our case Config has a custom deserialize implementation to, if I'm reading it correctly, handle the legacy book.toml format. It then goes on to pull the book, build and rust sections out and explicitly deserialize them with try_into and then finally builds the Config and tacks on the rest for anything left over.

Here are some options:

  1. If handling the legacy format is the only reason for the custom deserialization, a hard deprecation on that format would otherwise simplify the Config deserialization quite a bit. I haven't been around long enough to know what the ramifications of this would be. Looking briefly at the blame for config.rs, it looks like some of the legacy handling code was added 4 years ago.

  2. Pull the legacy format detection out of the deserialize somehow, try it first in from_str and if_not_legacy or whatever, do normal derive(Deserialize).

  3. Implement the serde_ignored method for the three known configs within Confg's deserialize call individually.

  4. Make unknown fields an error in the three built in configs by adding [serde(deny_unknown_fields)] to them and skip serde_ignored altogether.

Some care will need to be taken to consider things like plugins which can add their own sections to the config.

This makes me think the last option (also the easiest) makes sense since we have to have a rest field in this case anyway, and we have no way of validating other plugin's config aside from highly suggesting the implementor follow a similar pattern.

I'm happy to code up any (or all, for comparison) of the above scenarios if that would help.

ehuss commented 3 years ago

I think removing the legacy support would be fine, though that would need to be deferred to 0.5.

I don't think we can change them to hard errors as that would be a breaking change, so I think that would need to be deferred to 0.5, too. I'm not sure if it makes sense to keep them as warnings or not long term, though.

I haven't looked closely, so I'm not sure which would be the best approach. One option is in Config::from_str to deserialize to a toml::Value, and then use serde_ignored to translate the 3 fields (instead of putting that logic in the deserialize impl). That way it can somehow capture the warnings. Another option is to add a warnings field to Config, and then just add the warnings in the Deserialize impl. Whichever is easier, and doesn't break backwards compatibility seems fine to me.

joshrotenberg commented 3 years ago

Ok, spent more time with this and I'm pretty sure my 1. and 2. options are invalid because I wasn't accounting for how we do rest. Doing that without the help of the custom deserialize seems harder.

I think my 3. and your first suggestion are more on the right track, but I'm still trying to work out the best way to do it. We could:

  1. Move the legacy check out of there and first in Config::from_str, then
  2. Build book, build and rust individually with serde_ignore and report any unknown fields with a warn! for each but, that does mean passing in the full str of the config for each one and then checking the unknown fields only for that specific config (maybe not a huge deal but probably less efficient), then
  3. Let the deserialize just handle the rest case, and finally
  4. Build the Config with all the pieces and return.

I think this works. It should handle the legacy format the same way, only report unknown fields for the three built in sections, and still tack on the rest for anything downstream. I'm going to work it up and see how it looks.

KFearsoff commented 7 months ago

Any updates on this? I've stumbled upon the code for config file and it's... A bit hard to modify. Would definitely appreciate removal of manual derive implementation, and including some more well-known config blocks into the top-level Config struct would definitely help as well.