toml-rs / toml

Rust TOML Parser
https://docs.rs/toml
Apache License 2.0
730 stars 107 forks source link

decor_mut allows injection of arbitrary data #267

Open epage opened 2 years ago

ordian commented 2 years ago

Addding a comment to a decor is valid for some cases, but not for keys. I don't think there is a way to solve this problem in general, other than make it clear to the user that this mutation is "unsafe".

epage commented 2 years ago

Comments are just the beginning. You could have a decor that is "\nkey = 5".

yyny commented 6 months ago

I would be willing to implement a draft PR for this as outlined in #728.

epage commented 6 months ago

I have some concerns about that proposal. Knowing what type of decor is applicable is dependent on what parent container you are in. This means we don't know when it should be an error and you can move things between locations and the Decor can switch from valid to invalid. It also adds a lot of ceremony to the API.

yyny commented 6 months ago

Knowing what type of decor is applicable is dependent on what parent container you are in.

Ah, I didn't think of that. This can be implemented with a private trait, though:

pub struct Spacing(RawString);
pub struct Newline(RawString);
pub struct Comment(RawString);

trait IntoInlineDecor: Into<RawString> {} // Private trait
trait IntoMultilineDecor: Into<RawString> {} // Private trait

impl IntoInlineDecor for Spacing {}
impl IntoMultilineDecor for Spacing {}
impl IntoMultilineDecor for Newline {}
impl IntoMultilineDecor for Comment {}

impl Value {
    fn append_prefix(impl IntoInlineDecor);
    // etc.
}

you can move things between locations and the Decor can switch from valid to invalid.

There would have to be a separate InlineDecor for decor that isn't allowed to span multiple lines, like for Keys and Values. There would then also have to be some way to add multi-line decor between the entries of (non-inline) tables and arrays, like by adding TableEntry/ArrayEntry types. Keys and Values cannot safely contain multi-line decor, since they can be moved into inline tables.

Alternatively, there could be separate InlineKey/InlineValue types, but I think being able to treat all tables the same is valuable.

It also adds a lot of ceremony to the API.

That is the price of safety.

epage commented 6 months ago

There would have to be a separate InlineDecor for decor that isn't allowed to span multiple lines, like for Keys and Values.

iirc keys can sometimes have multi-line prefixes, sometimes they can't. Same for values but for suffixes.

That is the price of safety.

Which "safety" definition :)

For ensuring users generate valid TOML files, there are other alternatives like stripping invalid decor or offering validating vs panicking functions. In a lot of cases, I expect the invalid decor is created as programming mistake rather than trying to handle arbitrary input.

yyny commented 6 months ago

iirc keys can sometimes have multi-line prefixes, sometimes they can't. Same for values but for suffixes.

They don't have to. Multi-line prefixes/suffixes for keys/values could (and should) be disallowed.

there are other alternatives like stripping invalid decor or offering validating vs panicking functions

There are indeed, and this is exactly what I (and I assume others) have been doing already, but it is clearly error-prone and could be better handled by toml_edit itself. My proposal is to do validating in toml_edit, with unsafe { ... } APIs to skip validation.

I expect the invalid decor is created as programming mistake

Same, but unsound API design is also a programming mistake.

epage commented 5 months ago

They don't have to. Multi-line prefixes/suffixes for keys/values could (and should) be disallowed.

To truly disallow this, we'd need to attach whitespace+comment lines to something else. This doesn't work well with our existing Map model.

My proposal is to do validating in toml_edit, with unsafe { ... } APIs to skip validation.

I've considered using unsafe in the past for general invariant violations and the impression I got is it was generally frowned upon.

Same, but unsound API design is also a programming mistake.

There are also many cases between these or else we wouldn't have assert, debug_assert, expect, and dev-time errors reported at runtime.