toml-rs / toml-rs

A TOML encoding/decoding library for Rust
https://docs.rs/toml
Apache License 2.0
1.02k stars 189 forks source link

`to_string` changes ordering #386

Closed kgv closed 2 years ago

kgv commented 4 years ago

Why to_string changes ordering? Is this expected behavior? (preserve_order feature enable)

    assert_eq!(
        Table(map! {
            "test0" => Table(map! {}),
            "test1" => Array(vec![Integer(2)])
        })
        .to_string(),
        "test1 = [2]\n\n\
         [test0]\n"
    );
    // this throws error:
    // assert_eq!(
    //     Table(map! {
    //         "test0" => Table(map! {}),
    //         "test1" => Array(vec![Integer(2)])
    //     })
    //     .to_string(),
    //     "[test0]\n\
    //      test1 = [2]\n"
    // );
dtolnay commented 4 years ago

I think this is the correct behavior.

[test0]
test1 = [2]

^ this means something different than the Table you wrote. In json notation that would be {"test0":{"test1":[2]}} instead of {"test0":{},"test1":[2]}.

kgv commented 4 years ago

Oh, wrong example

This is equal toml:

one:

[[test1]]
key = 2

[test0]
key = 1

two:

[test0]
key = 1

[[test1]]
key = 2

But it is reorder this anyway:

    assert_eq!(
        Table(map! {
            "test0" => Table(map! {
                "key" => Integer(1)
            }),
            "test1" => Array(vec![Table(map! {
                "key" => Integer(2)
            })])
        })
        .to_string(),
        "[[test1]]\n\
         key = 2\n\
         \n\
         [test0]\n\
         key = 1\n"
    );
ehuss commented 4 years ago

Hm, for some reason the Value serializer will serialize all array-of-tables before normal tables (src). I can't quite think of a reason for that, as I think it should be safe to interleave normal tables and array-of-tables.

alexcrichton commented 4 years ago

I believe the reason for this is the same as the first comment, arrays of values have to be serialized first and the serializer isn't discerning enough to distinguish between arrays of values and arrays of tables.

ehuss commented 4 years ago

It does appear like it is checking for the difference between an array and an array of tables, and does it in 3 stages (values first, array of tables, then normal tables). I think the last two stages can be combined, maybe something like this:

diff --git a/src/value.rs b/src/value.rs
index 38dfe1f..68beca3 100644
--- a/src/value.rs
+++ b/src/value.rs
@@ -211,6 +211,14 @@ impl Value {
         self.as_table().is_some()
     }

+    fn is_table_or_array_of_tables(&self) -> bool {
+        self.is_table()
+            || self
+                .as_array()
+                .map(|a| a.iter().any(|v| v.is_table()))
+                .unwrap_or(false)
+    }
+
     /// Tests whether this and another value have the same type.
     pub fn same_type(&self, other: &Value) -> bool {
         discriminant(self) == discriminant(other)
@@ -414,25 +422,12 @@ impl ser::Serialize for Value {
                 // Be sure to visit non-tables first (and also non
                 // array-of-tables) as all keys must be emitted first.
                 for (k, v) in t {
-                    if !v.is_table() && !v.is_array()
-                        || (v
-                            .as_array()
-                            .map(|a| !a.iter().any(|v| v.is_table()))
-                            .unwrap_or(false))
-                    {
-                        map.serialize_entry(k, v)?;
-                    }
-                }
-                for (k, v) in t {
-                    if v.as_array()
-                        .map(|a| a.iter().any(|v| v.is_table()))
-                        .unwrap_or(false)
-                    {
+                    if !v.is_table_or_array_of_tables() {
                         map.serialize_entry(k, v)?;
                     }
                 }
                 for (k, v) in t {
-                    if v.is_table() {
+                    if v.is_table_or_array_of_tables() {
                         map.serialize_entry(k, v)?;
                     }
                 }
hbina commented 4 years ago

According to the spec[1]

The last type that has not yet been expressed is an array of tables. These can be expressed by using a table name in double brackets. Under that, and until the next table or EOF are the key/values of that table. Each table with the same double bracketed name will be an element in the array of tables. The tables are inserted in the order encountered. A double bracketed table without any key/value pairs will be considered an empty table.

Meaning that tables should always be the very last thing that any parser should emit if it wants the operation to be reversible.

[1] https://toml.io/en/v1.0.0-rc.1#array-of-tables

epage commented 2 years ago

Maintenance of this crate has moved to the https://github.com/toml-rs/toml repo. As a heads up, we plan to move toml to be on top of toml_edit, see https://github.com/toml-rs/toml/issues/340.

Closing this out. If this is still a problem, feel free to recreate this issue in the new repo.