g-s-k / matlab-toml

TOML implementation for MATLAB
MIT License

Nested structs are not properly encoded #3

Open egeerardyn opened 5 years ago

egeerardyn commented 5 years ago

When encoding a nested struct, the fields are not properly ordered, causing the semantics to be lost.

Minimal working example:

in = struct('a','a','b',struct('c_in_b','c_in_b'),'e','e');
nested_in = struct('in',in);

% source = in;  % one level of nesting goes fine
source = nested_in;  % the second level of nested structs will break things

encoded = toml.encode(source);
decoded = toml.decode(encoded);

assert(isequal(decoded, source), 'Encoding/Decoding should be inverse operations!');

The encoded form looks like this:

[in]
a = "a"

[in.b]
c_in_b = "c_in_b"
e = "e"

but it should have been something like:

[in]
a = "a"
e = "e"

[in.b]
c_in_b = "c_in_b"
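
To spell out why the first output loses the semantics: in TOML, a key/value pair that follows a table header belongs to that table, so `e = "e"` after `[in.b]` is read as `in.b.e` rather than `in.e`. The sketch below builds both structs by hand, without calling the library. The variable names `intended` and `misread` are illustrative only, and it assumes a spec-compliant reader.

```matlab
% What the example above is meant to encode: e is a sibling of b.
intended = struct('in', struct('a', 'a', ...
                               'b', struct('c_in_b', 'c_in_b'), ...
                               'e', 'e'));

% What a spec-compliant reader reconstructs from the faulty output:
% e has moved inside in.b because it follows the [in.b] header.
misread = struct('in', struct('a', 'a', ...
                              'b', struct('c_in_b', 'c_in_b', 'e', 'e')));

% isequal ignores field order, so this difference is structural.
assert(~isequal(intended, misread));
assert(isfield(misread.in.b, 'e') && ~isfield(misread.in, 'e'));
```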

ghost commented 5 years ago

This happens because, when you encode in directly rather than nested_in, the top-level function encode.m sees each of the fields and can therefore reorder them so that all of the substructs appear below the key/value pairs. This happens on and around Line 17.

When nested_in is passed to encode instead, it sees only the struct in, not the substructs inside in. There is only one element, so there is nothing to reorder. The struct is then passed to the parsing/encoding function repr.m, where, around Line 49, there is no reordering routine. Because repr.m is called recursively without any reordering, it reads the fields of in in the order they appear, which puts e = 'e' last and gives the output you see.

So the fix is to have a reordering routine in the struct switch/case portion of repr.m so that it can reorder and recursively parse the substructs.
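
A minimal sketch of what such a reordering step could look like, assuming scalar structs only. `scalars_first` is a hypothetical helper name, not a function in matlab-toml, and the actual change to repr.m may look different. It moves plain values ahead of sub-structs at every nesting level, using only the built-ins `fieldnames`, `cellfun`, and `orderfields`.

```matlab
function s = scalars_first(s)
% SCALARS_FIRST  Hypothetical helper (not part of matlab-toml).
%   Reorders the fields of a scalar struct so that non-struct values
%   come before struct values, and applies the same rule recursively.
names  = fieldnames(s);
is_sub = cellfun(@(n) isstruct(s.(n)), names);

% Recurse first, so nested structs are reordered as well.
for k = find(is_sub).'
    s.(names{k}) = scalars_first(s.(names{k}));
end

% Plain key/value fields first, sub-structs (TOML tables) last.
s = orderfields(s, [names(~is_sub); names(is_sub)]);
end
```

With the nested_in struct from the example above, `fieldnames(scalars_first(nested_in).in)` would list a, e, b in that order, so an encoder walking the fields in order would emit e before the `[in.b]` header.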

ghost commented 5 years ago

@egeerardyn It appears I have fixed the disordered encoding by moving the field sorting into the part of the routine that deals specifically with structs.

I now get your expected output for each of your inputs:

>> encoded = toml.encode(in)
encoded =
    'a = "a"
     e = "e"
     [b]
     c_in_b = "c_in_b"'
>> encoded = toml.encode(nested_in)
encoded =
    '[in]
     a = "a"
     e = "e"
     [in.b]
     c_in_b = "c_in_b"'

I'll submit this commit as a pull request shortly.

RoyiAvital commented 1 year ago

Did you commit the pull request?

g-s-k commented 12 months ago

@RoyiAvital No, that user did not submit a PR (and I no longer remember who it was; they deleted their account).