KenKundert / nestedtext

Human readable and writable data interchange format
https://nestedtext.org
MIT License

Handling of tabs after unquoted keys #16

Closed george-hopkins closed 3 years ago

george-hopkins commented 3 years ago

While implementing a parser in Rust, I discovered that nt.loads(nt.dumps({'a\t': 'b'})) returns {'a': 'b'}. dumps() does not quote the key (which is correct, as far as I understand the specification), but the tab character (→) is stripped when a→: b is loaded again. Is this a bug in the implementation, or should tabs after unquoted keys be removed?

KenKundert commented 3 years ago

It is not a bug. All white space after a key is ignored to allow values to be lined up without affecting the keys. For example:

key 1: value 1
k2   : value 2

becomes {'key 1': 'value 1', 'k2': 'value 2'}. Notice there are no space characters in k2. The same would be true if a tab were used rather than spaces.

There is a bit of asymmetry between keys and values. For example, keys sometimes require quoting but values never do. Also, trailing white space is retained in values but not in keys. I don't know how to eliminate that asymmetry, so the current behavior is my best guess at what will result in the most natural specifications.
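To make the rule concrete, here is a minimal sketch in plain Python of how a single unquoted `key: value` line could be split. It is not the nestedtext implementation: `split_item` and `load_items` are hypothetical helpers, and the sketch assumes flat, unquoted, single-line items with a `": "` separator (no indentation, nesting, quoting, or empty values).

```python
def split_item(line):
    """Split one unquoted 'key: value' line.

    Hypothetical helper for illustration only; not part of the nestedtext API.
    """
    key, sep, value = line.partition(": ")
    if not sep:
        raise ValueError(f"no ': ' separator in {line!r}")
    # Spaces and tabs between the key and the colon are dropped so that
    # values can be lined up. The value is kept exactly as written,
    # including any trailing white space.
    return key.rstrip(" \t"), value


def load_items(text):
    """Build a dict from flat 'key: value' lines, skipping blank lines."""
    return dict(split_item(line) for line in text.splitlines() if line.strip())


aligned = "key 1: value 1\nk2   : value 2\n"
print(load_items(aligned))
print(split_item("a\t: b"))    # the tab after the key is stripped
print(split_item("a: b \t"))   # trailing white space in the value survives
```

Under these assumptions, a key that ends in a space or tab cannot round-trip unless it is written in some quoted form, which is the asymmetry described above.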

I noticed that the specification of the NestedText format says that trailing spaces after keys are ignored, but does not mention tab characters. I have changed it to include tabs.

george-hopkins commented 3 years ago

Thank you for the clarification!