kislyuk / yq

Command-line YAML, XML, TOML processor - jq wrapper for YAML/XML/TOML documents
https://kislyuk.github.io/yq/
Apache License 2.0

Add indentation & improve sorting of TOML output #164

Closed oleole39 closed 1 year ago

oleole39 commented 1 year ago

Hello,

Thank you for sharing and maintaining this tool.

Would you consider adding an option (either as the default behavior or behind a parameter) to indent TOML output, similar to https://github.com/kislyuk/yq/issues/52, and to improve the way TOML keys are sorted?

Currently, with yq/tomlq 3.1.1, if my TOML file is as follows:

[cat1]
prop1 = false

    [cat1.sucat]
    prop1 = false

        [cat1.subcat.subsubcat]
        prop1 = true
        prop2 = "hello"

[cat2]
prop1 = false

Running cat my-example.toml | tomlq -t outputs:

[cat1]
prop1 = false

[cat2]
prop1 = false

[cat1.sucat]
prop1 = false

[cat1.subcat.subsubcat]
prop1 = true
prop2 = "hello"

tomlq sorts the tables differently and no longer keeps keys and their subkeys next to each other, which seems counterintuitive to me. I also find the lack of indentation less readable, and it does not match the "official" formatting used in the example in the README of the TOML language repository:

# This is a TOML document.

title = "TOML Example"

[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00 # First class dates

[database]
server = "192.168.1.1"
ports = [ 8000, 8001, 8002 ]
connection_max = 5000
enabled = true

[servers]

  # Indentation (tabs and/or spaces) is allowed but not required
  [servers.alpha]
  ip = "10.0.0.1"
  dc = "eqdc10"

  [servers.beta]
  ip = "10.0.0.2"
  dc = "eqdc10"

[clients]
data = [ ["gamma", "delta"], [1, 2] ]

# Line breaks are OK when inside arrays
hosts = [
  "alpha",
  "omega"
]

Running cat official-example.toml | tomlq -t outputs:

title = "TOML Example"

[owner]
name = "Tom Preston-Werner"
dob = "1979-05-27T07:32:00-08:00"

[database]
server = "192.168.1.1"
ports = [ 8000, 8001, 8002,]
connection_max = 5000
enabled = true

[clients]
data = [ [ "gamma", "delta",], [ 1, 2,],]
hosts = [ "alpha", "omega",]

[servers.alpha]
ip = "10.0.0.1"
dc = "eqdc10"

[servers.beta]
ip = "10.0.0.2"
dc = "eqdc10"

We can see that tomlq deleted the comments (maybe this would be more complicated to avoid?) and removed the [servers] table header, probably because it treated it as an empty key and disregarded its subkeys.

Best regards, oleole39

kislyuk commented 1 year ago

Hi - thanks for your suggestion. You have to keep in mind that yq serializes the state of the document to JSON (preserving key order), and uses the toml library to convert it back to TOML. I don't believe the toml library exposes an API for customizing whitespace emitted by the serializer, so I think we have no ability to preserve non-significant whitespace (indentation), and are subject to the toml library's behavior when transforming JSON to TOML.

Even if the API existed for implementing this indentation behavior, I've never seen TOML documents indented the way you indicate. In the case of YAML indentless lists, that behavior is very widespread so implementing it made a lot of sense. To accept a PR to add section indentation behavior in TOML, I'd need to see evidence that this is likewise a widespread practice. If it is indeed widespread and the API exists, then sure, we can add an option.

It is possible that in the future we will switch to the tomllib or tomli-w library for serializing TOML, when those solutions mature a little bit more. This will resolve your other request, to the extent it generalizes (it will follow the section ordering that you prefer in your example).

oleole39 commented 1 year ago

Hello,

Thanks for the reply.

I am new to TOML and don't have many use cases in mind, but mostly two:

Following your answer, I looked for a reference to that TOML library and came across the wiki of the TOML repository, which lists several converters. I tried yj, whose latest release works almost as I would expect (running cat official-example.toml | ./yj-linux-amd64 -tti): it still removes some of the "unnecessary" line breaks that I would rather keep for readability, and it deletes comments, but apart from that it reproduces the expected key order and indentation scheme. The changelog of the latest release states that they introduced a specific TOML library and added a new approach to key sorting. Would this approach be compatible with yq/tomlq?

kislyuk commented 1 year ago

OK, I have released yq v3.2.0 which uses tomlkit instead of toml. This matches your preferred section ordering in your example.
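As an illustration of the section ordering in question, here is a minimal sketch of a depth-first walk over a parsed document. It is not tomlkit's implementation; `section_order` and the sample `doc` are hypothetical names used only for this example. Emitting each table immediately followed by its subtables keeps parents and children together.

```python
def section_order(table: dict, prefix: tuple = ()) -> list[str]:
    """Return table headers in depth-first order, parents before children."""
    headers = []
    for key, value in table.items():
        if isinstance(value, dict):
            path = prefix + (key,)
            headers.append(".".join(path))
            # Visit subtables right away so they stay next to their parent.
            headers.extend(section_order(value, path))
    return headers


doc = {
    "title": "TOML Example",
    "owner": {"name": "Tom Preston-Werner"},
    "servers": {
        "alpha": {"ip": "10.0.0.1", "dc": "eqdc10"},
        "beta": {"ip": "10.0.0.2", "dc": "eqdc10"},
    },
    "clients": {"hosts": ["alpha", "omega"]},
}

print(section_order(doc))
```

With this walk, `servers.alpha` and `servers.beta` are listed directly after `servers` and before `clients`.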

The library you linked is not compatible with yq. It is written in Go, and yq is written in Python.

oleole39 commented 1 year ago

Thank you for addressing this issue so quickly. 3.2.0 does improve the output of cat official-example.toml | tomlq -t, which now becomes:

title = "TOML Example"

[owner]
name = "Tom Preston-Werner"
dob = "1979-05-27T07:32:00-08:00"

[database]
server = "192.168.1.1"
ports = [8000, 8001, 8002]
connection_max = 5000
enabled = true

[servers]
[servers.alpha]
ip = "10.0.0.1"
dc = "eqdc10"

[servers.beta]
ip = "10.0.0.2"
dc = "eqdc10"

[clients]
data = [["gamma", "delta"], [1, 2]]
hosts = ["alpha", "omega"]

So it does not remove headers anymore. Great! However, it still removes indentation without providing any option to add it back, and it keeps removing comments as well.
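For anyone who wants indentation today, one workaround is to post-process the emitted text. The following is a naive sketch, not a tomlq feature: `indent_toml` and its `width` parameter are hypothetical, and it assumes every value sits on a single line (as in the output above) and that table names contain no quoted keys with dots. Nesting depth is taken from the number of dots in the most recent table header.

```python
import re

# Matches [table] and [[array-of-tables]] headers on a stripped line.
HEADER = re.compile(r"^\[{1,2}([^\[\]]+)\]{1,2}\s*(#.*)?$")


def indent_toml(text: str, width: int = 2) -> str:
    """Indent each table and its keys by the nesting depth of its header."""
    depth = 0
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        match = HEADER.match(stripped)
        if match:
            # "servers.alpha" has one dot, so it is nested one level deep.
            depth = match.group(1).count(".")
        out.append(" " * (width * depth) + stripped if stripped else "")
    return "\n".join(out) + "\n"


FLAT = '''\
[servers]
[servers.alpha]
ip = "10.0.0.1"

[clients]
hosts = ["alpha", "omega"]
'''

print(indent_toml(FLAT))
```

Since TOML ignores leading whitespace around headers and keys, the indented text should parse to the same data as the flat version.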

Sorry for not replying earlier, even though I read your reply back in April. I had been hoping to propose a pull request adding the indent option, but have not been able to spend enough time on it so far. In the meantime, the project I was using tomlq for went through a major change that made the script using tomlq unnecessary, so I am now unsure whether I will ever continue working on this pull request. Better to leave this issue closed.

Best regards and thanks again for your work.