toml-lang / toml

Tom's Obvious, Minimal Language
https://toml.io
MIT License

Anonymous hashes #50

Closed (richo closed 10 years ago)

richo commented 11 years ago

Unless I missed something, there's no way to have an arbitrarily sized array of hashes.

mojombo commented 11 years ago

That's true. What's your use case in a configuration file sense?

richo commented 11 years ago

Any time you're matching over something, for example if you chose to store hubot triggers in a hash, the data structure that TOML forces winds up looking like:

{
    nameOfThisThing: // Object name
    {
        name: "nameOfThisThing", // pointlessly repeated to make passing this object around sane
        pattern: "/foo bar/",
    },
    someOtherThing:
    {
        name: "someOtherThing",
        pattern: "/beep boop/"
    }
}

You almost certainly want to put the name of each set of values inside the child hash so you can just pass it around as a well-encapsulated object, but you're forced to give it a name for the upstream structure as well.
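To make the workaround concrete, here is a minimal Python sketch, assuming the name-keyed structure above has already been parsed into a plain dict. The variable names (triggers_by_name, triggers) are illustrative and do not come from the thread. If the name lives only in the parent key, it has to be copied into each child before the children can be passed around on their own.

```python
# Hypothetical parsed result of a name-keyed config (names are illustrative).
triggers_by_name = {
    "nameOfThisThing": {"pattern": "/foo bar/"},
    "someOtherThing": {"pattern": "/beep boop/"},
}

# Each child only knows its name through the parent key, so the name must be
# copied in before the object can stand alone.
triggers = [{"name": name, **fields} for name, fields in triggers_by_name.items()]

for trigger in triggers:
    print(trigger["name"], trigger["pattern"])
```

An array of anonymous hashes would let the file describe the triggers list directly, without the intermediate name-keyed layer.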

DouweM commented 11 years ago

Suggested format:

admins = [
  username = "mojombo"
  password = "githubftw"
  email    = "tom@github.com"
,
  username = "DouweM"
  password = "unguessable"
]

The hashes can be indented however you like, and so can the comma, just as with existing arrays. There has to be a newline before and after each hash, though, to preserve their multiline structure. This also means there are newlines before and after the commas.

In line with arrays being able to contain arrays of different data types, I think the hashes don't need to have the same keys per se; you could imagine a key being optional. It would also make implementation easier because the hash's keys don't have to be checked.

Equivalent JSON:

{
  "admins": [
    {
      "username": "mojombo",
      "password": "githubftw",
      "email":    "tom@github.com"
    },
    {
      "username": "DouweM",
      "password": "unguessable"
    }
  ]
}

DouweM commented 11 years ago

@Hers points out in #82 that my proposal doesn't support nested hashes, hmm. Any other ideas? I really don't like the syntax in his proposal.

HersonHN commented 11 years ago

I sent a pull request for handling arrays with hashes in the definition.

So you can define deeply nested attributes like

[wooga.zooga[0].booga]
mooga = "chooga"

[wooga.zooga[1]]
mooga = "tooga"

[wooga.zooga[1].rooga]
dooga = "fooga"

which will produce this JSON:

{
    "wooga": {
        "zooga": [
            {
                "booga": {
                    "mooga": "chooga"
                } 
            },
            {
                "mooga": "tooga",
                "rooga": {
                    "dooga": "fooga"
                }
            }
        ]
    }
}

DouweM commented 11 years ago

Or maybe my proposal does support nested hashes. If we interpret the array's contents just like we do the whole document, key/value pairs outside of a keygroup as well as keygroups are relative to the array item:

admins = [
  username = "mojombo"
  password = "githubftw"
  email    = "tom@github.com"

  [privileges]
  crud = true
,
  username = "DouweM"
  password = "unguessable"

  [privileges]
  crud = false
]

JSON equivalent:

{
  "admins": [
    {
      "username":   "mojombo",
      "password":   "githubftw",
      "email":      "tom@github.com",

      "privileges": {
        "crud": true
      }
    },
    {
      "username":   "DouweM",
      "password":   "unguessable",

      "privileges": {
        "crud": false
      }
    }
  ]
}

HersonHN commented 11 years ago

Well, all I can say is that we are all familiar with "foo[0].bar" definitions, and in my proposal you don't need indentation to make it readable. But yes, nested brackets in definitions look ugly.

DouweM commented 11 years ago

Yeah, nothing personal at all, I just don't like the nested brackets. I agree that my proposal is only easily readable with indentation, but I don't think that's a major issue. TOML doesn't itself prescribe indentation, but people can and will most likely use it.

HersonHN commented 11 years ago

We can also replace the brackets with parentheses, or remove them:

[wooga.zooga(0).booga]
mooga = "chooga"

[wooga.zooga(1)]
mooga = "tooga"

[wooga.zooga(1).rooga]
dooga = "fooga"

[wooga.zooga.0.booga]
mooga = "chooga"

[wooga.zooga.1]
mooga = "tooga"

[wooga.zooga.1.rooga]
dooga = "fooga"

DouweM commented 11 years ago

That's somewhat better as far as readability goes, but the syntax is less recognizable.

I just noticed something else I don't really like about your proposal: the index is hardcoded within the key, which makes it hard to move, remove or add items later on. There's not really a solution to that with your proposal, as far as I can see. With my proposal that Just Works™, because the array items are treated as little TOML documents in-and-of-themselves.

richo commented 11 years ago

[wooga.zooga]
foo = "bar"

[wooga.zooga.foo]
rawr = "test"

So which is it? Having a declarative data format that lets you specify two values for a key is very bad juju. It's not a tremendous problem till you're emitting it from another process.

ambv commented 11 years ago

For what it's worth I find the pointless repetition pointed out by @richo in https://github.com/mojombo/toml/issues/50#issuecomment-14006701 (link points at the comment) far less problematic than the idea of nested hashes in a minimal language.

-1 on the whole idea.

DouweM commented 11 years ago

The problem with @richo's example at https://github.com/mojombo/toml/issues/50#issuecomment-14006701 isn't just pointless repetition, it also means there's no way to iterate over the pseudo-array items in the order they're specified (which may be important), because hashes don't preserve order.

ambv commented 11 years ago

@DouweM That might bite you anyway on different occasions (like writing the configuration file back to disk). Either the format requires ordered hashes or you have to specify the order as an array in a separate option.

Both syntax proposals for arrays of hashes strike me as inconsistent with the rest of this otherwise simple (or should I say, "obvious") format.

liuggio commented 11 years ago

+1 for removing the parentheses. In a minimal language, "." is enough.

DouweM commented 11 years ago

@ambv I can see where you're coming from, but I don't think "you can nest hashes in arrays by effectively embedding a TOML document inside the array" clashes with the simple/obviousness of the format.

Ghoughpteighbteau commented 11 years ago

I was trying to translate a pom.xml when I hit this section:

<dependency>
  <groupId>com.google.api-client</groupId>
  <artifactId>google-api-client</artifactId>
  <version>1.13.2-beta</version>
</dependency>

<dependency>
  <groupId>com.google.api-client</groupId>
  <artifactId>google-api-client-servlet</artifactId>
  <version>1.13.1-beta</version>
</dependency>   

How would I represent this in TOML?

[dependancy1]
groupId    = "com.google.api-client"
artifactId = "google-api-client"
version    = "1.13.2-beta"

[dependancy2]
groupId    = "com.google.api-client"
artifactId = "google-api-client-servlet"
version    = "1.13.1-beta"

That's not right. It clearly should be an array, but I don't think the standard supports it. At best, I think you'd have to use parallel arrays:

[dependencies]
groupIds    = ["com.google.api-client", "com.google.api-client"]
artifactIds = ["google-api-client"    , "google-api-client-servlet"]
versions    = ["1.13.2-beta"          , "1.13.1-beta"]

and that's just not pretty.

I disagree with

[dependencies(0)]
blah="blah"

[dependencies(1)]
blah="blu-de-blah"

because now you can't insert in the middle without editing everything under it or putting it out of order.

[dependencies(0)]
blah="blah"

[dependencies(2?)]
blah="blah"

[dependencies(1)]
blah="blu-de-blah"

I just can't see a way around this issue without inflating the spec, or turning it into JSON with comments and fewer brackets.
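For reference, here is a minimal Python sketch of what the parallel-arrays workaround above costs the consumer, assuming the [dependencies] table has already been parsed into a plain dict. The names dependencies and records are illustrative. The reader has to stitch the records back together by index and check for itself that the arrays line up.

```python
# Hypothetical parsed form of the parallel-arrays [dependencies] table.
dependencies = {
    "groupIds": ["com.google.api-client", "com.google.api-client"],
    "artifactIds": ["google-api-client", "google-api-client-servlet"],
    "versions": ["1.13.2-beta", "1.13.1-beta"],
}

# Nothing in the format guarantees the arrays have matching lengths,
# so the consumer has to verify that before pairing values up.
lengths = {len(values) for values in dependencies.values()}
if len(lengths) != 1:
    raise ValueError("parallel arrays have different lengths")

# Rebuild one record per dependency by reading the same index from each array.
records = [
    {"groupId": group, "artifactId": artifact, "version": version}
    for group, artifact, version in zip(
        dependencies["groupIds"],
        dependencies["artifactIds"],
        dependencies["versions"],
    )
]

for record in records:
    print(record["artifactId"], record["version"])
```

Inserting a dependency in the middle means editing three separate lines at the same position, which is the same fragility as the hardcoded indices.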

DouweM commented 11 years ago

You're right, the standard doesn't as of yet support this. My proposal:

dependencies = [
  groupId    = "com.google.api-client"
  artifactId = "google-api-client"
  version    = "1.13.2-beta"
,
  groupId    = "com.google.api-client"
  artifactId = "google-api-client-servlet"
  version    = "1.13.1-beta"
]

Ghoughpteighbteau commented 11 years ago

Yeah, your proposal definitely solves the issue, but I kind of agree with Hers. Nested brackets cause problems.

How about this:

[dependency[]]
  groupId    = "com.google.api-client"
  artifactId = "google-api-client"
  version    = "1.13.2-beta"
  package[] = "Array appending behavior for all"
  package[] = "and to all a good night"
  # Equivalent to: package = ["Array appending behavior for all", "and to all a good night"]

[dependency[]]
  groupId    = "com.google.api-client"
  artifactId = "google-api-client-servlet"
  version    = "1.13.1-beta"

Keep in mind, I kind of think this is ugly as well, but without some symbol to explicitly define an array, accidental key collisions would unexpectedly become arrays. Yuck.

DouweM commented 11 years ago

I prefer my proposal since it stays with the array syntax used for other types, but I would be fine with that!

tal commented 11 years ago

I think this is a very important aspect of the language, and I definitely like something along the lines of Ghoughpteighbteau's idea.

An alternate way around the ugliness of [] would be to do [dependency.#]

But the problem with having it in the header at all is that you have an issue of ordering. How would you deal with:

[app]
name = "Foo"

[dependancy.#]
groupId    = "com.google.api-client"
artifactId = "google-api-client"
version    = "1.13.2-beta"

[address]
hostname = "foo.bar.com"

[dependancy.#]
groupId    = "com.google.api-client"
artifactId = "google-api-client-servlet"
version    = "1.13.1-beta"

Perhaps stealing from the dreaded YAML?

[dependencies.#]
  [-]
  groupId    = "com.google.api-client"
  artifactId = "google-api-client"
  version    = "1.13.2-beta"
  [-]
  groupId    = "com.google.api-client"
  artifactId = "google-api-client-servlet"
  version    = "1.13.1-beta"

[something.else]
foo = "Bar"

[dependancy.#] to indicate that dependency is an array of anonymous hashes

Indented [-] for each anonymous hash within the array.

HersonHN commented 11 years ago

I like the [dependancy.#] signal, an easy reference to the next index of "dependancy" :+1:

jnicklas commented 11 years ago

How about taking a slice from Rails' query parsing:

[dependency][]
  groupId    = "com.google.api-client"
  artifactId = "google-api-client"
  version    = "1.13.2-beta"
  package[] = "Array appending behavior for all"
  package[] = "and to all a good night"
  # Equivalent to: package = ["Array appending behavior for all", "and to all a good night"]

[dependency][]
  groupId    = "com.google.api-client"
  artifactId = "google-api-client-servlet"
  version    = "1.13.1-beta"

DouweM commented 11 years ago

@Ghoughpteighbteau, @tal, @jnicklas With your proposals, how would you define nested hashes inside an array? This is something that Rails' query parsing doesn't support either without resorting to hardcoded array indices.

Ghoughpteighbteau commented 11 years ago

Dunno. Even worse, this is a second way to define an array. It violates TOML's second objective.

"There should only be one way to do anything."

Given that criterion, there really is only one answer: @DouweM's

DouweM commented 11 years ago

Well, there are also two ways to define a key-value pair. If the value is a hash: keygroup; otherwise: simply key = value. The same would be the case for arrays. If the values are hashes: keygroup-like; otherwise: simply key = [ value, value, ...]

88Alex commented 11 years ago

[something.#] can lead to confusing circumstances, since # is also used for comments. Perhaps we could use another symbol... I personally like the [-] idea, or something else like it. Also, we need some way to avoid having to repeat all the key names.

To put it all together: Proposal 1.

[array.#] # For now
  [-]
  foo = "bar"
  [-]
  foo = "whatever"
  # ...
...

What I really don't like is that you have to repeat all the key names. A simple change to one of them will require a lot of reworking.

Proposal 2.

[array.#]
  (foo bar)
  [-]
  "something"
  "something else"
  [-]
  "whatever"
  "something else like that"
  # ...
...

This fixes the repeating-all-the-key-names thing. However, with longer arrays, you could forget what the values refer to. Also, this idea would be harder to parse.

mojombo commented 10 years ago

Solved by #153.