INISON -- Simple readable configuration and serialization format.
MIT License

Arrays of table/dictionary/section headers #8

Open ghost opened 8 years ago

ghost commented 8 years ago

How to improve it?

ghost commented 8 years ago

What about [array#], [array#.subarray#], [key1.key2#.key3.key4#]?

However it is still hard to deal with the context. I'm against the overuse of this feature.

vagoff commented 8 years ago

That (together with the semicolons) sounds to me like a creepy yamlification of TOML ;)

ghost commented 8 years ago

Why do you have this feeling? Obviously no one would like to cram multiple YAML key-value pairs into one line. Semicolon is just an alternative to newline.

If you don't want to invent a new syntax for arrays of sections, I have another proposal: Disallow defining any other keys of an object [[array]] through section titles [array.key] or [[array.key]]. i.e. This is not allowed:

[[items]]
name = "banana"
[items.colors]  # add "colors" to the item above
yellow = true
green = true
vagoff commented 8 years ago

Why do you have this feeling?

Imagine a man who met INISON long ago but has had no experience with it since then. Suddenly he gets an accidental urge to edit an existing configuration file (or write one from scratch, which is worse). What conceptual revolution occurs inside his mind?

ini files: "Ah! Just the [section]\nvar=val thing! Piece of cake!"

json files: "Oh! That map+list+scalar mess! The {"a":123,"b":[123]} thing! I can live with that, though..."

inison files: "Ugh, I see the [a]\nb=c thing, and I vaguely remember there was some JSON compatibility in values, some intriguing tags facility lurking somewhere, and those obscure semicolon rules which, frankly, I never understood."

And now you're about to add [a.b#.c] stuff...

vagoff commented 8 years ago

Nowadays I think about this brand new configuration file format:

file = definition*

definition
    = prefix? tag value? block
    | prefix? tag value? list
    | prefix? tag value
block = "{" definition* "}"
list = "[" value* "]"
value = FLOAT | INT | STRING | bool

bool = "true" | "false"
tag = IDENT
prefix = IDENT

(credits to GML)

It is amazingly simple and wonderfully readable. I can't understand why it is so readable, maybe because of tags everywhere.

I've implemented it in half an hour with my home-grown universal parser generator, by the way.
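
For readers without that parser generator, the grammar above can also be read as a small hand-written recursive-descent parser. The following Python sketch is not the author's implementation. The token shapes, the `;;` comment rule, the `Node` tuple, and the error messages are all assumptions made for illustration; only the grammar rules come from the thread.

```python
import re
from typing import NamedTuple, Optional

# Token shapes are assumptions; the thread only names FLOAT, INT, STRING and IDENT.
TOKEN = re.compile(r'''
    \s+ | ;;[^\n]*                          # whitespace and ";;" comments are skipped
  | (?P<STRING>"(?:[^"\\]|\\.)*")
  | (?P<FLOAT>-?\d+\.\d+)
  | (?P<INT>-?\d+)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<PUNCT>[{}\[\]])
''', re.VERBOSE)


class Node(NamedTuple):
    prefix: Optional[str]
    tag: str
    value: object = None
    block: Optional[list] = None   # child Nodes of a "{ ... }" body
    items: Optional[list] = None   # scalars of a "[ ... ]" body


def tokenize(source):
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup:                      # None for whitespace and comments
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse(source):
    tokens, pos = tokenize(source), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def is_value(tok):
        return tok[0] in ("STRING", "FLOAT", "INT") or tok in (("IDENT", "true"), ("IDENT", "false"))

    def is_name(tok):
        return tok[0] == "IDENT" and not is_value(tok)

    def value():
        kind, raw = advance()
        if kind == "STRING":
            return raw[1:-1]                 # escape sequences are left undecoded
        if kind == "FLOAT":
            return float(raw)
        if kind == "INT":
            return int(raw)
        return raw == "true"

    def definition():
        tok = advance()
        if not is_name(tok):
            raise SyntaxError(f"expected a tag, got {tok[1]!r}")
        prefix, tag = None, tok[1]
        if is_name(peek()):                  # two identifiers: prefix, then tag
            prefix, tag = tag, advance()[1]
        val = value() if is_value(peek()) else None
        if peek() == ("PUNCT", "{"):
            advance()
            return Node(prefix, tag, val, block=block())
        if peek() == ("PUNCT", "["):
            advance()
            return Node(prefix, tag, val, items=scalars())
        if val is None:
            raise SyntaxError(f"{tag!r} needs a value, a block or a list")
        return Node(prefix, tag, val)

    def block():
        nodes = []
        while peek() != ("PUNCT", "}"):
            if peek()[0] == "EOF":
                raise SyntaxError("unterminated block")
            nodes.append(definition())
        advance()
        return nodes

    def scalars():
        values = []
        while peek() != ("PUNCT", "]"):
            if not is_value(peek()):
                raise SyntaxError("unterminated list or non-scalar list item")
            values.append(value())
        advance()
        return values

    nodes = []
    while peek()[0] != "EOF":
        nodes.append(definition())
    return nodes
```

Calling `parse('listen { ip4 "192.168.0.100" port 5555 }')` returns a single `Node` whose `block` holds the `ip4` and `port` definitions. The sketch treats `true` and `false` as reserved words so that `debug true` is a tag with a value and not a prefix/tag pair.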

vagoff commented 8 years ago

Full parser source code:

definition =
    IDENT@id1
    (
        IDENT@id2 -> continuation{some(str(id1)),sym(id2)}
        _ -> continuation{none,sym(id1)}
    )

continuation{prefix,tag} =
    '{' -> block@@block [tuple(mktt(prefix,tag,none,some(block)))]
    '[' -> list@@list [tuple(mktt(prefix,tag,none,some(list)))]
    _ -> scalar@val
        (
            '{' -> block@@block [tuple(mktt(prefix,tag,some(val),some(block)))]
            '[' -> list@@list [tuple(mktt(prefix,tag,some(val),some(list)))]
            _ -> [tuple(mktt(prefix,tag,some(val),none))]
        )

block =
    '}' -> []
    EOF -> "error: unterminated block"
    _ -> definition block

list =
    ']' -> []
    EOF -> "error: unterminated list"
    _ -> scalar list

scalar =
    DOUBLE_STRING@s -> [str(s)]
    INT@i -> [int(i)]
    FLOAT@f -> [float(f)]
vagoff commented 8 years ago

It is beautiful, isn't it?

How to bring such overwhelming simplicity, extensibility and readability to INISON?

I have no idea right now.

ghost commented 8 years ago

Examples? What does prefix mean? And no nested arrays?

vagoff commented 8 years ago

Examples?

server "mydomain.com" {
    service "mydaemon1" {
        listen { ip4 "192.168.0.100" port 5555 }
        logfile "/var/log/mydaemon1.log"
    }
}

What does prefix mean?

A value's tag can consist of two parts. This is useful in two cases:

1) classes:

debug server "myserver1" { ... } production server "myserver2" { .... }

2) say, you have

myrecord { field1 "a" field2 100 field3 true }

you want to put a complex value (typed as Box) into field2

myrecord { field1 "a" field2 Box { w 100 h 50 xoff 5 yoff 10 } field3 true }

And no nested arrays?

Yes, lists are just slightly complicated primitive values. All complexity is covered by blocks.

vagoff commented 8 years ago

Apache config file

;; Server configuration
;; Ensure that Apache listens on port 80
;; orig: Listen 80
Listen 80

;; Listen for virtual host requests on all IP addresses
;; orig: NameVirtualHost *:80
NameVirtualHost { addr "*" port 80 }

;; orig: <VirtualHost *:80>
VirtualHost {
    addr "*"
    port 80
    DocumentRoot "/www/example1"
    ServerName "www.example.com"

    ;; Other directives here

} ;; </VirtualHost>

;; <VirtualHost *:80>
VirtualHost {
    addr "*" port 80
    DocumentRoot "/www/example2"
    ServerName "www.example.org"

    ;; Other directives here

} ;; </VirtualHost> 
vagoff commented 8 years ago

or

VirtualHost "*:80" {
    DocumentRoot "/www/example2"
    ServerName "www.example.org"
}
vagoff commented 8 years ago

Samba:

global {
    workgroup "METRAN"
    encrypt_passwords true
    wins_support true
    log_level 1 
    max_log_size 1000
    read_only false
}
homes {
    browsable false
    map_archive true
}
printers {
    path "/var/tmp"
    printable true
    min_print_space 2000
}
share "test" {
    browsable true
    read_only true
    path "/usr/local/samba/tmp"
}
vagoff commented 8 years ago

from Wikipedia INI file:

; last modified 1 April 2001 by John Doe
[owner]
name=John Doe
organization=Acme Widgets Inc.

[database]
; use IP address in case network name resolution is not working
server=192.0.2.62     
port=143
file="payroll.dat"

gives

;; last modified 1 April 2001 by John Doe
owner {
    name "John Doe"
    organization "Acme Widgets Inc."
}
database {
    ;; use IP address in case network name resolution is not working
    server "192.0.2.62"
    port 143
    file "payroll.dat"
}
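
The INI conversion above is mechanical enough to script. Here is a minimal sketch using Python's standard `configparser`. The function names are hypothetical, only integers and strings are distinguished, comments are dropped (the parser discards them), and section or key names that are not valid identifiers are not handled.

```python
import configparser
import re


def format_scalar(raw):
    """Render one INI value as a TreeDef scalar (INT or STRING only)."""
    quoted = len(raw) >= 2 and raw[0] == raw[-1] == '"'
    if quoted:
        raw = raw[1:-1]                       # already quoted in the INI file
    elif re.fullmatch(r"-?\d+", raw):
        return raw                            # bare integers stay numeric
    return '"' + raw.replace("\\", "\\\\").replace('"', '\\"') + '"'


def ini_to_treedef(ini_text):
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str                  # keep key case instead of lower-casing
    parser.read_string(ini_text)
    lines = []
    for section in parser.sections():
        lines.append(f"{section} {{")
        for key, raw in parser.items(section):
            lines.append(f"    {key} {format_scalar(raw)}")
        lines.append("}")
    return "\n".join(lines)
```

Fed the Wikipedia sample, this should reproduce the `owner` and `database` blocks shown above, minus the comments.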
vagoff commented 8 years ago

from http://json.org/example.html

{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}

becomes

glossary "example glossary" {
    GlossDiv "S" {
        GlossList {
            GlossEntry {
                ID "SGML"
                SortAs "SGML"
                GlossTerm "Standard Generalized Markup Language"
                Acronym "SGML"
                Abbrev "ISO 8879:1986"
                GlossDef {
                    para "A meta-markup language, used to create markup languages such as DocBook."
                    GlossSeeAlso ["GML" "XML"]
                }
                GlossSee "markup"
            }
        }
    }
}
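
The hand conversion above hoists `title` into the tag's value, which needs human judgment. A purely mechanical JSON-to-TreeDef emitter would keep `title` as an ordinary child. The sketch below shows that mechanical variant. The function names, the rule that non-identifier characters in keys become `_`, and the rule that arrays of objects turn into repeated tags are all assumptions for illustration.

```python
import json
import re


def scalar(value):
    if isinstance(value, bool):              # check before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return json.dumps(value, ensure_ascii=False)
    raise ValueError(f"no TreeDef scalar for {value!r}")   # e.g. JSON null


def emit(key, value, depth=0):
    pad = "    " * depth
    tag = re.sub(r"\W", "_", key)            # assumption: "dc:author" -> dc_author
    if isinstance(value, dict):
        lines = [f"{pad}{tag} {{"]
        for child_key, child in value.items():
            lines += emit(child_key, child, depth + 1)
        return lines + [pad + "}"]
    if isinstance(value, list):
        if all(not isinstance(item, (dict, list)) for item in value):
            return [f"{pad}{tag} [{' '.join(scalar(item) for item in value)}]"]
        lines = []                           # non-scalar items: repeat the tag
        for item in value:
            lines += emit(key, item, depth)
        return lines
    return [f"{pad}{tag} {scalar(value)}"]


def json_to_treedef(text):
    lines = []
    for key, value in json.loads(text).items():
        lines += emit(key, value)
    return "\n".join(lines)
```

JSON `null` has no counterpart in the grammar, so the sketch rejects it instead of guessing a representation.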
vagoff commented 8 years ago

from http://octodecillion.com/blog/json-data-file-format/

{
    "_HEADER":{
            "modified":"1 April 2001",
            "dc:author": "John Doe"
    },
    "logger_root":{
            "level":"NOTSET",
            "handlers":"hand01"
     },
    "logger_parser":{
            "level":"DEBUG",
            "handlers":"hand01",
            "propagate":"1",
            "qualname":"compiler.parser"
    },
    "owner":{
             "name":"John Doe",
             "organization":"Acme Widgets Inc."
     },
     "database":{
             "server":"192.0.2.62",     
             "_comment_server":"use IP address in case network name resolution is not working",
             "port":"143",
             "file":"payroll.dat"
      }
}

becomes

_HEADER {
    modified Date "1 April 2001"
    dc_author "John Doe"
}
logger_root {
    level "NOTSET"
    handlers "hand01"
}
logger_parser {
    level "DEBUG"
    handlers "hand01"
    propagate "1"
    qualname "compiler.parser"
}
owner {
    name "John Doe"
    organization "Acme Widgets Inc."
}
database {
    server "192.0.2.62"
    _comment_server "use IP address in case network name resolution is not working"
    port "143"
    file "payroll.dat"
}
vagoff commented 8 years ago

cookies:

cookie {
    domain "mydomain.com"
    httpOnly false
    name "ECS_ID"
    path "/"
    secure false
    value "999b1a001827d68a113d12b103df8409ec6379b0"
}
cookie {
    domain ".mydomain.com"
    expiry 1447604032
    httpOnly false
    name "_ym_visorc_23974621"
    path "/"
    secure false
    value "w"
}
cookie {
    domain ".mydomain.com"
    expiry 1463154137
    httpOnly false
    name "_ym_uid"
    path "/"
    secure false
    value "14476021371059691474"
}
vagoff commented 8 years ago

some from shootout benchmarks:

platform "arch" {
    language "cpython2" { install "sudo pacman -S python2" }
    language "cpython3" { install "sudo pacman -S python3" }
    language "pypy2" { install "sudo pacman -S pypy" }
    language "pypy3" { install "sudo pacman -S pypy3" }
    language "ghc" { install "sudo pacman -S ghc" }
    language "nim" { install "sudo pacman -S nim" }
    language "julia" { install "sudo pacman -S julia" }
    language "sbcl" { install "sudo pacman -S sbcl" }
    language "go" { install "sudo pacman -S go" } ;; [!] how to install gcc-go simultaneously? they are in conflict!
}

platform "ubuntu" {}

platform "debian" {}

platform "freebsd" {}

platform "win32" {}

and

measurement {
    language "python"
    implementation "cpython2"
    function "fac"
    algorithm "fac-loop"
    run "python2 fac-loop.py2 50999"
    active false
}
measurement {
    language "python"
    implementation "cpython3"
    function "fac"
    algorithm "fac-loop"
    run "python3 fac-loop.py3 50999"
    active false
}
measurement {
    language "haskell"
    implementation "ghc"
    function "fac"
    algorithm "fac-rec"
    compile "ghc -O3 fac-rec.hs"
    run "./fac-rec 50999"
    clean "rm fac-rec.hi fac-rec.o"
}
measurement {
    language "haskell"
    implementation "ghc"
    function "fac"
    algorithm "fac-foldr"
    compile "ghc -O3 fac-foldr.hs"
    run "./fac-foldr 50999"
    clean "rm fac-foldr.hi fac-foldr.o"
}

and

report {
    measurement { language "py" build "" command "pypy fac-loop.py2 50999" time 1.83 memory 76128 }
    measurement { language "py" build "" command "pypy3 fac-loop.py3 50999" time 2.01 memory 50468 }
    measurement { language "hs" build "" command "./fac-rec 50999" time 0.92 memory 6232 }
    measurement { language "hs" build "" command "./fac-rec 50999" time 0.92 memory 6256 }
    measurement { language "hs" build "" command "./fac-foldr 50999" time 0.99 memory 6072 }
    measurement { language "hs" build "" command "./fac-product 50999" time 2.67 memory 13272 }
    measurement { language "jl" build "" command "julia fac-rec.jl" time 14.57 memory 88320 }
    measurement { language "py" build "" command "pypy fac-rec.py 50999" time 17.59 memory 105448 }
    measurement { language "py" build "" command "pypy3 fac-rec.py 50999" time 18.69 memory 82804 }
    measurement { language "jl" build "" command "julia loop3.jl" time 315.06 memory 111740 }
    measurement { language "jl" build "" command "julia loop3-i64.jl" time 8.96 memory 78560 }
    measurement { language "py" build "" command "pypy loop3.py2 50999" time 3.72 memory 51576 }
    measurement { language "py" build "" command "pypy3 loop3.py3 50999" time 5.76 memory 29716 }
    measurement { language "nim" build "" command "./fac_rec_i64 20" time 0.00 memory 1416 }
    measurement { language "nim" build "" command "./loop3_i64 100" time 7.89 memory 1436 }
    measurement { language "nim" build "" command "./fac_rec_i64 7" time 0.00 memory 1376 }
    measurement { language "nim" build "" command "./loop3_i64 100" time 0.20 memory 1360 }
}
vagoff commented 8 years ago

Are these enough examples? I have a lot more...

vagoff commented 8 years ago

Parser master build file:

options {}

pipeline "parse_def_file" {
    stage "det/examples/def/lexer.stage" {
        filter "common/line_comments_token.det"
        filter "common/numeric_token.det"
        import "common/double_string.det"
        filter "common/any_double_string_token.det"
        import "common/single_string.det"
        filter "common/any_single_string_token.det"
        import "common/triple_double.det"
        filter "common/triple_double_token.det"
        import "common/triple_single.det"
        filter "common/triple_single_token.det"
        filter "examples/def/identifier_token.det"
        filter "common/whitespace_token.det"
        filter "det/common/copy_token.det"
    }
    stage "det/examples/def/keywords.stage" {
        filter "examples/def/keywords_token.det"
    }
    stage "det/examples/def/syntax.stage" {
        parser "examples/def/def.det"
    }
}
vagoff commented 8 years ago

XinX build file example

package "converter" {

    refnames ["system20.converter"]
    depends ["xinx"]
    targets ["compiler/converter/entrypoint.x:main:Procedure"]
    entrypoint "compiler/converter/entrypoint.x:main:Procedure"

    supply { prefix "compiler/" namespace "system20.converter.internals" }

    namespace "system20.converter.internals" {
        load "compiler/dsl/dsl.ns"
        load "compiler/examples/maryjane/maryjane.ns"
        load "compiler/ir/ir.ns"
        load "compiler/ir/typespec.ns"
        load "compiler/passes/passes.ns"
        load "compiler/converter/converter.ns"
        include "lib.compilers.unique_names"
        import "xinx"
    }

}
ghost commented 8 years ago

Indeed, a good format. I wonder if there is already a standalone parser for this format waiting somewhere, maybe on GitHub. ;-)

So in this new format, every tag corresponds to an array of block/list/scalar? e.g. Duplicates are allowed:

# many servers
server { ... }
server { ... }
...
# nested arrays
row [ ... ]
row [ ... ]
...
# accumulate names
contributor "..."
contributor "..."
contributor "..."
...
vagoff commented 8 years ago

Duplicates are allowed:

Yes! That's half of the whole point ;)

But it's not all rainbows and unicorns... JSON gives us a standard nested map/list structure, while this format gives us an AST. So the new format needs some standard postprocessing after parsing, which requires some kind of schema to guide it.

vagoff commented 8 years ago

So in this new format, every tag corresponds to an array of block/list/scalar?

In this format, every prefix/tag corresponds to nothing. And this is a problem to solve.

ghost commented 8 years ago

In this format, every prefix/tag corresponds to nothing. And this is a problem to solve.

prefix = hint, tag = class, the top-level structure is an array of objects? This is what I can imagine.

ghost commented 8 years ago

Or just use a walker to traverse the tree and gather what you want?
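
A walker of this kind is only a few lines. The sketch below assumes the AST is a list of `(prefix, tag, value, children)` tuples, which mirrors the shape of the parser's output but is not taken from the thread's code. List bodies are left out for brevity, and the sample tree is hypothetical.

```python
def walk(nodes, path=()):
    """Yield (path, node) for every definition, depth-first."""
    for node in nodes:
        prefix, tag, value, children = node
        here = path + (tag,)
        yield here, node
        if children:
            yield from walk(children, here)


# Hypothetical tree, shaped like the Cargo-style example in this thread.
ast = [
    (None, "package", "hello_world", [
        (None, "depends", "hammer", [(None, "version", "0.5.0", None)]),
        (None, "target", "i686-unknown-linux-gnu", [
            (None, "depends", "openssl", [(None, "version", "1.0.1", None)]),
        ]),
    ]),
]

# Gather every dependency together with the path of tags leading to it.
depends = [(path, node[2]) for path, node in walk(ast) if node[1] == "depends"]
# depends == [(("package", "depends"), "hammer"),
#             (("package", "target", "depends"), "openssl")]
```

The application still has to decide what each path means, which is the interpretation problem discussed above.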

ghost commented 8 years ago

The definition:

definition
    = prefix? tag value? block
    | prefix? tag value? list
    | prefix? tag value

is eye candy, and can actually be converted internally to this uniform shape:

definition = Tag1@Tag2@Tag3 (block | list | value)  ; Tag1 and Tag3 may be empty

to toml:

Key1@Key2@Key3 = [ values ]  # This array is implicitly constructed for duplicates.
ghost commented 8 years ago

I don't recommend complex naming for keys here. 8-| I don't want to see this toml variant:

key = 1
key = 2
# gives
# { key: [ 1, 2 ] }
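
To make the implicit-array rule concrete, here is a minimal sketch of what a loader would do if duplicates were folded this way. The function name and the pair-list input are assumptions for illustration.

```python
def fold_duplicates(pairs):
    """Collect repeated keys into lists, in order of appearance."""
    folded = {}
    for key, value in pairs:
        folded.setdefault(key, []).append(value)
    return folded


print(fold_duplicates([("key", 1), ("key", 2)]))   # {'key': [1, 2]}
print(fold_duplicates([("port", 80)]))             # {'port': [80]}
```

The second call shows one cost of the rule: a key written once also comes back as a one-element list, unless the loader has extra knowledge about which keys are meant to be scalars.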
vagoff commented 8 years ago

prefix = hint, tag = class, the top-level structure is an array of objects? This is what I can imagine.

no!

Look at this case:

map_from_strings_to_objects {
    Class1 "Key1" { prop1 "val1" prop2 "val2"  }
    Class2 "Key2" { prop1 "val11" prop2 "val22"  }
}

tags are classes in one location and keys in another.

vagoff commented 8 years ago

Any encoding you propose must, in one way or another, be transformed by the application into the form it actually wants to use.

Some values are used right after parsing, some are shoved into various initialization places, and some are saved in their complex form (as a dict of lists, for example) for later use.

A schema-driven postprocessor (the stage right after the parser) may ease this work for the application writer. He only writes a schema, and all the code is generated by a standard schema processor. This is how I see things working in an ideal world.
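
As a rough illustration of the idea, the sketch below interprets a schema at run time instead of generating code from it. The schema vocabulary (`one`, `many`, `map`), the tuple-based AST, and the sample data are all invented for this example and do not come from an existing tool.

```python
# Each schema entry is (kind, nested_schema):
#   "one"  -> the tag may appear once; its value is stored directly
#   "many" -> the tag may repeat; values are collected into a list
#   "map"  -> the tag may repeat; its value is the key, its block is the entry
SCHEMA = {
    "package": ("map", {
        "version": ("one", None),
        "keywords": ("many", None),
        "depends": ("map", {"version": ("one", None)}),
    }),
}


def apply_schema(nodes, schema):
    """Turn a list of (prefix, tag, value, children) tuples into nested dicts."""
    result = {}
    for prefix, tag, value, children in nodes:   # prefix is ignored in this sketch
        if tag not in schema:
            raise KeyError(f"tag {tag!r} is not allowed here")
        kind, fields = schema[tag]
        if kind == "one":
            if tag in result:
                raise ValueError(f"duplicate {tag!r}")
            result[tag] = value
        elif kind == "many":
            result.setdefault(tag, []).append(value)
        elif kind == "map":
            result.setdefault(tag, {})[value] = apply_schema(children or [], fields)
        else:
            raise ValueError(f"unknown schema kind {kind!r}")
    return result


# Hypothetical AST for a small package definition.
ast = [
    (None, "package", "hello_world", [
        (None, "version", "0.1.0", None),
        (None, "keywords", "cli", None),
        (None, "keywords", "demo", None),
        (None, "depends", "hammer", [(None, "version", "0.5.0", None)]),
        (None, "depends", "color", [(None, "version", "0.7.0", None)]),
    ]),
]
config = apply_schema(ast, SCHEMA)
```

Here the schema settles the question raised earlier: in this context `depends` is a keyed map, while `keywords` is a repeated scalar.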

ghost commented 8 years ago

It seems more complicated now. :-| Anyway, what should inison look like? I'm confused.

vagoff commented 8 years ago

This new format (let it be called TreeDef) is just a side note. Material for thinking. No direct implications for INISON.

ghost commented 8 years ago

The syntax of TreeDef is too distinct. Its tagging method only shines in nested objects. Currently INISON is almost line-based, apart from multi-line strings and the value types "array" and "object". Because its syntax is simple, I tend to use it for relatively simple tasks. Let YAML/JSON/GML/... do the hard work (mostly deeply nested objects). How about further simplifying array/object values?

Then back to the issue about [[a.b.c.d]] from TOML. One big problem: Cannot tell whether a, b or c is also an array easily. It will be easy if only the last key "d" can be an array.

vagoff commented 8 years ago

How about further simplifying array/object values?

Indeed. Got same thought recently.

One big problem: Cannot tell whether a, b or c is also an array easily. It will be easy if only the last key "d" can be an array.

I thought it was obvious that [[a.b.c]];x=1 is equivalent to a.b.c.push({x:1})

Of course [[a.b]]x=1[[a.b.c]]y=2 must be an error. Does TOML behave differently here?
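
The semantics proposed here can be sketched in a few lines of Python: a `[[a.b.c]]` header pushes a fresh object onto the array at `a.b.c`, and descending through a key that already holds an array is an error. The function name is hypothetical, and this is not how any existing INISON or TOML parser is known to behave.

```python
def open_array_section(root, dotted):
    """Handle a [[a.b.c]] header: append a new object to root.a.b.c and return it."""
    *parents, last = dotted.split(".")
    node = root
    for key in parents:
        node = node.setdefault(key, {})
        if not isinstance(node, dict):       # e.g. "b" was already opened as [[a.b]]
            raise ValueError(f"{key!r} is not an object, cannot descend into it")
    items = node.setdefault(last, [])
    if not isinstance(items, list):
        raise ValueError(f"{last!r} is already defined as a non-array")
    entry = {}
    items.append(entry)
    return entry


root = {}
open_array_section(root, "a.b.c")["x"] = 1   # [[a.b.c]];x=1
open_array_section(root, "a.b.c")["x"] = 2   # [[a.b.c]];x=2
# root == {"a": {"b": {"c": [{"x": 1}, {"x": 2}]}}}
```

With this rule, `[[a.b]]x=1` followed by `[[a.b.c]]y=2` raises `ValueError`, because `b` is an array by the time the second header tries to descend into it.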

vagoff commented 8 years ago

Are those [[...]] really needed? Have you seen a good use of them somewhere?

vagoff commented 8 years ago

So we have just [a]x=1 from INI + {a:123,b:[5,6]} from JSON + tags T{x:"y"} and DATE"2015-01-01".

vagoff commented 8 years ago

And ignore every whitespace whatsoever ;)

ghost commented 8 years ago

Just see the examples at the top of this page; TOML does tricky things for [[...]].

[[...]] is used (at the top-level) by the Rust package manager, Cargo: http://doc.crates.io/manifest.html#configuring-a-target

vagoff commented 8 years ago

// Cargo treats INI keys as keywords in one place and as package names in another. Very inelegant.

vagoff commented 8 years ago

Well, let's allow [[x.y.z]] for lists only at the end, then.

vagoff commented 8 years ago

Just for comparison:

package "hello_world" {

    version "0.1.0"    # the current version, obeying semver
    authors ["you@example.com"]
    exclude ["build/**/*.o" "doc/**/*.html"]
    include ["src/**/*" "Cargo.toml"]

    # A short blurb about the package. This is not rendered in any format when
    # uploaded to crates.io (aka this is not markdown)
    description "..."

    # These URLs point to more information about the repository
    documentation "..."
    homepage "..."
    repository "..."

    # This points to a file in the repository (relative to this Cargo.toml). The
    # contents of this file are stored and indexed in the registry.
    readme "..."

    # This is a small list of keywords used to categorize and search for this
    # package.
    keywords ["..." "..."]

    # This is a string description of the license for this package. Currently
    # crates.io will validate the license provided against a whitelist of known
    # license identifiers from http://spdx.org/licenses/. Multiple licenses can
    # be separated with a `/`
    license "..."

    # If a project is using a nonstandard license, then this key may be specified in
    # lieu of the above key and must point to a file relative to this manifest
    # (similar to the readme key)
    license_file "..."

    depends "hammer" { version "0.5.0" git "https://github.com/wycats/hammer.rs" }
    depends "color" { git "https://github.com/bjz/color-rs" }
    depends "geometry" { path "crates/geometry" }

    depends "hammer" { version "0.5.0" }
    depends "color" { version "> 0.6.0, < 0.8.0" }

    target "x86_64-pc-windows-gnu" {
        depends "winhttp" { version "0.4.0" }
    }
    target "i686-unknown-linux-gnu" {
        depends "openssl" { version "1.0.1" }
        native { path "native/i686" }
    }
    target "x86_64-unknown-linux-gnu" {
        depends "openssl" { version "1.0.1" }
        native { path "native/x86_64" }
    }

    target "x86_64/windows.json" {
        depends "winhttp" { version "0.4.0" }
    }

    target "i686/linux.json" {
        depends "openssl" { version "1.0.1" }
        native { path "native/i686" }
    }

    target "x86_64/linux.json" {
        depends "openssl" { version "1.0.1" }
        native { path "native/x86_64" }
    }

    # The development profile, used for `cargo build`
    profile "dev" {
        opt_level 0  # Controls the --opt-level the compiler builds with
        debug true   # Controls whether the compiler passes `-g`
        rpath false  # Controls whether the compiler passes `-C rpath`
        lto false    # Controls `-C lto` for binaries and staticlibs
        debug_assertions true  # Controls whether debug assertions are enabled
        codegen_units 1 # Controls whether the compiler passes `-C codegen-units`
                          # `codegen-units` is ignored when `lto = true`
    }
    # The release profile, used for `cargo build --release`
    profile "release" {
        opt_level 3
        debug false
        rpath false
        lto false
        debug_assertions false
        codegen_units 1
    }

    # The testing profile, used for `cargo test`
    profile "test" {
        opt_level 0
        debug true
        rpath false
        lto false
        debug_assertions true
        codegen_units 1
    }

    # The benchmarking profile, used for `cargo bench`
    profile "bench" {
        opt_level 3
        debug false
        rpath false
        lto false
        debug_assertions false
        codegen_units 1
    }

    # The documentation profile, used for `cargo doc`
    profile "doc" {
        opt_level 0
        debug true
        rpath false
        lto false
        debug_assertions true
        codegen_units 1
    }
}

package "awesome" {

    # The “default” set of optional packages. Most people will
    # want to use these packages, but they are strictly optional.
    # Note that `session` is not a package but rather another
    # feature listed in this manifest.
    feature "default" { packages ["jquery" "uglifier" "session"] }

    # A feature with no dependencies is used mainly for conditional
    # compilation, like `#[cfg(feature = "go-faster")]`.
    feature "go_faster" { packages [] }

    # The “secure-password” feature depends on the bcrypt package.
    # This aliasing will allow people to talk about the feature in
    # a higher-level way and allow this package to add more
    # requirements to the feature in the future.
    feature "secure_password" { packages ["bcrypt"] }

    # Features can be used to reexport features of other packages.
    # The `session` feature of package `awesome` will ensure that the
    # `session` feature of the package `cookie` is also enabled.
    feature "session" { packages ["cookie/session"] }

    # These packages are mandatory and form the core of this
    # package’s distribution
    depends "cookie" { version "1.2.0" }
    depends "oauth" { version "1.1.0" }
    depends "route_recognizer" { version "=2.1.0" }

    # A list of all of the optional dependencies, some of which
    # are included in the above “features”. They can be opted
    # into by apps.
    depends "jquery" { version "1.0.2" optional true }
    depends "uglifier" { version "1.5.3" optional true }
    depends "bcrypt" { version "*" optional true }
    depends "civet" { version "*" optional true }

    depends "awesome" {
        version "1.3.5"
        default_features false # do not include the default features, and optionally
                                 # cherry-pick individual features
        features ["secure-password" "civet"]
    }

    lib "foo" {

        # This field points at where the crate is located, relative to the Cargo.toml.
        path "src/lib.rs"

        # A flag for enabling unit tests for this target. This is used by `cargo test`.
        test true

        # A flag for enabling documentation tests for this target. This is only
        # relevant for libraries, it has no effect on other sections. This is used by
        # `cargo test`.
        doctest true

        # A flag for enabling benchmarks for this target. This is used by `cargo bench`.
        bench true

        # A flag for enabling documentation of this target. This is used by `cargo doc`.
        doc true

        # If the target is meant to be a compiler plugin, this field must be set to true
        # for Cargo to correctly compile it and make it available for all dependencies.
        plugin false

        # If set to false, `cargo test` will omit the --test flag to rustc, which stops
        # it from generating a test harness. This is useful when the binary being built
        # manages the test runner itself.
        harness true

    }

    lib "bar" {
        path "src/lib2.rs"
        plugin true
    }

}