INISON / inison

INISON -- Simple readable configuration and serialization format.
MIT License

Other interesting designs #16

Open ghost opened 8 years ago

ghost commented 8 years ago

https://github.com/go-ini/ini

though I'm not dealing with config files these days… :stuck_out_tongue_closed_eyes:

ghost commented 8 years ago

ArchieML catches my heart: http://archieml.org/

vagoff commented 8 years ago

Much better than YAML, but still too complex: there are too many rules to keep in mind while using it, which raises the cognitive load to an unacceptable level. Perhaps that is okay for users who write everything in this format. But programmers use plenty of languages simultaneously, so a format for their config/settings/db files must hit a very low bound of syntactic complexity and take near-zero time to jump into from unrelated languages.

As for me, I'm pretty happy with TreeDef (renamed to ObjDef for now) at the moment. I devised a bidirectional (not isomorphic) mapping between ObjDef and JSON to make tooling easier, and wrote the corresponding parsers/converters. I will release them after a period of heavy production use.

vagoff commented 8 years ago

ObjDef BNF grammar:

entry
    = type value
    | type contents
    | type id contents

contents = "{" entry* "}"

type = adjective+
id = string
value = string | float | integer | bool

bool = "true" | "false"
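
To make the grammar above concrete, here is a minimal recursive-descent sketch in Python. It is not the real ObjDef implementation: it assumes adjectives are bare identifiers and strings are double-quoted without escapes (the BNF does not say), and the names `tokenize`, `parse` and the `(adjectives, value, contents)` tuple layout are made up for illustration. Like the det parser in this thread, it accepts any scalar before `{`, not only a string id.

```python
import re

# Token kinds; the lexical details are assumptions, not part of the BNF.
TOKEN = re.compile(r'''
    \s+
  | (?P<str>"[^"]*")
  | (?P<float>-?\d+\.\d+)
  | (?P<int>-?\d+)
  | (?P<brace>[{}])
  | (?P<bool>(?:true|false)\b)
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)
''', re.VERBOSE)

SCALARS = {
    "str": lambda s: s[1:-1],          # drop the surrounding quotes
    "float": float,
    "int": int,
    "bool": lambda s: s == "true",
}

def tokenize(text):
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.lastgroup:                # None means whitespace: skip it
            tokens.append((m.lastgroup, m.group(m.lastgroup)))
    return tokens

def parse_entries(tokens, i):
    """entry* : parse entries until '}' or end of input."""
    entries = []
    while i < len(tokens) and tokens[i] != ("brace", "}"):
        entry, i = parse_entry(tokens, i)
        entries.append(entry)
    return entries, i

def parse_entry(tokens, i):
    """entry = type value | type contents | type id contents"""
    adjectives = []
    while i < len(tokens) and tokens[i][0] == "ident":   # type = adjective+
        adjectives.append(tokens[i][1])
        i += 1
    if not adjectives:
        raise SyntaxError(f"expected an adjective at token {i}")
    value = contents = None
    if i < len(tokens) and tokens[i][0] in SCALARS:      # value or id
        kind, text = tokens[i]
        value = SCALARS[kind](text)
        i += 1
    if i < len(tokens) and tokens[i] == ("brace", "{"):  # contents
        contents, i = parse_entries(tokens, i + 1)
        if i >= len(tokens):
            raise SyntaxError("missing '}'")
        i += 1                                           # consume '}'
    if value is None and contents is None:
        raise SyntaxError("entry needs a value or contents")
    return (tuple(adjectives), value, contents), i

def parse(text):
    tokens = tokenize(text)
    entries, i = parse_entries(tokens, 0)
    if i != len(tokens):
        raise SyntaxError("unbalanced '}'")
    return entries
```

For instance, `parse('rule "entry" { ref "type" }')` yields one entry whose adjectives are `("rule",)`, whose value is `"entry"`, and whose contents hold a single `ref` entry.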

ObjDef parser (full source code):

entry =
    IDENT@t ->
        det_parser_bug_workaround{t}@tag
        adjectives@adjectives
        (
            '{' -> contents@@contents [quadruple(tag,adjectives,none,some(contents))]
            _ -> value@value
                (
                    '{' -> contents@@contents [quadruple(tag,adjectives,some(value),some(contents))]
                    _ -> [quadruple(tag,adjectives,some(value),none)]
                )
        )

det_parser_bug_workaround{t} = [sym(t)]

adjectives = adjectives_loop@@aa [mktuple(aa)]

adjectives_loop =
    IDENT@t -> [sym(t)] adjectives_loop
    _ -> []

contents =
    '}' -> []
    _ -> entry contents

value =
    FLOAT@v -> [float(v)]
    INT@v -> [int(v)]
    (DOUBLE_STRING | TRIPLE_STRING)@v -> [str(v)]
    TRUE -> [true]
    FALSE -> [false]

parser "objdef" {
    stage "objdef/parser/staged/lexer.stage" {
        import "det/common/skip_until_newline.det"
        filter "det/common/line_comments_token.det"
        import "det/common/string_char.det"
        import "det/common/double_string.det"
        filter "det/common/any_double_string_token.det"
        import "det/common/single_string.det"
        filter "det/common/any_single_string_token.det"
        ;;import "det/common/triple_double.det"
        ;;filter "det/common/triple_double_token.det"
        ;;import "det/common/triple_single.det"
        ;;filter "det/common/triple_single_token.det"
        filter "det/common/bool_token.det"
        filter "det/common/digit_token.det"
        parser "det/common/copy_token.det"
    }
    stage "objdef/parser/staged/hacky.stage" {
        filter "det/common/numeric_token.det"
        filter "det/common/whitespace_token.det"
        parser "det/common/copy_token.det"
    }
    stage "objdef/parser/staged/syntax.stage" {
        parser "objdef/parser/syntax.det"
    }
}

vagoff commented 8 years ago

(Interestingly enough, it still lacks standard ordinary lists and maps; perhaps I'll add them later, but I'm not sure how.)

vagoff commented 8 years ago

The mapping: https://gist.github.com/vagoff/46619337c18a742f37f672d345fdf104

vagoff commented 8 years ago

That's how the ObjDef grammar looks if you dump it in ObjDef format:

rule "entry" {
    alt {
        seq {
            ref "type"
            ref "value"
        }
        seq {
            ref "type"
            ref "contents"
        }
        seq {
            ref "type"
            ref "id"
            ref "contents"
        }
    }
}

rule "contents" {
    seq {
        lit "{"
        repeat {
            ref "entry"
        }
        lit "}"
    }
}

rule "type" {
    repeat {
        ref "adjective"
        min 1
    }
}

rule "id" {
    ref "string"
}

rule "value" {
    alt {
        ref "string"
        ref "float"
        ref "integer"
        ref "bool"
    }
}

rule "bool" {
    alt {
        lit "true"
        lit "false"
    }
}

vagoff commented 8 years ago

Nevertheless, thanks for posting the link!

I have learnt a lot from your links (thanks especially for the \u{deadbeef} thing; I somehow managed to completely miss the trend).

ghost commented 8 years ago

Good point about the transition among different formats. I guess ArchieML is the best combination of YAML and INI/TOML so far.

Indeed, ObjDef lacks a standard syntax for arrays and maps. But since it works best with a schema, you can just let the transformer decide the type name, e.g.

array {
  push "a"
  push "b"
}

map {
  set "a" 1
  set "b" 2
}

however it is not bad to have a concise syntax (["a" "b"], {a: 1, b: 2}). You just need to provide custom containers for them.
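
A rough Python sketch of the "let the transformer decide" idea, assuming entries have already been parsed into `(name, args, children)` tuples. That tuple shape and the `transform` function are hypothetical and not part of ObjDef.

```python
def transform(entry):
    """Turn a parsed (name, args, children) entry into a native Python value.

    The container type is chosen from the entry name, as a schema would do.
    """
    name, args, children = entry
    if name == "array":
        result = []
        for child, child_args, _ in children:
            if child != "push" or len(child_args) != 1:
                raise ValueError(f"array expects 'push <value>', got {child!r}")
            result.append(child_args[0])
        return result
    if name == "map":
        result = {}
        for child, child_args, _ in children:
            if child != "set" or len(child_args) != 2:
                raise ValueError(f"map expects 'set <key> <value>', got {child!r}")
            key, value = child_args
            result[key] = value
        return result
    raise ValueError(f"no transformer registered for {name!r}")


# The two examples above, written out as already-parsed entries.
array_entry = ("array", [], [("push", ["a"], []), ("push", ["b"], [])])
map_entry = ("map", [], [("set", ["a", 1], []), ("set", ["b", 2], [])])
```

With these inputs, `transform(array_entry)` gives a list and `transform(map_entry)` gives a dict, so the core syntax needs no dedicated list or map literals.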

vagoff commented 8 years ago

I think exactly the same. There's just no urge to add them right now. When I meet circumstances in which avoiding lists/maps can no longer be postponed, I'll make the final decision.

ghost commented 8 years ago

| I learnt a lot from your links (thanks especially for the \u{deadbeef} thing; I somehow managed to completely miss the trend)

:-) Now I think \xXX is still the best for bytes, and \u{XXX,XXXXX,XXXX,...} for unicode. \uXXXX and \uXXXXXXXX are just bad.
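
A small Python sketch of how such escapes could be decoded. It assumes the result is a byte string in which `\xXX` contributes one raw byte and each code point in `\u{...}` contributes its UTF-8 encoding; that choice, and the name `decode_escapes`, are assumptions made for illustration only.

```python
import re

# \xXX (exactly two hex digits) or \u{H,H,...} (comma-separated code points).
ESCAPE = re.compile(
    r'\\x([0-9A-Fa-f]{2})'
    r'|\\u\{([0-9A-Fa-f]{1,6}(?:,[0-9A-Fa-f]{1,6})*)\}'
)

def decode_escapes(source: str) -> bytes:
    out = bytearray()
    pos = 0
    for m in ESCAPE.finditer(source):
        # Copy the literal text between escapes as UTF-8.
        out += source[pos:m.start()].encode("utf-8")
        if m.group(1) is not None:
            out.append(int(m.group(1), 16))           # one raw byte
        else:
            for code_point in m.group(2).split(","):
                # chr() rejects values above 0x10FFFF with ValueError.
                out += chr(int(code_point, 16)).encode("utf-8")
        pos = m.end()
    out += source[pos:].encode("utf-8")
    return bytes(out)
```

Because the braces delimit the code point, there is no ambiguity about how many hex digits belong to the escape, which is the main weakness of fixed-width `\uXXXX` and `\uXXXXXXXX`.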

vagoff commented 8 years ago

Agreed! =)

ghost commented 7 years ago

A good use of semicolons in https://github.com/CESNET/libyang may interest you, and here is my example:

array { 1; 2; 3; 4 }
map { a 1; b 2; c 3; d 4 }
book {
  title "The Poor Jack";
  description "Long long ago..."
              "There was a poor actor called Jack."
              "Guess what happened on him?";
  price 9.99
}

it is just LISt Processing :-P

jakwings commented 7 years ago

Another one, SimpleDeclarativeLanguage: https://sdlang.org/

Similar to XML but can associate multiple values to a tag. E.g.

// [tagname] [values] [attributes] { [children] }

tag val1 val2 val3 attr1=sth1 attr2=sth2 {
  child { ... }
  child { ... }
}

and anonymous tag (default name: content)

matrix {
  0 0 1  //= content 0 0 1 {}
  0 1 0
  1 0 0
}

vagoff commented 7 years ago

Thanks! Interesting...

vagoff commented 7 years ago

SimpleDeclarativeLanguage:

The arbitrary semicolon is an interesting design decision, though.

vagoff commented 7 years ago

What do you think about lambdas and include directives? (In context of configuration files) They may help to cope with patterns and bring in the DRY.

I picked up that idea from the design of Nix expressions: https://medium.com/@MrJamesFisher/nix-by-example-a0063a1a4c55

It seems like overkill the first moment you think about it, but hey: configuration files are formulas written in formal languages, just as ordinary source code is, only with "declarative" semantics. (The semantics of a declarative language defines a function from the syntax domain to the value domain, while the semantics of an ordinary programming language defines a function from the syntax domain to functions from an input value domain to an output value domain. That is not as distant as it may seem.)

Also, configuration files bring in all the same mess as ordinary source code does: dependency management, deployment, versioning, use from external projects, and so on. So they may benefit from type systems, module systems and generic dependency handling too. While type systems for configuration files are common, include directives and module systems are not (yet?).

vagoff commented 7 years ago

Better link http://nixos.org/nix/manual/#ch-expression-language

vagoff commented 7 years ago

So every configuration expression is just a program accepting one argument of singleton type "Unit".

Interestingly enough, if you take the "12 factor app" considerations into account, you end up with the statement "a configuration expression is a program with argument env :: String->String".
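
A hedged Python sketch of that statement: the configuration is an ordinary function whose only input is an environment lookup. The variable names `PORT` and `DEBUG` and the helper `from_mapping` are invented for the example.

```python
from typing import Callable, Mapping

Env = Callable[[str], str]          # env :: String -> String

def config(env: Env) -> Mapping[str, object]:
    """The whole configuration, expressed as a pure function of env."""
    return {
        "port": int(env("PORT")),
        "debug": env("DEBUG") == "1",
    }

def from_mapping(values: Mapping[str, str], default: str = "") -> Env:
    """Adapt any string mapping (e.g. os.environ) to the Env signature."""
    return lambda name: values.get(name, default)

# In a real process this would be: config(from_mapping(os.environ))
settings = config(from_mapping({"PORT": "8080", "DEBUG": "1"}))
```

Since `config` never touches global state itself, the same expression can be evaluated against a test environment and a production environment without modification.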

vagoff commented 7 years ago

Moreover, if you scrutinize the difference between "configuration" and "settings", you notice that a configuration file is just the (highest) top-level source code composing all the (more or less) generic (library) parts together. It is edited at customization/maintenance time (not at runtime), so it is also a static resource.

vagoff commented 7 years ago

An interesting consequence of that point of view is that one may employ not only typechecking to verify that a new configuration is correct, but also a (possibly time-consuming) abstract-interpretation-like post-compilation step to optimize the program for this very specific configured use case.

vagoff commented 7 years ago

Also note that since a Nix-expr-like configuration language is not Turing-complete, it may be verified 100% correct even without type signatures, with all safety theorems autoproved (because Rice's theorem and the halting problem do not apply to such a weak language).

vagoff commented 7 years ago

Style 1:

pc1(i) = ... # Program Component #1
pc2(i) = ... # Program Component #2
parse_cf(fp) = readfile(fp) == "variant1"
main(i) = if parse_cf("filepath.cfg") then pc1(i) else pc2(i)

Style 2:

pc1(i) = ... # Program Component #1
pc2(i) = ... # Program Component #2
parse_cf(fp) = if readfile(fp) == "variant1" then pc1 else pc2
main(i) = parse_cf("filepath.cfg")(i)
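
The same two styles as a runnable Python sketch. To stay self-contained it passes the configuration text directly instead of calling `readfile`, and the component bodies are placeholders; both are assumptions, not part of the original pseudocode.

```python
def pc1(i):
    return f"pc1 handled {i}"       # Program Component #1 (placeholder body)

def pc2(i):
    return f"pc2 handled {i}"       # Program Component #2 (placeholder body)

# Style 1: the config parser returns a flag and main does the dispatch.
def parse_cf_flag(text):
    return text == "variant1"

def main_style1(text, i):
    return pc1(i) if parse_cf_flag(text) else pc2(i)

# Style 2: the config parser returns the selected component itself.
def parse_cf_component(text):
    return pc1 if text == "variant1" else pc2

def main_style2(text, i):
    return parse_cf_component(text)(i)
```

Both styles compute the same result; the difference is that in Style 2 the configuration yields a function rather than data, which is what turns the config file into the top-level program.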

jakwings commented 7 years ago

Only the application can tell which methods of including files/texts are appropriate. We don't even need a standard syntax for it. E.g. (YAML)

# include/inherit/…whatever
template: [user encrypted]

# patch/override
user:
  name: Bob
encryption:
  key: bob.key
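
One way an application might interpret the document above, sketched in Python with plain dicts standing in for the parsed YAML. The `template` key semantics, the contents of `TEMPLATES`, and the names `merge` and `load` are all hypothetical; the point is only that inclusion can be application-defined.

```python
def merge(base, patch):
    """Overlay patch on base: patch values win, nested dicts merge recursively."""
    result = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result

def load(document, templates):
    """Apply the listed templates in order, then the document's own keys."""
    config = {}
    for name in document.get("template", []):
        config = merge(config, templates[name])
    patch = {k: v for k, v in document.items() if k != "template"}
    return merge(config, patch)

# Hypothetical template contents, invented for this example.
TEMPLATES = {
    "user": {"user": {"name": "anonymous", "shell": "/bin/sh"}},
    "encrypted": {"encryption": {"cipher": "aes-256", "key": None}},
}

document = {
    "template": ["user", "encrypted"],
    "user": {"name": "Bob"},
    "encryption": {"key": "bob.key"},
}

resolved = load(document, TEMPLATES)
```

Here `resolved` keeps the template defaults (`shell`, `cipher`) and takes `name` and `key` from the patch.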

Lambdas or functions are convenient too. But they mark the watershed between "data" and "script/program".

To make the parser simpler and easier to implement, without creating another Turing-complete language, I would prefer simple transformers (implementation-defined):

duration1 = 1000  # milliseconds
duration2 = (10).seconds  # not directly parsing string "10 seconds"

to make it declarative like the "recursive sets" in nix:

# key = [variable | value] [.[transformer] | [binary-op] [variable | value]]...
duration2 = (1).seconds + (duration1 * 2)  # lazy substitution, instead of yaml-like reference

and without user-defined lambdas like func1 = .seconds + (10).minutes.
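
A possible reading of this in Python, assuming durations are stored in milliseconds and that the host application supplies a fixed table of transformers. `TRANSFORMERS`, `apply_transformer` and `resolve` are illustrative names, and the lazy lookup is modelled with small closures rather than a real parser.

```python
# Implementation-defined transformers: the config author cannot add new ones.
TRANSFORMERS = {
    "milliseconds": lambda n: n,
    "seconds": lambda n: n * 1000,
    "minutes": lambda n: n * 60_000,
}

def apply_transformer(value, name):
    """Evaluate (value).name using the fixed transformer table."""
    if name not in TRANSFORMERS:
        raise ValueError(f"unknown transformer {name!r}")
    return TRANSFORMERS[name](value)

def resolve(definitions, name, _active=()):
    """Lazily evaluate a key; other keys are looked up only when needed."""
    if name in _active:
        raise ValueError(f"cyclic reference through {name!r}")
    lookup = lambda ref: resolve(definitions, ref, _active + (name,))
    return definitions[name](lookup)

# duration1 = 1000
# duration2 = (1).seconds + (duration1 * 2)
definitions = {
    "duration1": lambda get: 1000,
    "duration2": lambda get: apply_transformer(1, "seconds") + get("duration1") * 2,
}
```

`resolve(definitions, "duration2")` evaluates to 3000 here, and a key that refers back to itself is reported as an error instead of looping.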


| Also note that since Nix-expr-like configuration language is not Turing-complete it may be verified 100% correct even without type signatures with all safety theorems autoproved (because Rice Theorem and Halting Problem do not hold for such a weak language)

Really not Turing-complete? Nix looks like a weaker Tcl language. It is overkill for data-centric configuration.

jakwings commented 7 years ago

Style 1:

pc1(i) = ... # Program Component #1
pc2(i) = ... # Program Component #2
parse_cf(fp) = readfile(fp) == "variant1"
main(i) = if parse_cf("filepath.cfg") then pc1(i) else pc2(i)

Style 2:

pc1(i) = ... # Program Component #1
pc2(i) = ... # Program Component #2
parse_cf(fp) = if readfile(fp) == "variant1" then pc1 else pc2
main(i) = parse_cf("filepath.cfg")(i)

Seems that we are talking about the same thing. But user-defined functions are still concerning me. I need to think more about it.

jakwings commented 7 years ago

Basically, I doubt whether it is worth the effort to create another scripting language for advanced configuration. Everything like that may become just another very specific DSL.

vagoff commented 7 years ago

| Really not Turing-complete? Nix looks like a weaker Tcl language.

I plan to ban recursion and recursive let definitions; that results in a language that is surely not Turing-complete.

| It is overkill for data-centric configuration.

Yes! Plenty of apps need no more than plain old ini files. But we have a continuum of configuration file use cases, and at the opposite end we see quite convoluted configurations; see build files, for example. So, as far as I can guess, we would all profit from choosing a language weak enough to cover all the cases while staying simple for the simplest uses.

vagoff commented 7 years ago

| Everything like that may become just another very specific DSL.

It is a language for record manipulation for generic configuration purposes, so in one sense it is very narrow, but in another sense, as a configuration language, it is very broadly applicable. So it is not as straightforwardly "very specific" as it seems.

vagoff commented 7 years ago

| Only the application can tell what is the appropriate methods to include files/texts

We can lift file inclusion to the generic, configuration-language level, and track dependencies/loading automatically in the library.

vagoff commented 7 years ago

Look. You've written programs A and B. Each needs its own configuration file. I'm building an enormous software complex that includes one instance of A and two instances of B. Inclusion/lambdas would be overkill for you, but they are a must-have for me in this case.

vagoff commented 7 years ago

| Basically, I doubt whether it is worth the effort to create another scripting language for advanced configuration. Everything like that may become just another very specific DSL.

I have a bunch of tools that make creating a simple DSL a breeze. No effort at all.

All effort is in decent design.

vagoff commented 7 years ago

| But user-defined functions are still concerning me

After banning all recursion and while loops, user-defined functions become safe.

(Foreaches are ok)
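
Enforcing such a ban could look roughly like the Python sketch below, which searches a call graph for cycles. It assumes functions call each other only by name, so the graph can be read off the definitions; `find_recursion` and the graph representation are hypothetical. Note that in an untyped language with first-class lambdas, a purely syntactic check like this is not enough on its own, since self-application can still loop.

```python
def find_recursion(calls):
    """Return a list of names forming a call cycle, or None if there is none.

    `calls` maps each user-defined function name to the names it calls.
    Names missing from `calls` are treated as built-ins that call nothing.
    """
    WHITE, GREY, BLACK = 0, 1, 2        # unvisited, on the stack, finished
    color = {name: WHITE for name in calls}
    stack = []

    def visit(name):
        color[name] = GREY
        stack.append(name)
        for callee in calls.get(name, ()):
            state = color.get(callee, BLACK)
            if state == GREY:           # back edge: callee is on the stack
                return stack[stack.index(callee):] + [callee]
            if state == WHITE:
                cycle = visit(callee)
                if cycle:
                    return cycle
        stack.pop()
        color[name] = BLACK
        return None

    for name in calls:
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

A configuration loader would reject the file whenever the result is not `None`, reporting the offending chain of names to the user.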

vagoff commented 7 years ago

| I would prefer simple transformers (implementation-defined)

The thing that bothers me is that in SOME cases you end up with a convoluted, intangible mess of transformers. Why not cover these cases too, if we can do it without making the language clumsier?

vagoff commented 7 years ago

My first shot on grammar:

expr1 = lambda | expr2
lambda = pat ":" stmt+

stmt = if_stmt | let_stmt | expr1
if_stmt = "if" expr1 "then" stmt+ "else" stmt+
let_stmt = pat "=" expr1

expr2 = infix | let_expr | expr3
infix = expr2 infix_name expr2
let_expr = "let" stmt+ "in" expr2

expr3 = app | expr4
app = expr3 expr4

expr4 = getfield_expr | expr5
getfield_expr = expr4 "." name

expr5 = scalar | name | "(" expr1 ")" | list_expr | record_expr
record_expr = "{" (setfield_expr ";")* setfield_expr? "}"
setfield_expr = name ("=" expr1)?
list_expr = "[" (expr1 ";")* expr1? "]"

pat = name | at_pat | record_pat | list_pat
at_pat = name "@" pat
record_pat = "{" (field_pat ",")* field_pat? "}"
field_pat = name ("=" pat)? | name "?" expr1
list_pat = "[" (pat ",")* pat? "]"

(I suppose working codename for the lang is "Top", as for "most toplevel layer of app")

vagoff commented 7 years ago

(There is a problem with lambda priority; I'm thinking about it.)

vagoff commented 7 years ago

For example, a configuration language like this is powerful enough to describe a feed-forward convolutional neural network architecture. In more ordinary configuration languages (like ObjDef/TreeDef) that looks like a total mess. Utterly unreadable.

vagoff commented 7 years ago

(Side note: I'm not in any way advocating for anything, I just want to follow the Truth; so if, while doing research, I see the trail of Truth make a tight turn, so do I.)

jakwings commented 7 years ago

Never mind. And I want to know the balance point, even if it leads to one minimal and one more advanced minimal designs.

Your use case is broader than mine, I just need some time to digest the information. :-P

vagoff commented 7 years ago

I just want to get a "Final Solution of the Configuration Files Question" =)

I hate it when problems keep coming back after being solved once. We will never be able to take a step forward if we spend all our time on recurring problems.

jakwings commented 7 years ago

Your grammar above is full of recursive definitions, I don't quite get it. Is it mimicking Nix?

vagoff commented 7 years ago

This is a subset of the Nix expression language except:

Grammar without stmt stuff:

expr1 = lambda | expr2
lambda = pat ":" expr1

expr2 = let_expr | expr3
let_expr = "let" (pat "=" expr1 ";")* (pat "=" expr1)? "in" expr2

expr3 = infix | expr4
infix = expr3 infix_name expr3

expr4 = app | expr5
app = expr4 expr5

expr5 = getfield_expr | expr6
getfield_expr = expr5 "." name

expr6 = scalar | name | "(" expr1 ")" | list_expr | record_expr
record_expr = "{" (setfield_expr ";")* setfield_expr? "}"
setfield_expr = name ("=" expr1)?
list_expr = "[" (expr1 ";")* expr1? "]"

pat = name | at_pat | record_pat | list_pat
at_pat = name "@" pat
record_pat = "{" (field_pat ",")* field_pat? "}"
field_pat = name ("=" pat)? | name "?" expr1
list_pat = "[" (pat ",")* pat? "]"

vagoff commented 7 years ago

Stmt thing was for writing

if f1 then
    a = b

instead of

let a = if f1 then b else a in

I finally think this feature must be dropped.

vagoff commented 7 years ago

and perhaps change

setfield_expr = name ("=" expr1)?

to

setfield_expr = (name ".")* name ("=" expr1)?
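
What the dotted form would mean, sketched in Python under the assumption that it follows Nix attribute paths, where `a.b.c = 1` is shorthand for nested records. `set_path` is a made-up helper and dicts stand in for records.

```python
def set_path(record, path, value):
    """Set record[path[0]][path[1]]... = value, creating nested records as needed."""
    node = record
    for name in path[:-1]:
        node = node.setdefault(name, {})
        if not isinstance(node, dict):
            raise TypeError(f"{name!r} is already set to a non-record value")
    node[path[-1]] = value
    return record


# { server.host = "localhost"; server.port = 8080; }
record = {}
set_path(record, ["server", "host"], "localhost")
set_path(record, ["server", "port"], 8080)
```

Both assignments end up in the same nested `server` record, so the dotted form saves a level of braces without adding a new concept.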

vagoff commented 7 years ago

Useful link https://github.com/jwiegley/hnix.git

vagoff commented 7 years ago

Update: import expr, with expr and index expr added.

expr1 = lambda | let_expr | expr2
lambda = pat ":" expr1
let_expr = "let" (pat "=" expr1 ";")* (pat "=" expr1)? "in" expr2

expr2 = with_expr | expr3
with_expr = "with" expr3 ";" expr2

expr3 = infix | expr4
infix = expr3 infix_name expr3

expr4 = app | expr5
app = expr4 expr5

expr5 = getfield_expr | index_expr | expr6
getfield_expr = expr5 "." name
index_expr = expr5 "." "[" expr1 "]"

expr6 = scalar | name | "(" expr1 ")" | list_expr | record_expr | import_expr
record_expr = "{" (setfield_expr ";")* setfield_expr? "}"
setfield_expr = (name ".")* name ("=" expr1)?
list_expr = "[" (expr1 ";")* expr1? "]"
import_expr = "<" path ">"

pat = name | at_pat | record_pat | list_pat
at_pat = name "@" pat
record_pat = "{" (field_pat ",")* field_pat? "}"
field_pat = name ("=" pat)? | name "?" expr1
list_pat = "[" (pat ",")* pat? "]"

vagoff commented 7 years ago

I translated some ObjDef/TreeDef configurations into Top form to get a closer look and feel: https://gist.github.com/vagoff/958c54567b9b150059ca2ed185848ea5

jakwings commented 7 years ago

{stage,import,filter,parser}: # import entry constructors

That's indeed a good improvement. The interface is clear and in the forefront now.

jakwings commented 7 years ago

I must admit I don't like the syntax of Nix-lang unless I must deal with it every day.

Being featureful is nice, but everything put together just looks like JavaScript spaghetti. (Why not JS?)

vagoff commented 7 years ago

| I don't like the syntax of Nix-lang

Some particular points/features, or is it an inexplicable overall feeling? Is there a possibility to make it more beautiful?

| spaghetti

Yes, this is a serious drawback :/

| Why not JS?

Because JS is Turing-complete, so we have no guarantees with it. Nix (Top) is very weak, so it is possible not only to guarantee 100% safety but also to infer upper and lower bounds on the number of execution "steps".

vagoff commented 7 years ago

| just looks like JavaScript spaghetti.

Something to add:

Amount of code and cognitive burden/ease are vastly different notions. I learnt that the hard way, through Haskell experience and reading the source code of the Ur/Web compiler. Very large code may sometimes be MUCH more accessible than some convoluted tiny masterpiece built on top of high theoretical notions. It is also easier to debug, maintain and extend.

jakwings commented 7 years ago

Especially the pattern matching part. The syntax being too compact makes abuses of advanced features easier, and reduces readability. Regarding "pure", some imperative instructions will become horrible cascading of let ... in ..., even if there are lambdas (but there are no pre-declarations of functions).

So people should know what should not be done solely with those features. Like the Nix project, the actual build jobs (mostly?) are delegated to the posix shell and makefiles. I don't think this language can take the place of make/cmake/premake/...

A good example for styling is MacPorts: using tcl (everything is a string ;-) and imperative style (and impure), but not much fancy stuff. With good styling + auxiliary functions (with an imperative style) it is possible to make Nix/Top look nice.