orcmid / miser

The Miser Project is practical demonstration of computation-theoretical aspects of software
https://orcmid.github.io/miser/
Apache License 2.0
3 stars 1 forks source link

oFrugal grammar? #8

Closed rnd0101 closed 3 years ago

rnd0101 commented 6 years ago

There are several examples of what is called frugalese, but is there any grammar or formal description for that?

orcmid commented 6 years ago

Resolution

After experimental confirmations carried out with Roman Susi (@rnd0101), the expression, ob-exp, of obs in a reference grammar employed for oFrugal has stabilized at ob-exp.txt.


There is no published grammar or formal description, although I have notes in support of producing those.

There are essentially two levels.

  1. The applicative-expression sugaring of the notation for construction of obs will be formalized for the oFrugal REPL implementation. Aspects of this have already been illustrated in ob.txt and comments under Question #2. This will be worked on first, undoubtedly in stages, along with mockups that build toward the generic oFrugal REPL. An important achievement, here, is the ability to present results with appropriate sugarings. This will be a challenge in the developments to be described under Question #7, for example. This should probably be termed the oFrugalese level, since its utility is in assembly of obs interpreted as applicative expressions.

  2. The generic Frugalese is intended to be higher-level and permit familiar expressions in which obs arise only as a default/simple-datatype in the presence of type inference. The notation is closer to that of ISWIM as presented in Landin's "The Next 700 Programming Languages." This is at about the same level as SML (and equivalent OCAML and F#) definitions.

There is an important difference in the level (1-2) notations. Whereas functional programming systems tend to treat f g h x as ((f g) h) x, the association of functions and their arguments in Miser has the interpretation be that of f(g(h x)) in this case. Of course, arentheses can always be used for explicit grouping.

For example, the combinator S is described in Frugalese as being such that (S x) y) z = x(z) y z, using only essential parentheses.

rnd0101 commented 6 years ago

Thanks! I was mainly targeting notation 1 for now. Here my take at frugal's grammar: https://github.com/rnd0101/miser/tree/master/oMiser/mockups/python but I need to add "translator" to Python yet, so that my miser can digest frugal expressions directly. (I do not have lot of time to work on this, so it gets so fragmented and surely looks from your point of view as jumping from topic to topic)

rnd0101 commented 6 years ago

I also think, that probably these two notations need to be more explicit in the texts. If it's not something stable, there is probably no need for explicit ENBF or grammar, examples are enough, but I must confess it confused me on first reading. (back then I assumed it's described somewhere, so I just went on).

rnd0101 commented 6 years ago

I've made a simple frugal shell with my best understanding of the grammar. (it's not ideal, does not handle corner cases well, but allows to experiment):

$ python shell.py 
oMiser/Frugal syntax interpreter
Press Ctrl-D to leave.

oMiser> .C ‵.C (.C (.E .C (.E .ARG) ‵.ARG) ‵(.C (.E .ARG) ‵.ARG) ) 
INPUT: (.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG))))))))

OUTPUT: (.C :: (`((`(.ARG) :: .ARG)) :: (.C :: ((.E :: .ARG) :: `(.ARG)))))

oMiser> 
oMiser> .C ‵.C  (.C  (.E  .C (.E .ARG) ‵.ARG) ‵(.C (.E .ARG) ‵.ARG) )
INPUT: (.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG))))))))
ev (.SELF) (.ARG) (`(.C)) -> .C
ev (.SELF) (.ARG) (.E) -> .E
ev (.SELF) (.ARG) (.E) -> .E
ev (.SELF) (.ARG) (.ARG) -> .ARG
ap (.E) (.ARG) -> `(.ARG)
ev (.SELF) (.ARG) ((.E :: .ARG)) -> `(.ARG)
ev (.SELF) (.ARG) (`(.ARG)) -> .ARG
ev (.SELF) (.ARG) ((.C :: ((.E :: .ARG) :: `(.ARG)))) -> (`(.ARG) :: .ARG)
ap (.E) ((`(.ARG) :: .ARG)) -> `((`(.ARG) :: .ARG))
ev (.SELF) (.ARG) ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG))))) -> `((`(.ARG) :: .ARG))
ev (.SELF) (.ARG) (`((.C :: ((.E :: .ARG) :: `(.ARG))))) -> (.C :: ((.E :: .ARG) :: `(.ARG)))
ev (.SELF) (.ARG) ((.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG))))))) -> (`((`(.ARG) :: .ARG)) :: (.C :: ((.E :: .ARG) :: `(.ARG))))
ev (.SELF) (.ARG) ((.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG))))))))) -> (.C :: (`((`(.ARG) :: .ARG)) :: (.C :: ((.E :: .ARG) :: `(.ARG)))))

OUTPUT: (.C :: (`((`(.ARG) :: .ARG)) :: (.C :: ((.E :: .ARG) :: `(.ARG)))))

oMiser> 

Update: Now I grasped it:

oMiser>((`(.C ‵.C  (.C  (.E  .C (.E .ARG) ‵.ARG) ‵(.C (.E .ARG) ‵.ARG) )) "x") "y") "z"
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

In a way, it's somewhat backwards to write (but makes perfect sense!):

((`cS "x") "y") "z"

But then the result is equivalent to:

oMiser> ("x" "z") "y" "z"
OUTPUT: (("x" :: "z") :: ("y" :: "z"))
orcmid commented 6 years ago
oMiser>((`(.C ‵.C  (.C  (.E  .C (.E .ARG) ‵.ARG) ‵(.C (.E .ARG) ‵.ARG) )) "x") "y") "z"
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

and

oMiser> ("x" "z") "y" "z"
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

Is exactly it (with quoted strings as lindies).

rnd0101 commented 6 years ago

Ok! So I've implemented grammar for assignment and many other things (see here) (comments and multi-line input not yet possible)

In addition, I implemented frugal-level only statement sequence:

"x"; ob ^z = "z"

(hopefully, this will not make you feel bad about)

However, application operations are now faked as before (::), and are waiting proper encoding decision (see TODOs in the frugal module).

One idea is that (a,b,c) can be backwards-consified (compared to a list). This will allow to use either standalone or when used with application - something will be inserted into innermost left pair (PEG parser I use automatically disambiguates the grammar, so I am not quite sure whether there are any "dangling else" cases hidden.)

That is, what could be good use for (x,y,z) as here:

cS(x,y,z) = ap(ap(ap(cS,x),y),z) = ap(cS, ?(x, y, z))

in a standalone fashion: (x,y,z) so that ap also worked naturally?

I believe, it is better to have separate "reading" and "evaluation" phases, just as in LISP, so some encoding is needed (for "apply", "arg list", and for "assignment"). Great bonus will be of course if that encoding will be an ob as well.

Thank you for challenges! Hopefully, this helps rather than distracts you.

orcmid commented 6 years ago

@rnd0101: I believe, it is better to have separate "reading" and "evaluation" phases, just as in LISP, so some encoding is needed (for "apply", "arg list", and for "assignment"). Great bonus will be of course if that encoding will be an ob as well.

Forms of this view have been expressed in Question #10 and now Question #11. Unless there is a serious difficulty, there will be no adjustment in the following respect.

  1. It is absolutely the case that obs are immutable and oMiser will have no assignment operation. There are serious theoretical issues in doing otherwise. The goal of having the oMiser level is to be more mathematically tractable and also clearly deterministic, as a springboard for determining what is accomplishable before introducing side-effect-related characteristics.

  2. There is application already in the evaluation of obs as scripts, with both obap.eval(exp) or obap.ap(p,x) where exp, p, and x are obs. The (dynamic) distinction between operators (p) and operands (x) interpretations is meant to be clear.

orcmid commented 6 years ago

@rnd0101: That is, what could be good use for (x,y,z) as here: cS(x,y,z) = ap(ap(ap(cS,x),y),z) = ap(cS, ?(x, y, z)) in a standalone fashion: (x,y,z) so that ap also worked naturally?

The idea that f(x,y,z) simply be sugar for (((f x) y) z) goes back to Church and has been used by others. For oMiser there are no "tuple" operands signified by (x,y,z), although one can have the different pattern g[x, y, z] where the [x, y, z] is a single ob operand (and the ob.NIL ending the pairing construction is implicit).

In oFrugal notation, the ( ... ) serve entirely to provide for syntactical grouping, and the f (x, y, .., z) case is sugar for simplifying the grouping of operands. This is particularly valuable because the plain association of applications is to the right as with f g h x being read as f(g(h x))

rnd0101 commented 6 years ago

Probably there is some misunderstanding here (or maybe I still missed it clearly written somewhere). I do not suggest to add anything to oMiser.

I am trying to figure out how to encode parsed structure in oFrugal, so that it will be correctly evaluated. By that I mean :: and juxtaposition are now grammatically separated, but they are not separated in the structure parser produces.

There are several options, for example, do obap.ap already in parser or make special oFrugal classes, that record where application is to happen, so later oFrugal's super-eval will be able to do obap.ap where due and then apply obap.eval to the rest. That is, super-eval (or Frugal-eval) will basically work plumbing obs between evaluations.

I do not like doing it in the parser, because I think expression should be validated before starting any computations.

orcmid commented 6 years ago

@rnd0101: In addition, I implemented frugal-level only statement sequence: "x"; ob ^z = "z"

Nicely done. I introduced the ";" in my examples to disambiguate multi-line input. But having multiple statements in a line is fine too. Nice catch. There will need to be some thought on how output works for a multi-statement line.

rnd0101 commented 6 years ago

f(x,y,z) simply be sugar for (((f x) y) z) Yes, this is understood. I do not suggest anything to add to oMiser (like tuples). I want to keep syntax simple, so if (x, y, z) can have any compatible meaning when not used next to another term, then there will be any syntax error less.

as for

oMiser> "f" "g" "h" "x"
INPUT: ("f" :: ("g" :: ("h" :: "x")))
INPUT: c(L('f'), c(L('g'), c(L('h'), L('x'))))
OUTPUT: c(L('f'), c(L('g'), c(L('h'), L('x'))))
OUTPUT: ("f" :: ("g" :: ("h" :: "x")))
OUTPUT STATE: ('f', ('g', ('h', 'x')))

So, for frugal, the plumbing will be something like: hypothetical ("f" ** ("g" ** ("h" ** "x"))) with ** for ap operations will cause the following work with obs:

_hx = ap("h", "x")
_ghx = ap("g", _hx)
_fghx = ap("f", _ghx)
orcmid commented 6 years ago

@rnd0101: I am trying to figure out how to encode parsed structure in oFrugal, so that it will be correctly evaluated. By that I mean :: and juxtaposition are now grammatically separated, but they are not separated in the structure parser produces. ... I do not like doing it in the parser, because I think expression should be validated before starting any computations.

It is indeed possible to translate every oFrugal ob-exp form expression to an ob, exp, that is then evaluated via eval(exp). It would be an interesting exercise to construct that. However, it would be far more work, in the construction, and the eval(exp) would be pretty heavy-duty. Also, all of the bindings would be substituted as part of the expansion to exp.

One of the powers of oMiser is the ability to have scripts that build scripts and that do what is termed "partial evaluation" in functional-programming.

  1. It would be a shame to deny oFrugal of the opportunity to employ oMiser on behalf of the construction of an ob -- that is, having computation of ob (parts) be available.

  2. There is another potential advantage. Only oFrugal can provide syntax error and partial-evaluation error reporting. And if an application requested by oFrugal fails to terminate, so will any post-translation eval alternative.

  3. It is much easier to specify the semantics of oFrugal ob-exp expressions if we can use obap.ap there.

I apologize that my slowness is impeding resolution of some of these matters in your eagerness to advance this. Your efforts have been very helpful but I may have to leave some of the confusing matters to the orderly treatment (such as a precise grammar) where these can be made clear.

orcmid commented 6 years ago

@rnd0101: I do not like doing it in the parser, because I think expression should be validated before starting any computations.

You make an important point. The oFrugal REPL should probably be designed to ensure that.

My thinking is that the specification of the semantics of an ob-exp would not change, and we need to see how the parser might accomplish that. There are classic ways to handle that, as in the separation of parser and code generator (in this case, ob generator). That can still be efficient.

In mockups, we might not do this so well. The purpose of mockups is proof-of-concept and experimental confirmation of ideas, not production-quality software releases.

rnd0101 commented 6 years ago

Thank you for reply! True, in proof-of-concept it's not needed. As we are dealing with immutable obs, time of execution probably does not matter. Well, the user experience may differ due to more cryptic error messages, but at least experiments can be carried out.

rnd0101 commented 6 years ago

So I just put obap.ap into parser to quickly test and voi la - the difference is clear:


oMiser> ((^cS "x") "y") "z"
(.C :: (`((`("x") :: .ARG)) :: (.C :: ((.E :: .ARG) :: `(.ARG)))))
((`("x") :: .ARG) :: (`("y") :: .ARG))
(("x" :: "z") :: ("y" :: "z"))
INPUT: (("x" :: "z") :: ("y" :: "z"))
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

oMiser> ((^cS :: "x") :: "y") :: "z"
((.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG)))))))) :: "x")
(((.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG)))))))) :: "x") :: "y")
((((.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG)))))))) :: "x") :: "y") :: "z")
INPUT: ((((.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG)))))))) :: "x") :: "y") :: "z")
OUTPUT: (`(.ARG) :: `("y"))

(those prints before INPUT are results of all in-parser evaluations)

rnd0101 commented 6 years ago

Now added also that f(x, y, z) syntax, quite hacky, but works:

oMiser> ^cS ("x", "y", "z")
INPUT: (("x" :: "z") :: ("y" :: "z"))
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

oMiser> ((^cS :: "x") :: "y") :: "z"
INPUT: ((((.C :: (`(.C) :: (.C :: ((.E :: (.C :: ((.E :: .ARG) :: `(.ARG)))) :: `((.C :: ((.E :: .ARG) :: `(.ARG)))))))) :: "x") :: "y") :: "z")
OUTPUT: (`(.ARG) :: `("y"))

oMiser> ((^cS "x") "y") "z"
INPUT: (("x" :: "z") :: ("y" :: "z"))
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

oMiser> ("x", "y", "z")
ERROR: Ob expected, found: [<frugal.ArgumentList object at 0x7ff6ddc1d050>]
orcmid commented 6 years ago

@rnd0101: So I just put obap.ap into parser to quickly test and voi la - the difference is clear.

Although the reason the input and output are the same in the first example is because of the pure-lindy-trace rule, so the eval you do makes no difference.

The eval done in the second case tears things up. I think, with your REPL, you need

oMiser> (( ‵^cS :: "x") :: "y") :: "z"

but there really shouldn't be an eval.

I realize that I have broken the whole eval separation in the REPL idea. The only way to force eval is with an application 😉 \:

oFrugal> ob ^SXYZ = (( ‵^cS :: "x") :: "y") :: "z";
oFrugal> ^SXYZ .NIL;

should do it.

It is time to step back and work through the grammar and its semantics from the top down now. I will get to that over the next couple of days, after addressing household weekend chores 😄

rnd0101 commented 6 years ago

Ok. Now it works as written in Wikipedia:

oMiser> (^SKK (^cS)) ("x", "y", "z")
INPUT: (("x" :: "z") :: ("y" :: "z"))
OUTPUT: (("x" :: "z") :: ("y" :: "z"))

oMiser> ob ^cI = (^cS ^cK) ^cK
ASSIGN INPUT: cI = ((`((.E :: .ARG)) :: .ARG) :: (`((.E :: .ARG)) :: .ARG))

oMiser> ^cI "x"
INPUT: "x"
OUTPUT: "x"

oMiser> ob ^SII = (^cS ^cI) ^cI
ASSIGN INPUT: SII = ((`(((`((.E :: .ARG)) :: .ARG) :: (`((.E :: .ARG)) :: .ARG))) :: .ARG) :: (`(((`((.E :: .ARG)) :: .ARG) :: (`((.E :: .ARG)) :: .ARG))) :: .ARG))

oMiser> ^SII "x"
INPUT: ("x" :: "x")
OUTPUT: ("x" :: "x")

Plus this is perhaps as expected:

oMiser> ^SII ^SII
Parsing error: RuntimeError: maximum recursion depth exceeded while calling a Python object
rnd0101 commented 6 years ago

I realize that I have broken the whole eval separation in the REPL idea. The only way to force eval is with an application

Ah, ok. I seem not to keep up with recent developments. I will remove eval over the result.

rnd0101 commented 6 years ago

I'd liked to share one meta-level comment: I believe in design of programming languages, and each design has some main principles it is guided by.

For example, I like Python, because it has syntax, which is designed to be readable. Is based on least-surprise principle. It's possible to write with speed of thought.

So I think design decisions need to be based on some set of principles, be them implicit or explicit. Even "look like <insert your favorite PL here>", "should feel good personally to <someone>" will do. As in every design, the audience should be taken into account. Who they are? Mathematicians, theoretical CS scholars, programmers?

I can only speak for myself: I have (distant) Eternal goal in mind, and it's foundational language (or technology). I am not concerned with whatever programmers are currently in favor of, be it paradigm or syntax preferences. Personally I feel oMiser is more or less ok for Eternal, but certain power-ups are needed on both expressivity ("speed to higher levels" - ability of the language to climb syntactically and in abstraction levels Forth and LISP excel at) and speed of computations (or "optimizability"). And so far it was interesting and fascinating how oFrugal with it quite simple grammar makes use of oMiser machinery. After that pair vs application mishap resolved there were many wows! along the way.

rnd0101 commented 6 years ago

Small practical issue @orcmid : syntax for commands.

Eg, include "something" conflicts with quoteless lindies and if there will be strings. (are commands even part of frugal syntax? maybe good idea to be)

orcmid commented 6 years ago

After considerable discussion, including on Question #14, here is a straightforward syntax, in original BNF, for ob-exp, the oFrugal expressions that evaluate to yield single obs. I'm using a screen capture to avoid any browser-font limitation on the Unicode characters that are used in this form of BNF.

f18xx019-2018-01-29-2134-ob-exp-bnf

Update 2018-01-29: corrected typos caught by @rnd0101 at Question #14

The complete formal definition of the grammar along with specification of the evaluation is maintained at ob-exp.txt.

rnd0101 commented 6 years ago

Wow! Now that there is a grammar and interpretation, writing oFrugal shell is a matter of expresing grammar for some lexer/parser. I will probably try Antlr4.2 as it support operator associations, so there will be no need to fight with left recursion. The (quicker?) library parsimonious will need some hacks to avoid left-recursion, and I prefer when grammar is represented as directly as possible because then it will be easier to change together with ob-exp if such necessity arises.

Please, also note one question above on how the grammar can accommodate shell commands more naturally (include, ob =, and other custom commands, eg, I have graph and solve).

orcmid commented 6 years ago

@rnd0101 Please, also note one question above on how the grammar can accommodate shell commands more naturally (include, ob =, and other custom commands, eg, I have graph and solve).

I had preferred to just use an informal grammar for statements at this point. I can see how to provide a formal interpretation of a session, that is, for all well-formed inputs through a round of REPL inputs from start-up of oFrugal to ending, but it is getting a bit far ahead.

For now, I think the only statements we need are three:

include "file-reference";

( reads the file as if entered at the keyboard. The bindings are updated as appropriate )

ob binding-name = ob-exp ;

( evaluates ob-exp in the current bindings context and retains that ob as bound to the binding-name )

ob-exp ;

( simply evaluates the ob-exp and reports the ob that the evaluation yields. )

The command line initiation of the oFrugal REPL can have file-references to start with.

There is a hash-bang convention:

! oFrugal semantic-version optional-options

Also, there are no reserved words. So the ob binding-definition needs to depend on phrase up to and including the "=".

Those are my only thoughts at the moment.

rnd0101 commented 6 years ago

ok. What about opening up a good way for "vendor-specific" commands? In my implementation there are some interactive options etc and it could be good not to conflict with lindies.

One way can be: :command <arguments>

(I think, include can have the same syntax for consistency)

Added new grammar using Lark parser: https://github.com/rnd0101/miser/blob/master/oMiser/mockups/python/ofrugal_grammar.py , but not yet connected to shell as interpretation part not ready yet.

orcmid commented 6 years ago

@rnd0101 ok. What about opening up a good way for "vendor-specific" commands? One way can be: :command <arguments>

I don't think I have any problem with that. I want include to be part of the standard notation, because it can occur in the included files also.

If you want your implementation-specific commands to be in included files as well as direct REPL statements, I suggest you use the vendor-specific flavor for those.

I suspect ":" at the beginning of the line should be fine. I don't expect that any term will begin with that. On the other hand, a "::" could end up at the beginning of a line of a multi-line expression. Watch out for that, or choose a different command-line trigger character.