usethesource / rascal

The implementation of the Rascal meta-programming language (including interpreter, type checker, parser generator, compiler and JVM based run-time system)
http://www.rascal-mpl.org

Improve support for optional non terminals #625

Closed DavyLandman closed 10 years ago

DavyLandman commented 10 years ago

The current mantra is not to use implode anymore, and I'm with you guys on this journey.

However, concrete syntax hurts when it comes to optional non-terminals.

Take this "grammar":

lexical AB = "a" B? b;
lexical B = "b";

now it is very hard to check if b was "parsed" or not.

rascal>t = parse(#AB, "a");
sort("AB"): `a`
Tree: appl(prod(lex("AB"),[lit("a"),label("b",opt(lex("B")))],{}),[appl(prod(lit("a"),[\char-class([range(97,97)])],{}),[char(97)]),appl(regular(opt(lex("B"))),[])[@loc=|file://-|(1,0,<1,1>,<1,1>)]])[@loc=|file://-|(0,1,<1,0>,<1,1>)]
rascal>t has b
bool: true

rascal>(B)`b` := t.b
|stdin:///|(10,1,<1,10>,<1,11>): Expected lex("B"), but got opt(lex("B"))

rascal>(AB)`ab` := t
bool: false

or let's try to match B:

rascal>t = parse(#AB,"ab");
sort("AB"): `ab`
Tree: appl(prod(lex("AB"),[lit("a"),label("b",opt(lex("B")))],{}),[appl(prod(lit("a"),[\char-class([range(97,97)])],{}),[char(97)]),appl(regular(opt(lex("B"))),[appl(prod(lex("B"),[lit("b")],{}),[appl(prod(lit("b"),[\char-class([range(98,98)])],{}),[char(98)])])[@loc=|file://-|(1,1,<1,1>,<1,2>)]])[@loc=|file://-|(1,1,<1,1>,<1,2>)]])[@loc=|file://-|(0,2,<1,0>,<1,2>)]

rascal>(AB)`ab` := t
bool: true

rascal>(B)`b` := t.b
|stdin:///|(10,1,<1,10>,<1,11>): Expected lex("B"), but got opt(lex("B"))

rascal>(B?)`b` := t.b
null: Syntax error: concrete syntax fragment
☞ Advice

So I think we need to make sure has looks into the tree instead of the production. As it stands, you end up writing out all the variants in concrete syntax ((AB)`ab` := t and (AB)`a` := t).

jurgenvinju commented 10 years ago

DavyLandman commented 10 years ago

t.b? always returns true.

rascal>parse(#AB,"a").b?
bool: true

rascal>parse(#AB,"ab").b?
bool: true

DavyLandman commented 10 years ago

Or, with "should", did you mean: in the future?

jurgenvinju commented 10 years ago

yes, in the future!

tvdstorm commented 10 years ago

This is correct behavior: the optional is always there, even if B isn't.

jurgenvinju commented 10 years ago

The idea was to specialize the ? behavior to work on parse trees and return false if the B isn't there, even though the B? is. Good idea or not? For now there is no way to find out except pattern matching explicitly on the empty case and on the non-empty ones. The constructor names for the empty and non-empty rule (present/absent) do not work because of the flattening of regular expressions. Instead of ? we could also support x.b is present, but this will require some explanation (where does "present" suddenly come from?).

tvdstorm commented 10 years ago

Bad idea. It's not consistent.

jurgenvinju commented 10 years ago

I see. What about this:

syntax A = "a" B b ? "c";

Now here b is really there optionally, and given A a = ...; the test a.b? should not be inconsistent, right?

mahills commented 10 years ago

I agree with @tvdstorm that this would be inconsistent -- it is correct that b is always present as a field, and I assume this is also true of any underlying nonterminal constructor which would be created (or we would need two different underlying productions, one with and one without the value). Is there a reason we can't just treat this as an option type? If we could consistently say something like t.b is some or t.b is none this would give us the ability to query it without needing an explicit match.

tvdstorm commented 10 years ago

No. The optional is now unlabeled.

B b? a

Could work: t.a.b?

But this is ugly.

jurgenvinju commented 10 years ago

So far magic constructors of the Maybe kind seem most preferable. Autocomplete may help.

tvdstorm commented 10 years ago

Another option inspired by implode :-)

Interpret optionals as enumerable, just like * and +, then you can do:

if (B b <- pt.b) {
  ...
}
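
A fuller sketch of this proposal, reusing the AB grammar from the top of the thread. It assumes optionals are enumerable exactly as proposed here; the module name and the describe function are made up for illustration.

```rascal
module OptionalEnum

import ParseTree;

lexical AB = "a" B? b;
lexical B = "b";

// Hypothetical helper. Under the proposal, the generator yields one B when
// the optional matched and nothing otherwise, so the if-branch doubles as
// a presence test and binds the matched subtree at the same time.
str describe(AB pt) {
  if (B b <- pt.b) {
    return "found <b>";
  }
  return "no B";
}
```

Under that assumption, describe(parse(#AB, "ab")) would take the first branch and describe(parse(#AB, "a")) the second, without writing out each variant in concrete syntax.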

jurgenvinju commented 10 years ago

makes sense

PaulKlint commented 10 years ago

Since B? is equivalent to B | epsilon, why not introduce an empty predicate on non-terminals and write tests like empty(t.b) and !empty(t.b)?

jurgenvinju commented 10 years ago

I think this can be done! Does it work?

bool present(&T? opt) = appl(regular(_), []) !:= opt;

DavyLandman commented 10 years ago

Nope, that doesn't work:

bool present(&T? opt) = appl(regular(_), []) !:= opt;
java.lang.RuntimeException: Symbol has unknown type: AST debug info: org.rascalmpl.ast.Sym$Parameter at |prompt:///|(13,2,<1,13>,<1,15>)(internal error)    at $shell$(|main://$shell$|)
java.lang.RuntimeException: Symbol has unknown type: AST debug info: org.rascalmpl.ast.Sym$Parameter at |prompt:///|(13,2,<1,13>,<1,15>)
    at org.rascalmpl.interpreter.utils.Symbols.symbolAST2SymbolConstructor(Symbols.java:180)
    at org.rascalmpl.interpreter.utils.Symbols.symbolAST2SymbolConstructor(Symbols.java:126)
    at org.rascalmpl.interpreter.utils.Symbols.typeToSymbol(Symbols.java:54)
    at org.rascalmpl.interpreter.types.NonTerminalType.<init>(NonTerminalType.java:52)
    at org.rascalmpl.interpreter.types.RascalTypeFactory.nonTerminalType(RascalTypeFactory.java:42)
    at org.rascalmpl.semantics.dynamic.Type$Symbol.typeOf(Type.java:114)
    at org.rascalmpl.semantics.dynamic.Expression$TypedVariable.typeOf(Expression.java:2731)
    at org.rascalmpl.semantics.dynamic.Formals$Default.typeOf(Formals.java:41)
    at org.rascalmpl.semantics.dynamic.Parameters$Default.typeOf(Parameters.java:34)
    at org.rascalmpl.semantics.dynamic.Signature$NoThrows.typeOf(Signature.java:56)
    at org.rascalmpl.interpreter.result.RascalFunction.<init>(RascalFunction.java:86)
    at org.rascalmpl.semantics.dynamic.FunctionDeclaration$Expression.interpret(FunctionDeclaration.java:136)
    at org.rascalmpl.semantics.dynamic.Declaration$Function.interpret(Declaration.java:133)
    at org.rascalmpl.semantics.dynamic.Command$Declaration.interpret(Command.java:37)
    at org.rascalmpl.interpreter.Evaluator.eval(Evaluator.java:1152)
    at org.rascalmpl.interpreter.Evaluator.eval(Evaluator.java:1021)
    at org.rascalmpl.interpreter.Evaluator.eval(Evaluator.java:975)
    at org.rascalmpl.shell.RascalShell.handleInput(RascalShell.java:145)
    at org.rascalmpl.shell.RascalShell.run(RascalShell.java:116)
    at org.rascalmpl.shell.RascalShell.main(RascalShell.java:188)
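
The trace points at the parameterized type &T? in the signature rather than at the pattern itself. A possible variation, untested here, is to keep the same pattern but accept a plain Tree. This is only a sketch: the module name, the isPresent name and the two test names are made up, and it assumes a value of type B? can be passed where a Tree is expected.

```rascal
module OptionalPresent

import ParseTree;

lexical AB = "a" B? b;
lexical B = "b";

// Same pattern as in the attempt above, with the parameter typed as Tree
// instead of &T?. Note that the wildcard in regular(_) also matches other
// regular symbols with no children, such as an empty B* list.
bool isPresent(Tree opt) = appl(regular(_), []) !:= opt;

// Intended behaviour, expressed as Rascal test functions.
test bool absentB() = !isPresent(parse(#AB, "a").b);
test bool presentB() = isPresent(parse(#AB, "ab").b);
```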

DavyLandman commented 10 years ago

Together with @tvdstorm, we have implemented support for iterating over the opt node:

syntax AB = "A"? aatje "B";
t = parse(#AB, "B");
t2 = parse(#AB, "AB");

now the magic:

rascal>_ <- t.aatje
bool: false

rascal>_ <- t2.aatje
bool: true
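
The same check can be wrapped in a small predicate so that call sites read like the t.b? test discussed earlier in the thread. A minimal sketch, assuming the enumeration support described in this comment; the module name, hasA and the test names are hypothetical.

```rascal
module OptionalDemo

import ParseTree;

// Grammar from this comment: an optional literal labeled aatje.
syntax AB = "A"? aatje "B";

// Hypothetical predicate: succeeds only when the optional actually matched,
// because the generator has nothing to enumerate for an empty opt node.
bool hasA(AB t) = _ <- t.aatje;

// These mirror the two REPL results shown in this comment.
test bool withoutA() = !hasA(parse(#AB, "B"));
test bool withA() = hasA(parse(#AB, "AB"));
```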

mahills commented 9 years ago

This is now supported by the type checker as well.

PaulKlint commented 9 years ago

The compiler does not yet support this.