antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

Error with MatchWildcard and Array collecting - Go target #1610

Open millergarym opened 7 years ago

millergarym commented 7 years ago

There appear to be several issues with wildcard matches, and one with +=, in the Go language target. See the example grammars and associated errors below.

Wildcard and +=

Example 1a - Malformed grammar

grammar Ex;
e : A b+=.  ; // mismatch between += and . (my mistake: it should have been b+=.+)
A : 'a';

The generated Go code fails to compile, e.g.:

./ex_parser.go:85: duplicate method SetB
./ex_parser.go:88: duplicate method GetB
./ex_parser.go:129: cannot use s.b (type antlr.Token) as type []antlr.Token in return argument
./ex_parser.go:186: too many errors

Example 1b - Well formed grammar - capture wildcard

grammar Ex;
e : A b+=.+ ;
A : 'a';

Compile errors

# github.com/wxio/antlr4-go-examples/issues/parser
./ex_parser.go:87: duplicate method SetB
./ex_parser.go:90: duplicate method GetB
./ex_parser.go:131: (*EContext).GetB redeclared in this block
        previous declaration at ./ex_parser.go:127
./ex_parser.go:131: method redeclared: EContext.GetB
        method(*EContext) func() antlr.Token
        method(*EContext) func() []antlr.Token
./ex_parser.go:131: cannot use s.b (type antlr.Token) as type []antlr.Token in return argument
./ex_parser.go:133: (*EContext).SetB redeclared in this block
        previous declaration at ./ex_parser.go:129
./ex_parser.go:133: method redeclared: EContext.SetB
        method(*EContext) func(antlr.Token)
        method(*EContext) func([]antlr.Token)
./ex_parser.go:133: cannot use v (type []antlr.Token) as type antlr.Token in assignment:
        []antlr.Token does not implement antlr.Token (missing GetChannel method)
./ex_parser.go:160: cannot use NewEContext(p, p.BaseParser.GetParserRuleContext(), p.BaseParser.BaseRecognizer.GetState()) (type *EContext) as type IEContext in assignment:
        *EContext does not implement IEContext (wrong type for GetB method)
                have GetB() antlr.Token
                want GetB() []antlr.Token
./ex_parser.go:194: impossible type assertion:
        *EContext does not implement IEContext (wrong type for GetB method)
                have GetB() antlr.Token
                want GetB() []antlr.Token
./ex_parser.go:194: too many errors
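The paired "duplicate method" and "method redeclared" errors above follow from Go having no method overloading: a type may declare a given method name only once, so GetB and SetB cannot exist in both a single-token and a slice form. Below is a minimal, stdlib-only sketch of what a consistent slice-only accessor pair could look like. Token, textToken, EContext and AppendB here are hypothetical stand-ins for illustration, not the ANTLR runtime types and not the code the tool actually generates.

```go
package main

import "fmt"

// Token is a hypothetical stand-in for antlr.Token (not the real interface).
type Token interface {
	GetText() string
}

// textToken is a toy Token implementation used only in this sketch.
type textToken string

func (t textToken) GetText() string { return string(t) }

// EContext mimics a rule context for `e : A b+=.+ ;`.
// The list label b is stored once, as a slice.
type EContext struct {
	b []Token
}

// GetB and SetB are each declared exactly once, with the slice type.
// Adding a second `GetB() Token` on *EContext would be rejected by the
// Go compiler, which is the "duplicate method" error in the report.
func (s *EContext) GetB() []Token  { return s.b }
func (s *EContext) SetB(v []Token) { s.b = v }

// AppendB models what `b+=.` does for each matched token (assumed helper).
func (s *EContext) AppendB(t Token) { s.b = append(s.b, t) }

func main() {
	ctx := &EContext{}
	for _, text := range []string{"x", "y", "z"} {
		ctx.AppendB(textToken(text))
	}
	fmt.Println(len(ctx.GetB()), ctx.GetB()[2].GetText())
}
```

With one accessor per label, a type assertion against an interface that wants GetB() []Token would also succeed, which is the shape the later errors in the listing complain about.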

Multiple Wildcards in rule

Example 2 - Multiple Wildcards

grammar Ex;
e : A b=. 
  | A b=. c=.;
A : 'a';

Compile error

./ex_parser.go:209: _mwc redeclared in this block
        previous declaration at ./ex_parser.go:203
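The "redeclared in this block" error is a Go scoping rule: a name can be declared only once per block, so two wildcard matches in the same alternative cannot each declare their own _mwc. The sketch below shows one way to stay within that rule, by declaring the temporary once and assigning to it afterwards. matchWildcard and parseAlt2 are hypothetical names made up for this illustration; this is not the generator's actual output or its eventual fix.

```go
package main

import "fmt"

// matchWildcard is a hypothetical helper standing in for whatever the
// generated parser does when it consumes one token for `.`.
func matchWildcard(input []string, pos *int) string {
	tok := input[*pos]
	*pos++
	return tok
}

// parseAlt2 models the alternative `A b=. c=.`: two wildcards in one block.
func parseAlt2(input []string) (b, c string) {
	pos := 1 // skip the leading A

	// Declared once for the whole block. Writing `var mwc string` a second
	// time further down would fail with "mwc redeclared in this block".
	var mwc string

	mwc = matchWildcard(input, &pos)
	b = mwc

	mwc = matchWildcard(input, &pos) // plain assignment, no new declaration
	c = mwc
	return b, c
}

func main() {
	b, c := parseAlt2([]string{"a", "x", "y"})
	fmt.Println(b, c)
}
```

An alternative with the same effect would be to wrap each wildcard match in its own `{ ... }` block so each declaration gets a fresh scope.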

Referring to captured arrays

Example grammar - with error

grammar Ex;
e : A b+=e+ { $e };
A : 'a';

ANTLR error

error(67): Ex.g4:2:15: missing attribute access on rule reference e in $e

Example grammar - that works

grammar Ex;
e : A b+=e+ { localctx.(*EContext).GetB() };
A : 'a';

millergarym commented 7 years ago

Thx.

Is $e.ctx in the documentation? Does the same issue exist in other language targets?

On Wednesday, 18 January 2017, Sam Harwell notifications@github.com wrote:

Definitely some bugs here. Note that in your last case you should have used $e.ctx instead of just $e, which causes the error you saw.

sharwell commented 7 years ago

@millergarym It's hard to tell in your last example what reference you're trying to make. You can use $e.ctx to get the last element of b. You can use $ctx to reference the enclosing rule.

willfaught commented 7 years ago

Can you explain what the +=, +=.+, =. and +=e+ constructs mean? Sorry, I'm not very familiar with ANTLR grammar features and syntax yet, and an explanation here would let me look into this more quickly than poring over the documentation to work out what your examples mean.

millergarym commented 7 years ago

@sharwell As mentioned, this one works:

grammar Ex;
e : A b+=e+ { localctx.(*EContext).GetB() }; 
A : 'a';

I would have expected the following two statements to be equivalent:

e : A b+=e+ { localctx.(*EContext).GetB() }; 

and

e : A b+=e+ { $b }; 

The type of GetB() is []antlr.Token.

This would be consistent with the behaviour of e : A b=c { $b } ;.

@willfaught

+= captures tokens into an array.

e : a+=x a+=y a+=z ;

Here a would be a slice (array)

a+=x+ captures all xs.

Note that a=x+ is a common mistake (at least, one I have made a few times): only the last x is captured. It would be nice if this generated a warning, similar to the greedy warning for .*

. is a wildcard match. .+ is a greedy 1 or more match. .+? is a non-greedy one.

Hope this helps.
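To make the difference between = and += concrete, here is a small stdlib-only Go sketch of the two label behaviours described above. The labels type and the matchAll function are made up for illustration and are not part of the ANTLR runtime; they only model "overwrite on each match" versus "append on each match".

```go
package main

import "fmt"

// labels models the two kinds of label on a hypothetical rule context.
type labels struct {
	single string   // like `a=x+`: overwritten on every match
	list   []string // like `a+=x+`: appended to on every match
}

// matchAll feeds every matched token text to both label styles.
func matchAll(matches []string) labels {
	var l labels
	for _, m := range matches {
		l.single = m               // only the most recent match survives
		l.list = append(l.list, m) // every match is kept, in order
	}
	return l
}

func main() {
	l := matchAll([]string{"x1", "x2", "x3"})
	fmt.Println("single label:", l.single)
	fmt.Println("list label:  ", l.list)
}
```

After three matches the single label holds only the last one while the list label holds all three, which is why a=x+ silently loses the earlier matches.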

pboyer commented 7 years ago

@millergarym I didn't notice this earlier. Thank you for the detailed repro steps. I'll try and take a closer look soon.

omac777 commented 7 years ago

https://github.com/antlr/grammars-v4/blob/master/cpp/CPP14.g4

antlr4 -Dlanguage=Go CPP14.g4 -visitor

"./parser/" contains the antlr4 generated .go files.

I got errors similar to "./ex_parser.go:209: _mwc redeclared in this block previous declaration at ./ex_parser.go:203" when attempting to compile it with "go build myt3app.go".

myt3app.go:

package main

import (
    "github.com/antlr/antlr4/runtime/Go/antlr"
    "./parser"
    "os"
    "fmt"
)

type TreeShapeListener struct {
    *parser.BaseCPP14Listener
}

func NewTreeShapeListener() *TreeShapeListener {
    return new(TreeShapeListener)
}

func (this *TreeShapeListener) ExitTranslationunit(ctx *parser.TranslationunitContext) {
    fmt.Printf("ExitTranslationunit()...\n")
    fmt.Printf("ctx Text:<<%s>>\n", ctx.GetText())
    fmt.Printf("ctx:<<%q>>\n", ctx)
    fmt.Printf("GetChildren():<<%q>>\n\n", ctx.GetChildren())

    for i := 0; i < ctx.GetChildCount(); i++ {
        child := ctx.GetChild(i)
        parentR, bGetParent := child.GetParent().(antlr.RuleNode)
        if !bGetParent || parentR.GetBaseRuleContext() != ctx.GetBaseRuleContext() {
            //panic("Invalid parse tree shape detected.")
            fmt.Printf("invalid parse tree shape.\n")
            os.Exit(1)
        } else {
            fmt.Printf("valid parse tree shape.\n")
            fmt.Printf("\n")
            //sExpression := ctx.AllExpression().GetText(); 
            //sSymbol := ctx.AllExpression().GetSymbol();
            //sExpression := ctx.Expression(0).GetText(); 
            //sSymbol := ctx.Expression(0).GetSymbol();

            //fmt.Printf("ctx.GetLiteralNames():<<%q>>\n", ctx.GetLiteralNames())
            //fmt.Printf("ctx.GetSymbolicNames():<<%q>>\n", ctx.GetSymbolicNames())

            // String id = ctx.ID().getSymbol(); 
            // String value = ctx.STRING().getSymbol();
            //expression relop expression
        }
        //fmt.Printf("parentR.GetBaseRuleContext():<<%s>>\n", parentR.GetBaseRuleContext())
        //fmt.Printf("ctx.GetBaseRuleContext():<<%s>>\n", ctx.GetBaseRuleContext())
    }   
}

func (this *TreeShapeListener) EnterEveryRule(ctx antlr.ParserRuleContext) {
    fmt.Printf("EnterEveryRule()...\n")
    //fmt.Printf("ctx Text:<<%s>>\n", ctx.GetText())
    //fmt.Printf("ctx:<<%q>>\n", ctx)
    //doesn't work fmt.Printf("ToStringTree:<<%s>>\n", ctx.ToStringTree(parser.rulenames))
    //fmt.Printf("GetChildren():<<%q>>\n\n", ctx.GetChildren())

    for i := 0; i < ctx.GetChildCount(); i++ {
        child := ctx.GetChild(i)
        parentR, bGetParent := child.GetParent().(antlr.RuleNode)
        if !bGetParent || parentR.GetBaseRuleContext() != ctx.GetBaseRuleContext() {
            //panic("Invalid parse tree shape detected.")
            fmt.Printf("invalid parse tree shape.\n")
            os.Exit(1)
        }
        fmt.Printf("parentR.GetBaseRuleContext():<<%s>>\n", parentR.GetBaseRuleContext())
        fmt.Printf("ctx.GetBaseRuleContext():<<%s>>\n", ctx.GetBaseRuleContext())
    }   
}

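func main() {
    if len(os.Args) < 2 {
        fmt.Fprintln(os.Stderr, "usage: myt3app <source-file>")
        os.Exit(1)
    }
    // Fail early instead of discarding the error from NewFileStream.
    input, err := antlr.NewFileStream(os.Args[1])
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    lexer := parser.NewCPP14Lexer(input)
    stream := antlr.NewCommonTokenStream(lexer, 0)
    p := parser.NewCPP14Parser(stream)
    p.AddErrorListener(antlr.NewDiagnosticErrorListener(true))
    p.BuildParseTrees = true
    tree := p.Translationunit()
    antlr.ParseTreeWalkerDefault.Walk(NewTreeShapeListener(), tree)
}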