How to extract values from nested EBNF clauses?

zaach / jison

Bison in JavaScript.

http://jison.org


How to extract values from nested EBNF clauses? #233

Open RubenVerborgh opened 10 years ago

RubenVerborgh commented 10 years ago

Given a rule such as

PropertyListNotEmpty
    : Verb ObjectList ( ';' ( Verb ObjectList )? )* ->  [$1, $2, $3] ;

$1 has the value of the first Verb
$2 has the value of the first ObjectList
$3 has the value [";", ";", …]

How can I get the values of the second (repeated) Verb and ObjectList captures?

ghost commented 10 years ago

I've asked myself the same thing but unfortunately couldn't find an answer. For now I'm using additional rules for the nested parts, but I hope support for this will come soon: the additional rules bloat the grammar file, and big rules take a while to read through.

ComFreek commented 10 years ago

Some months ago, I got the same thing to work by using the nested arrays provided by $1, $2, $3, $4, and so on.

ghost commented 10 years ago

@ComFreek would it be possible to share a piece of code that shows how you accomplished it? I've played around with the same idea but it didn't work because $1/2/3/4... are just arrays containing one string, as stated in the initial example.

ComFreek commented 10 years ago

@lusbuab I can't find a copy of the code on my hard drive right now. I've just tried to make up a contrived example:

/* lexical grammar */
%lex
%%

[0-9]                return 'DIGIT'
"."                  return '.'
<<EOF>>              return 'EOF'
.                    return 'INVALID'

/lex

%start expressions

%% /* language grammar */

expressions
        : number EOF
                {return $1;}
        ;

number
        : (digit)+
                {
                        console.log($1, Array.isArray($1));
                        $$ = parseInt($1.join(""), 10);
                }
        ;

digit
        : 'DIGIT'
                {$$ = parseInt($1, 10);}
        ;

Funnily enough, $1 does not evaluate to an array (see the console.log output). Has Jison been changed somehow? If I remember correctly, such quantifying expressions evaluate to arrays.

A new problem probably doesn't help you, sorry ;)

RubenVerborgh commented 10 years ago

The example you give is not part of the original issue though. (digit)+ has always been accessible as the array $1. It is groups like (a b c)+ that cause trouble: $1 only gives access to an array of the a values, and the b and c values are lost.

besquared commented 9 years ago

+1. What's the right thing to do here? $3 should presumably be an array of the entire group and not just of the first token in the group, yes?

RubenVerborgh commented 9 years ago

Exactly!

besquared commented 9 years ago

I actually found a decent way to do this. It won't make perfect sense since I'm using a JSON grammar with a rule/action wrapper for my alternatives but I think you can see how it might work.

https://gist.github.com/besquared/11751660c97962d2ee55

RubenVerborgh commented 9 years ago

Sure, decomposing the whole thing is possible, that's what I did as a workaround. But it unnecessarily complicates rules, and creates many more of them, blowing up parser size and slowing down processing speed.

The code should really be fixed so that all members of groups of tokens can be accessed.
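
For reference, the decomposition workaround mentioned in this thread might look roughly like the sketch below. It is written in Jison's JSON grammar form and replaces the nested EBNF group with an explicit left-recursive helper rule, so every repeated Verb/ObjectList pair is collected. The names Document and PropertyListTail, the WORD token, and the toy Verb and ObjectList rules are assumptions made up for illustration; they are not taken from the original grammar.

```javascript
// Minimal sketch (not the original grammar): the EBNF rule
//   Verb ObjectList ( ';' ( Verb ObjectList )? )*
// rewritten as plain BNF so that each repetition has its own action.
// "Document", "PropertyListTail" and the WORD token are hypothetical names.
const grammar = {
  lex: {
    rules: [
      ["\\s+", "/* skip whitespace */"],
      ["[A-Za-z]+", "return 'WORD';"],
      [";", "return ';';"],
      ["$", "return 'EOF';"],
    ],
  },
  bnf: {
    // Listed first on the assumption that it is used as the start symbol.
    Document: [["PropertyListNotEmpty EOF", "return $1;"]],

    // The first pair, followed by whatever pairs the tail collected.
    PropertyListNotEmpty: [
      ["Verb ObjectList PropertyListTail", "$$ = [[$1, $2]].concat($3);"],
    ],

    // Stands in for ( ';' ( Verb ObjectList )? )*
    PropertyListTail: [
      // No repetition at all: start with an empty list of pairs.
      ["", "$$ = [];"],
      // A lone ';' contributes no pair.
      ["PropertyListTail ;", "$$ = $1;"],
      // ';' followed by a pair appends [Verb, ObjectList].
      ["PropertyListTail ; Verb ObjectList", "$$ = $1.concat([[$3, $4]]);"],
    ],

    // Toy stand-ins: in a real grammar these would be full rules.
    Verb: [["WORD", "$$ = $1;"]],
    ObjectList: [["WORD", "$$ = [$1];"]],
  },
};
```

Assuming the jison package is installed, the object can be handed to `new (require("jison").Parser)(grammar)` and the result used via `parser.parse(input)`. This is exactly the kind of extra rule the thread complains about: it works, but it adds one more rule for each nested group.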