Closed nwhetsell closed 7 years ago
@$
is the yylloc info result of the rule, similar to $$
being the value result.
Previously (0.4.18-xxx) you got away with this in most circumstances because 0.4.18 run-time always performed the old 'default actions':
$$ = $1;
@$ = @1;
before executing any rule action code chunk, so that $$
and @$
acted as identical copies of $1
and @1
. v0.6.0-XXX doesn't do this any more (for run-time performance reasons) and only injects default actions when there's no user code at all (which is more like the bison behaviour, incidentally).
The correct way to reference any rule production term, is by numeric or named reference, e.g.
character: 'CHARACTER' { console.log(@1); }
;
but I never use the quoted token names as that is only extra overhead cost, hence your example would be written as (showcasing 'named reference' for CHARACTER, by the way):
%lex
%%
. return 'CHARACTER';
/lex
%start characters
// %token CHARACTER <-- is *implicit* due to the way I wrote the character rule below
%%
characters
: character
| characters character
;
character: CHARACTER
{ console.log(@CHARACTER); }
;
The next snippet is a self-contained executable, if you compile it with jison --main
(see jison --help
):
%lex
%%
. return 'CHARACTER';
/lex
%start characters
// %token CHARACTER <-- is *implicit* due to the way I wrote the character rule below
%%
characters
: character
| characters character
;
character: CHARACTER
{ console.log(@CHARACTER); } // or use `@1` to ref the 1st term in the rule production
;
%%
// compile with `jison --main test.jison`
// then execute this `main()` with `node test.js`
parser.main = function () {
parser.parse('abc');
};
BTW:
@$
is useful when you construct your own location reference for the rule, e.g.
rule: a b c d e f
{ @$ = yylexer.mergeLocationInfo(##a, ##f); $$ = $a + $b + $c + $d + $e + $f; }
##
is a jison-specific advanced use prefix: it transforms the production term reference name/number into the index number for the parse stack(s) yyvstack
(value stack, i.e. the array carrying all the $a/$b/$c/...
values), yylstack
(location infos @a,@b,@c,...
), yysstack
(parse state stack, in case you want to use these internals), etc.
$a
/ $1
produces the value of the term (for a terminal, this is the value returned by the lexer in yytext
in the matching lexer rule)
@a
/ @1
produces the location info (which MAY be NULL if you're not careful: this is related to the default action code injection and worth a chapter or at least a book section in itself 😉 )
#a
/ #1
(jison-specific, not available in bison/yacc/etc.) produces the token number for this term (handy when you have an external token definition set and have %import
-ed that one into the lexer and the parser: this way you have one numeric value representing the same (terminal) token everywhere in your app code)
##a
/ ##1
(jison-specific, not available in bison/yacc/etc.) produces the stack index number as mentioned above: this is handy when you're doing advanced stuff that needs full disclosure of the parser internals, particularly all parse stacks.
I actually am trying to access the location of the rule result, the “left hand side grouping” described in http://www.gnu.org/software/bison/manual/html_node/Actions-and-Locations.html#Actions-and-Locations. From that page:
…there is a default action for locations that is run each time a rule is matched. It sets the beginning of
@$
to the beginning of the first symbol, and the end of@$
to the end of the last symbol.
I’m not sure that @$
was the same as @1
in Jison v0.4.18-186 and earlier. For example, here is a test of a Jison grammar created with Jison v0.4.18-184:
The part of the test of the object created from new parser.Instrument
(https://github.com/nwhetsell/linter-csound/blob/0494a0650bc93cbdd8a656239dd4376d42844754/lib/csound-parser/spec/orchestra-parser-spec.js#L482-L505) checks how this part of the Jison grammar—
—parses this text (from https://github.com/nwhetsell/linter-csound/blob/0494a0650bc93cbdd8a656239dd4376d42844754/lib/csound-parser/spec/orchestra-parser-spec.js#L457-L459)
instr A440
outc oscili(0.5 * 0dbfs, 440, giFunctionTableID)
endin
The first argument of the parser.Instrument
constructor is @$
. The test shows that the value of @$
in Jison v0.4.18-184 covered several lines starting at the instr
token and ending at the endin
token. If @$
was the same as @1
, it would’ve been on one line, and only covered the instr
token.
Is there any way to restore the default action for locations — that is, to bring back the old meaning of @$
— in Jison v0.6.0? Or do I need to pass the first and last token locations to my AST node constructors to get location information?
Nothing is set in stone. :wink:
Given your need (and mine, which is a bit different), I am thinking about what would be the best way to tackle this. The main reason to simplify/remove the default actions (particularly when you also have user actions for the given rule production) is speed: most grammars I know and use don't employ this behaviour but 'most' is certainly not 'everyone'.
The/my goal (fast parsers and lexers) is tackled through 'code analysis' and resulting selective code strippin/rewriting, which is now done using straight text manipulation via regexes and .replace() calls, remains as-is, this would imply that I need to have some sort of user-configurable 'switch' where I can simply select preferred behaviour in a few aspects. Maybe something like
%options default-action='<setting>'
where setting
can be one of the following: "none", "bison", "ast".
Or slightly more specific, as there's default value actions and default location actions:
%options default-action='<valuesetting>,<locationsetting>'
// valuesetting: none, bison, ast --> $$=undefined, $$=$1, $$=[$1....$n]
// locationsetting: none, bison --> @$=undefined, @$=merge(@1..@n)
Anyway, that's just pondering how this might be made to look at the jison usage level. Making it happen is another story: I'm looking into migrating the code analysis and rewriting/stripping to a recast/jscodeshift-based process, but that is non-trivial, so will take some time.
The quick fix for you would be to recover (most of) the old behaviour but that would negate all my performance gains through simplified code as most of my grammars don't use these default actions, thus they are surplus=overhead for me. Let me think about it for a bit, maybe I come up with a decent 'quick fix' overnight.
Is there an option I can use to get this code to run? That looks like it should restore the default action.
Not yet, except of course writing that code every explicit action, which is a bother and doesn't help produce readable grammars.
The mistake I made is bundling $$
and @$
treatment together (see for related material: https://github.com/zaach/jison/issues/343): right now in build 188/189, when you have an explicit action for a rule, it is detected and ALL default actions are not executed, while the only breaking change compared to https://github.com/zaach/jison/issues/343 which should have been done is kill the $$=$1
default action.
Anyway, I'm working on this right now to provide a better (and when the user wants, backwards compatible) behaviour re default action injection in the generated grammar.
release 0.6.0-191 has reverted to default 'classic' default behaviour ($$ = $1; @$ = merged;
) and supports other default action modes (see jison --help
for more on the new --default-action
option).
This should fix this issue.
$$
and @$
use in jison grammar actions since release 0.6.0-191)Also note that jison now has better/tighter user-written action code analysis to help ensure a proper rule value result $$
and location result @$
are produced:
when the user action code is absent OR does not set any of these ($$
and @$
), then default action code will be injected for every action block = every rule production which doesn't have these assignments.
In other words: if your code doesn't provide a $$
and/or @$
value, then jison will. (Unless your default mode has been configured as skip
for the one(s) missing.)
If the action code analyzer determines that you happen to access (read) the $$
and/or @$
values before assigning a value to them in the same action block, then the relevant default action will be injected before your action code, again depending on selected --default-action
mode.
In other words: if your code doesn't provide a $$
and/or @$
value to work with, then jison will. (Unless your default mode has been configured as skip
for the one(s) missing.)
Note that from now on $$
is fundamentally different from $0
and happens to be more in line with the way yacc/bison treat(ed) this.
Also do note that read-accessing $$
and @$
before writing (assigning a value) to it in an action code block is very peculiar coding behaviour and certainly non-portable, not even in the yacc/bison world outside jison! (The same goes for $0
references, which are discouraged action code behaviour even though it is documented in a corner of the bison documentation. 😉 )
If the action code analyzer determines that you happen to write-access (assign a value) to $$
and/or @$
values before you happen to read-access the same variable in the same action code block (which makes sense if your action code is performing non-trivial operations), jison WILL NOT inject the default action for that action as it would be unused anyway. Hence the analyzer attempts to produce the best performant parser code.
Note that the analyzer is currently still a rather 'dumb' analyzer in that it analyzes your action code using regexes rather than a full JavaScript parse (and AST scan), hence "wicked" action code like this will confuse the analyzer into not injecting default actions before the user-written action code:
function nasty(z) {
$$ = z; // this is analyzed as 'assignment before use' despite actual code flow!
}
var v = $$; // actual code flow: read before write!
nasty(v + $1);
@nwhetsell: ping?
Jison should be backwards compatible again since build 0.6.0-191; see also the description of treatment of $$
(and @$
) in the comment above in this issue.
Yep, this appears to be fixed, thank you!
I’m not sure if I’m doing something wrong, but result locations
@$
seem to be alwaysundefined
in v0.6.0-188. If I runon macOS 10.12.6, the output is