kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Output is just a nested array of characters #645

Open cocosbeans opened 5 months ago

cocosbeans commented 5 months ago

I'm not the sharpest tool in the shed, and I just picked up Nearley.
Here's my grammar.ne:

@builtin "whitespace.ne"
@builtin "string.ne"
@builtin "number.ne"

identifier -> dstrchar:+
type -> "strl"
      | "str" "[]":?
      | "int" "[]":?
      | "bool" "[]":?

bool -> "true" | "false"
value -> int | dqstring | bool

assign -> type __ identifier _ "=" _ value _ ";":?

Here's my index.js:

const nearley = require('nearley')
const grammar = require('./grammar.js')
const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
parser.feed("int x = 5;")
console.log(parser.results)

And, for some reason, here's my output:

[
    [
        [
            'i', 'n', 't', ' ',
            'x', ' ', '=', ' ',
            '5', ';'
        ]
    ]
]

What's the issue? This is just a quick test, by the way; I haven't decided what I want to start working on.
Edit: I made a typo in the Issue.

TekuConcept commented 3 months ago

This is by design. Each character is treated as its own "token," so without a postprocessor you get an array of the matched tokens in sequence. To concatenate these tokens into a string, you can attach an embedded JavaScript postprocessor to the rule. For example:

# This will match an identifier such as `fn_foo2bar`
IDENTIFIER -> [a-zA-Z_] [a-zA-Z0-9_]:* {%
    d => ({
        type: 'identifier',
        value: d[0] + d[1].join('')
    })
%}
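
To see what that postprocessor does on its own, here is a minimal sketch in plain JavaScript that runs without nearley. It assumes the rule hands the function an array `d` with one entry per symbol: the first character as a string, then an array of the remaining characters from the `:*` part. The sample `d` is hand-written for illustration, not captured parser output, and `toIdentifier` is a hypothetical name.

```javascript
// Same function body as the postprocessor in the IDENTIFIER rule above.
// `toIdentifier` is a hypothetical name used only for this sketch.
const toIdentifier = d => ({
    type: 'identifier',
    // d[0] is the first character, d[1] is the array of remaining characters
    value: d[0] + d[1].join('')
});

// Hand-written stand-in (an assumption) for what the rule would pass in
// after matching the text "fn_foo2bar".
const d = ['f', ['n', '_', 'f', 'o', 'o', '2', 'b', 'a', 'r']];

const node = toIdentifier(d);
console.log(node); // { type: 'identifier', value: 'fn_foo2bar' }
```

The same `join('')` call applied to the flat character array in the question would give back the string `"int x = 5;"`.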