kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Context objects during parsing #326

Closed dashie closed 7 years ago

dashie commented 7 years ago

I'm using nearley to parse VERY LARGE files (gigabytes in size), so I don't want to construct the whole tree. I just want to parse the input and evaluate and import my data chunk by chunk.

tjvr commented 7 years ago

I’m afraid I don’t understand what you mean by “chunk by chunk eval”.

Unless there are natural break points in your files that you can detect before parsing (e.g. a newline token separating unrelated parses), I don’t see how you would “split up” parsing.

You almost certainly want to use a tokenizer (moo) for this. You might be better off not using nearley at all, depending on what you’re trying to parse, and how complicated the syntax is.

I’m not sure what you mean about context, either :-) Perhaps you should describe your use case in more detail?


dashie commented 7 years ago

Imagine a situation like this:

INSERT INTO table (id, name, phone, date) VALUES
  (1, 'Phil', '444444444', NULL),
  (2, 'John', '333333333', NULL),
  (3, 'Anne', '777777777', NULL),
  ...
  10.000 rows
  ...
  (99999, 'Zoe', '111111111', NULL);

I don't want to parse my data into a tree. Instead, I want to create a processor based on a grammar that fires an event and calls a custom callback, for example on every VALUES row, passing some custom data as context, such as the information parsed earlier from the INSERT INTO line.

There are many other situations where I need to use a grammar to create a processor that reacts to the text rather than simply constructing a tree.

I have built processors like that many times using JavaCC. Can I do the same with nearley? Maybe it's my fault for not knowing the library very well.
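For comparison, the kind of event-driven processing described above can be sketched by hand for this one statement shape, without nearley and without building a tree. Everything in the sketch is hypothetical: `createInsertProcessor`, `feed`, and the `onRow` callback are illustrative names, and it assumes a single INSERT statement whose values contain no commas or parentheses inside strings.

```javascript
// Hypothetical helper (not part of nearley): a hand-written streaming
// processor for the single INSERT statement shown above.
function createInsertProcessor(onRow) {
  let buffer = "";    // unconsumed input carried over between chunks
  let context = null; // { table, columns }, set once the header is complete

  // Assumes simple identifiers and no parentheses or commas inside values.
  const headerRe = /INSERT INTO\s+(\w+)\s*\(([^)]*)\)\s*VALUES/;
  const rowRe = /\s*\(([^()]*)\)\s*[,;]/y; // sticky: matches only at lastIndex

  function parseValue(text) {
    const v = text.trim();
    if (v === "NULL") return null;
    if (v.startsWith("'") && v.endsWith("'")) return v.slice(1, -1);
    return Number(v);
  }

  return {
    feed(chunk) {
      buffer += chunk;
      if (context === null) {
        const header = headerRe.exec(buffer);
        if (header === null) return; // header still incomplete
        context = {
          table: header[1],
          columns: header[2].split(",").map((c) => c.trim()),
        };
        buffer = buffer.slice(header.index + header[0].length);
      }
      let pos = 0;
      for (;;) {
        rowRe.lastIndex = pos;
        const row = rowRe.exec(buffer);
        if (row === null) break; // next row is incomplete; wait for more input
        onRow(row[1].split(",").map(parseValue), context);
        pos = rowRe.lastIndex;
      }
      buffer = buffer.slice(pos); // drop consumed rows, keep the partial tail
    },
  };
}

// Usage: chunks may split the header or a row at any position.
const rows = [];
const processor = createInsertProcessor((values, ctx) => {
  rows.push({ table: ctx.table, values });
});
processor.feed("INSERT INTO table (id, name, ph");
processor.feed("one, date) VALUES\n  (1, 'Phil', '444444444', NULL),\n  (2, 'Jo");
processor.feed("hn', '333333333', NULL);\n");
console.log(rows);
```

Only the incomplete tail of the input is kept in memory, so the working set stays small regardless of file size. The trade-off is that this handles one fixed statement shape, not a general grammar.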

dashie commented 7 years ago

Another example, about the possibility of using a context object in a postprocessor: I would like to attach a custom object to the parser, like this:

const parser = new nearley.Parser(myGrammar);
parser.ctx = myCustomObject;

and to be able to access this object in the postprocessor:

function postprocessor (d,pos,err,ctx) {
    ...
    ctx.doSomething(d);
}

In doSomething I could, for example, start importing data into the database, or do other work.
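One way to get a similar effect without changing the postprocessor signature might be to let the postprocessors close over the context object. The sketch below is an assumption-heavy illustration: `makeGrammar` is a hypothetical factory, and the object it returns is hand-written in what is assumed to be the shape of a compiled nearley grammar (`Lexer`, `ParserRules`, `ParserStart`), with no lexer so each character is a token. Because postprocessors may run for partial parses that are later discarded, side effects such as database writes could fire for rows that never make it into the final result.

```javascript
// Hypothetical factory: every postprocessor closes over `ctx`, so no
// fourth argument is needed. The grammar shape is an assumption about
// what the nearley compiler emits, written by hand for illustration.
function makeGrammar(ctx) {
  return {
    Lexer: undefined,
    ParserRules: [
      {
        name: "main",
        // Without a lexer, the string "foo" is matched one character at a time.
        symbols: [{ literal: "f" }, { literal: "o" }, { literal: "o" }],
        postprocess: (d) => {
          const text = d.join("");
          ctx.doSomething(text); // side effect on the shared context
          return text;
        },
      },
    ],
    ParserStart: "main",
  };
}

const seen = [];
const compiled = makeGrammar({ doSomething: (text) => seen.push(text) });

// With nearley installed, this would presumably be wired up as:
//   const parser = new nearley.Parser(nearley.Grammar.fromCompiled(compiled));
//   parser.feed("foo");
// Here the postprocessor is called directly to show the closure at work.
const result = compiled.ParserRules[0].postprocess(["f", "o", "o"]);
console.log(result, seen);
```

A new grammar object is built per context, so two parsers with different contexts do not share state.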

dashie commented 7 years ago

Here is an update with an example. Look at this branch and try:

node bin/nearley-test.js -q examples/js/events-stream.js < examples/events-stream.data

you can see that:

tjvr commented 7 years ago

I’m sorry, but I don’t think you can use Nearley to do what you want. Nearley uses the Earley algorithm, which brings many of the benefits listed in the readme; unfortunately for your use case, Earley is a top-down parser, not a bottom-up one.

If you’re interested, you might find hardmath123’s blog post introducing the algorithm interesting: https://hardmath123.github.io/earley.html


volkanunsal commented 5 years ago

@dashie We had a similar use case, and this patch worked for us. I haven't tested it with your use case, though. Yours seems more complicated than ours.

// nearley.js:82
State.prototype.finish = function() {
  if (this.rule.postprocess) {
    // Pass context as the 4th argument
    this.data = this.rule.postprocess(this.data, this.reference, Parser.fail, this.context);
  }
};

// grammar.ne
main -> "foo" {% (d, _1, _2, c) => c.addVertex(d) %}

// nearley.js:313
var next = state.nextState({
  data: value,
  token: token,
  isToken: true,
  reference: n - 1,
  // Add context to the state
  context: this.options.context || {}
});

// Usage
const g = nearley.Grammar.fromCompiled(grammar);
const context = { addVertex: (a) => { ... } };
// Pass the context in the parser options.
const opts = { context };
const parser = new nearley.Parser(g, opts);
parser.feed("foo");