jonschlinkert / remarkable

Markdown parser, done right. Commonmark support, extensions, syntax plugins, high speed - all in one. Gulp and metalsmith plugins available. Used by Facebook, Docusaurus and many others! Use https://github.com/breakdance/breakdance for HTML-to-markdown conversion. Use https://github.com/jonschlinkert/markdown-toc to generate a table of contents.
https://jonschlinkert.github.io/remarkable/demo/
MIT License

chore: create AST #231

Open jonschlinkert opened 8 years ago

jonschlinkert commented 8 years ago

This is more of a reminder to myself, unless someone wants to do a PR. Currently, the parser is more of a lexer/tokenizer that returns a token stream. I can expose a method for creating an AST.

Something like:

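// Remarkable 1.x exports the constructor directly. On 2.x, use the named export:
// var Remarkable = require('remarkable').Remarkable;
var Remarkable = require('remarkable');

function ast(str, options) {
  var md = new Remarkable(options);
  // Pass an explicit env object: the core rules read and write state.env
  // (for example env.references), so omitting it can throw.
  var tokens = md.parse(str, {});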

  var node = {type: 'root', nodes: []};
  var nodes = [node];
  var stack = [];

  var len = tokens.length;
  var idx = -1;

  function last() {
    return stack.length ? stack[stack.length - 1] : nodes[nodes.length - 1];
  }

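  while (++idx < len) {
    var tok = tokens[idx];
    var prev = last(); // innermost open node, or the root when nothing is open

    if (isOpen(tok)) {
      // Start a new node; the *_open token is kept as its first child.
      var token = {type: toType(tok), nodes: [tok]};
      prev.nodes.push(token);
      stack.push(token);
    } else if (isClose(tok)) {
      // Close the innermost open node; the *_close token is its last child.
      var parent = stack.pop();
      parent.nodes.push(tok);
    } else {
      // Leaf token (e.g. inline): attach it to the current node as-is.
      prev.nodes.push(tok);
    }
  }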

  return node;
}

function isOpen(tok) {
  return /_open$/.test(tok.type);
}

function isClose(tok) {
  return /_close$/.test(tok.type);
}

function toType(tok) {
  return tok.type.replace(/_open$/, '');
}

console.log(ast('# Foo\nbar\n> > foo'));

Not sure if this works, but it's probably close.
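
For anyone who wants to try the idea without installing anything, here is a minimal, self-contained sketch of the same stack-based folding. It is not part of Remarkable: `fold` and `sampleTokens` are hypothetical names, and the token objects are hand-written approximations of what the parser emits for '# Foo\n> bar', trimmed to the fields used here. It also folds the `children` of `inline` tokens, which the snippet above leaves flat.

```javascript
// Fold a flat list of *_open / *_close tokens into a tree.
// The stack holds the chain of currently open nodes; the root is never popped.
function fold(tokens) {
  var root = { type: 'root', nodes: [] };
  var stack = [root];

  tokens.forEach(function (tok) {
    var parent = stack[stack.length - 1];

    if (/_open$/.test(tok.type)) {
      // New node named after the token type, keeping the open token as first child.
      var node = { type: tok.type.replace(/_open$/, ''), nodes: [tok] };
      parent.nodes.push(node);
      stack.push(node);
    } else if (/_close$/.test(tok.type)) {
      if (stack.length === 1) {
        throw new Error('Unbalanced close token: ' + tok.type);
      }
      stack.pop().nodes.push(tok);
    } else if (tok.type === 'inline' && Array.isArray(tok.children)) {
      // Inline tokens carry their own flat child list; fold it the same way.
      parent.nodes.push({
        type: 'inline',
        content: tok.content,
        nodes: fold(tok.children).nodes
      });
    } else {
      parent.nodes.push(tok);
    }
  });

  if (stack.length !== 1) {
    throw new Error('Unclosed node: ' + stack[stack.length - 1].type);
  }
  return root;
}

// Assumed token shapes, written by hand for illustration only.
var sampleTokens = [
  { type: 'heading_open', hLevel: 1 },
  { type: 'inline', content: 'Foo', children: [{ type: 'text', content: 'Foo' }] },
  { type: 'heading_close', hLevel: 1 },
  { type: 'blockquote_open' },
  { type: 'paragraph_open' },
  { type: 'inline', content: 'bar', children: [{ type: 'text', content: 'bar' }] },
  { type: 'paragraph_close' },
  { type: 'blockquote_close' }
];

var tree = fold(sampleTokens);
console.log(JSON.stringify(tree, null, 2));
```

With real parser output, the same function should work on the array returned by `md.parse(str, {})`, assuming the open and close tokens are balanced.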

tomByrer commented 7 years ago

If you/someone does create an AST, would it be beneficial to be compatible with MDast? Or are you guys too different?

jonschlinkert commented 7 years ago

The mdast AST is more like a cheerio or HTML AST, and I'm not a big fan of that. I'd like to keep ours simpler and more elegant by following snapdragon's conventions. For example, with snapdragon the AST itself is "just another node". All nodes are the same, and all nodes follow the same conventions.

edit: you could easily create a mdast renderer or parser with snapdragon too...

edit2: to be clear, I love cheerio and use it a lot. I just meant the HTML-ish AST, which is beyond cheerio's control (cheerio makes it easy to use). But with markdown or anything else, why make the AST more verbose than it needs to be?
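
To illustrate the "just another node" convention mentioned above, here is a rough sketch using plain objects. It does not use snapdragon's actual API: the `type` / `val` / `nodes` field names and the `visit` helper are assumptions, loosely modeled on that style. The point is that the root has the same shape as every other node, so one function can walk the whole document or any subtree.

```javascript
// Every node, including the root, has the same shape:
//   { type: string, val?: string, nodes?: Array<node> }
function visit(node, fn, depth) {
  depth = depth || 0;
  fn(node, depth);
  (node.nodes || []).forEach(function (child) {
    visit(child, fn, depth + 1);
  });
}

// Hypothetical tree for '# Foo\n> bar', written by hand for illustration.
var doc = {
  type: 'root',
  nodes: [
    { type: 'heading', nodes: [{ type: 'text', val: 'Foo' }] },
    {
      type: 'blockquote',
      nodes: [
        { type: 'paragraph', nodes: [{ type: 'text', val: 'bar' }] }
      ]
    }
  ]
};

// Build an indented outline, two spaces per nesting level.
var outline = [];
visit(doc, function (node, depth) {
  var indent = new Array(depth + 1).join('  ');
  outline.push(indent + node.type + (node.val ? ': ' + node.val : ''));
});

console.log(outline.join('\n'));
```

Because the root is not special, calling `visit(doc.nodes[1], fn)` walks just the blockquote with no extra code.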