Definition nodes - Githubissues

puzrin commented 7 years ago

Problem:

Definitions can be located in ANY part of document.
Those are global.
Quick access would be useful in AST -> HTML writer.

How to store?

TBD.
Definitions need to stay in place so markdown can be written back quickly, but some additional global reference is needed for fast lookup.

Checklist

- Duplicate definitions should be stored correctly
- HTML writer should avoid an AST scan
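
One possible shape for this, as a minimal sketch rather than a settled design: keep every definition node in place in `children`, and add a `definitions` map on the document node as the global reference. All names here (`createDocument`, `appendNode`, `resolveDefinition`, the node fields) are hypothetical, and the label normalization is a simplified stand-in for CommonMark's matching rules. Duplicates are all kept in document order, and lookup returns the first one, which is the precedence CommonMark gives to link reference definitions.

```javascript
// Hypothetical node shapes; the AST spec under discussion has not fixed them.
function normalizeLabel(label) {
  // Simplified label matching: trim, collapse whitespace, lowercase
  // (lowercasing only approximates Unicode case folding).
  return label.trim().replace(/\s+/g, ' ').toLowerCase();
}

function createDocument() {
  return { type: 'document', children: [], definitions: new Map() };
}

function appendNode(doc, node) {
  doc.children.push(node); // the node stays in place for the markdown writer
  if (node.type === 'definition') {
    const key = normalizeLabel(node.label);
    const list = doc.definitions.get(key) || [];
    list.push(node); // duplicates are kept, in document order
    doc.definitions.set(key, list);
  }
  return node;
}

function resolveDefinition(doc, label) {
  const list = doc.definitions.get(normalizeLabel(label));
  return list ? list[0] : undefined; // the first definition wins
}

const doc = createDocument();
appendNode(doc, { type: 'paragraph', text: 'See [foo].' });
appendNode(doc, { type: 'definition', label: 'Foo', url: '/first' });
appendNode(doc, { type: 'definition', label: 'foo', url: '/second' });
```

With this layout the markdown writer walks `children` as usual, while the HTML writer calls `resolveDefinition` and never scans the AST.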

geyang commented 7 years ago

Do definition nodes include things like front-matter or footnotes?

Right now my usage for front-matter is: run markdown-it parser once, get options, pass options back (like some math config), run again.

What would be nice to have is instead:

1. do a quick parse at level 0 only (a front-matter-only parse),
2. take the front-matter info and parse again at full depth.

If the partial AST from the first pass is returned, quick node lookup is not a feature I must have. It is something you can always add later.

I think you have two choices:
- either always require a second pass, or
- store definitions in a global during the first pass (equivalent to an index).

Looping through things is not that bad if the logic is just a single line.

Differential parsing is not a necessary requirement for a markdown parser. A global parse with a depth limiter is good enough for a fast, partial, or incremental parse. Trying to do things incrementally inside markdown-it might add unnecessary complexity, and there are better libraries that are really good at doing that for various types.

A quick HTML writer is also not a huge requirement. The reality in the front-end is that repainting on a DOM update takes so much time that it makes sense to
1. first do a virtual DOM / HTML fragment DOM diff,
2. then apply the diffed DOM,

just because it saves repaint time. There are well-established DOM and JSON diff algorithms. We don't have to do it here.

markdown-it is fast, and fast enough. To do an incremental parse you can always add a depth limiter, I think.

Another interesting thing about LaTeX is that it is by definition a macro language, which means that definitions do not have to be unique (@dpvc might have more insight; from MathJax's equation references it looks like some of the global references do need to be unique). The way MathJax handles a duplicate definition is to return an error. Having a good error format might be just as important as having a good global storage format.

Again, I would support a pure AST tree with no fast lookup, plus an optional global env object that you return and that the user can use to access globals.
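
To make the second-pass option concrete, here is a rough sketch under assumptions: a pure AST with no index, and a separate pass that returns an env object holding the definitions it found plus a list of structured errors for duplicates. The function name `collectDefinitions`, the `env` layout, and the `duplicate-definition` error code are all made up for illustration, not part of markdown-it or any spec.

```javascript
// Hypothetical second pass over a plain AST (nodes: { type, children?, ... }).
function collectDefinitions(root) {
  const env = { definitions: new Map(), errors: [] };
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.type === 'definition') {
      // Simplified label normalization, assumed for this sketch.
      const key = node.label.trim().replace(/\s+/g, ' ').toLowerCase();
      if (env.definitions.has(key)) {
        // Keep the first definition and report the later one.
        env.errors.push({ code: 'duplicate-definition', label: key, node });
      } else {
        env.definitions.set(key, node);
      }
    }
    const children = node.children || [];
    // Push in reverse so nodes are visited in document order.
    for (let i = children.length - 1; i >= 0; i--) stack.push(children[i]);
  }
  return env;
}

const ast = {
  type: 'document',
  children: [
    { type: 'blockquote', children: [{ type: 'definition', label: 'a', url: '/x' }] },
    { type: 'definition', label: 'A', url: '/y' },
  ],
};
const env = collectDefinitions(ast);
```

The AST itself stays untouched; callers who do not need globals simply never run the pass.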

puzrin commented 7 years ago

I mean http://spec.commonmark.org/0.27/#link-reference-definitions

Frontmatter-like things are just headers. Those can be processed via regexp without running markdown at all.

geyang commented 7 years ago

> Frontmatter-like things are just headers. Those can be processed via regexp without running markdown at all.

Right, but from the user's perspective (my humble perspective really XD) those are things I see through the markdown parser. Compared with parsing them with a regex, it is a lot better for me to use markdown-it plus a front-matter plugin and get it done more professionally. This is also why it is nice to have a depth limiter to make this initial parse fast, because I can assume that markdown-it does it correctly. As the saying goes, you shouldn't parse HTML using regex; the same applies here.

puzrin commented 7 years ago

> Right, but from the user's perspective (my humble perspective really XD) Those are things I see through markdown parser.

In my eyes this looks like an attempt to use a low-level instrument for a high-level task. If users need to extract metadata, they should write a wrapper which:

- extracts and parses the metadata (no markdown parse needed here)
- cuts off the input head and feeds the rest to markdown-it
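
Such a wrapper could look roughly like the sketch below. It assumes a leading block delimited by `---` lines and containing flat `key: value` pairs only (not full YAML); the function name `splitFrontMatter` and the sample input are illustrative.

```javascript
// Hypothetical wrapper: splits a leading `---` block from the markdown body.
function splitFrontMatter(input) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)/.exec(input);
  if (!match) return { meta: {}, body: input };
  const meta = {};
  for (const line of match[1].split(/\r?\n/)) {
    const pair = /^([\w-]+):\s*(.*)$/.exec(line);
    if (pair) meta[pair[1]] = pair[2]; // flat `key: value` lines only
  }
  return { meta, body: input.slice(match[0].length) };
}

const source = '---\ntitle: Demo\nmath: katex\n---\n# Heading\n';
const { meta, body } = splitFrontMatter(source);
// `meta` can now configure the parser; `body` is what gets fed to markdown-it.
```

Input without a leading `---` block is returned unchanged as `body`, so the wrapper is safe to apply to every document.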

I don't see any problems with a depth limiter here. Also, depth limits depend on the algorithms. If you inspect the reference implementations, you will see they have no problems with depth at all (but they are not pluggable).

puzrin commented 7 years ago

Anyway, let's keep this issue for the main topic: how to store link definitions in the document.

markdown-it / markdown-ast-spec

Definition nodes #2