ocaml / omd

extensible Markdown library and tool in "pure OCaml"
ISC License
156 stars 45 forks source link

Get rid of Link_def node #217

Closed nojb closed 4 years ago

nojb commented 4 years ago

As observed by @shonfeder, the Link_def node is no longer useful in the AST. Let's remove it.

jfrolich commented 4 years ago

Why is it no longer useful? Isn't it good to have the link definitions in the AST to disambiguate between the two uses of links? Especially if we have an AST -> markdown printer (you probably want to preserve the defs in that case).

nojb commented 4 years ago

Why is it no longer useful? Isn't it good to have the link definitions in the AST to disambiguate between the two uses of links?

It is no longer useful because reference links are resolved at parsing time since #209. Since we do not keep a reference to the link definition in the AST, having the link definition node does not serve any purpose.

There is the larger question of how lossy/lossless do we want the AST to be. Lately I have been leaning towards making the AST easier to use (and so more lossy). Preserving link reference and definition nodes allows to reconstruct the original markdown, but makes the AST harder to use, as the user needs to resolve link references.

Especially if we have an AST -> markdown printer (you probably want to preserve the defs in that case).

Indeed, if we add such a printer, it won't be able to recreate link references, and it will just inline the link. I am not too worried about this (but opinions welcome). The only use I can think of for the AST->markdown printer (apart from pretty printing) is for generating markdown programmatically; the use of link references/definitions does not seem to be crucial for this.

nojb commented 4 years ago

As you observed, another case of lossy-ness in the AST is that we do not record which delimiter was used for emphasis/strong emphasis (_ or *) or the number of backquotes used to delimit code blocks (this used to be the case, but added noise to the AST).

nojb commented 4 years ago

223

jfrolich commented 4 years ago

I understand you want to reduce complexity, but the case can be made that link definitions are semantically different (so can be treated differently in a particular printer). About the star or underscore emphasis a case can be made that there is no difference, but it can be pretty handy for a printer to allow supporting a richer feature set while not extending the syntax. These add a bit of complexity to the AST, but it can also easily be ignored by printers?

An addition that I don't really understand is the attributes that are added to the AST.

nojb commented 4 years ago

An addition that I don't really understand is the attributes that are added to the AST.

This is an extension of the standard used by Pandoc which is quite handy to enrich the AST with custom metadata. In the HTML printer, they correspond to usual attributes. See https://github.com/jgm/commonmark-hs/blob/master/commonmark-extensions/test/attributes.md for the spec.

jfrolich commented 4 years ago

Ah ok thanks for referencing, I was wondering where that was coming from. That can definitely be handy indeed.

nojb commented 4 years ago

Thanks for the feedback re: keeping link references in the AST. I want to mull it over, but I see that it could be useful in some cases.