Closed CupOfTea696 closed 3 years ago
I'm not sure this makes sense in mdast specifically.
This document defines a format for representing Markdown as an abstract syntax tree
https://github.com/syntax-tree/mdast#introduction
Abstract syntax trees, by design, encode structure, not syntax (which is what a concrete syntax tree would do).
Micromark and CommonMark State Machine (CSM) could enable constructing a concrete syntax tree, and this is noted in the CSM readme:
complete, as it defines different types of tokens and how they are grouped, which allows the format to be represented as a concrete syntax tree
This would likely need a new standard (mdcst?) to capture concrete syntax needs, since transforms interested in structure (AST) and formatters interested in specific syntax (CST) will have different wants and needs.
Also see previous discussion at https://github.com/syntax-tree/mdast-util-to-markdown/issues/3
@CupOfTea696 If this is needed, you can also use the positional info to access that info by looking characters up in the corresponding vfile!
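The positional lookup suggested above can be sketched as follows. This is a minimal illustration, not part of mdast or vfile: the `PositionedNode` type and the `firstCharOf` helper are hypothetical names, the nodes are written by hand rather than produced by a parser, and a plain string stands in for the contents of the vfile. It assumes nodes carry `position.start.offset`, which is optional in unist.

```typescript
// Hand-written stand-ins for unist positional data (assumption: offsets are present).
interface SourcePoint { line: number; column: number; offset?: number }
interface SourcePosition { start: SourcePoint; end: SourcePoint }
interface PositionedNode { type: string; position?: SourcePosition }

// Hypothetical helper: return the first source character of a node,
// which for an emphasis node is the marker that opened it.
function firstCharOf(node: PositionedNode, source: string): string | undefined {
  const offset = node.position?.start.offset
  if (offset === undefined) return undefined
  return source.charAt(offset) || undefined
}

// The string stands in for the file contents (for example `String(file)`).
const source = '*alpha* _bravo_'

const alpha: PositionedNode = {
  type: 'emphasis',
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 8, offset: 7 }
  }
}

const bravo: PositionedNode = {
  type: 'emphasis',
  position: {
    start: { line: 1, column: 9, offset: 8 },
    end: { line: 1, column: 16, offset: 15 }
  }
}

const alphaMarker = firstCharOf(alpha, source)
const bravoMarker = firstCharOf(bravo, source)
```

Nodes without positional info (for example, ones generated by a plugin) yield `undefined`, so a caller would need a fallback in that case.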
Some of this was also discussed in https://github.com/remarkjs/remark/issues/32 and then in https://github.com/remarkjs/remark/issues/132#issuecomment-229507086, when remark had just been made (and was still called mdast).
Honestly, I feel that PostCSS and ESTree, which do patch this stuff on nodes, made a mistake: it makes the syntax tree hard to handle
Some more past issues on all this: https://github.com/search?o=desc&q=CST+user%3Amicromark+user%3Aremarkjs+user%3Aunifiedjs+user%3Asyntax-tree&s=created&type=Issues
I’m closing this because I don’t think such fields should be added to mdast nodes (by default: of course, it’s just JSON, so you can do that yourself if you want). If/when there is a CST version of mdast, it will be a different project, and I’ll make sure to note it here!
Add concrete syntax details
It would be nice to have concrete syntax information on certain nodes, for example, which bullet type was used for a list item.
Problem
When using a markdown parser to modify markdown and write it back to a file, it would be nice to re-use the same style as the original markdown content. Currently, there is no way to get this information to either use inside a compiler or to set the compiler options.
Expected behaviour
Syntax details included in tree nodes. Below is an example for emphasis.
Interface
Markdown:
Yields:
When recompiling the above tree back to Markdown, it would render back to *alpha* _bravo_ rather than *alpha* *bravo*, unless the compiler is explicitly set to use a certain character for emphasis.
Alternatives
This could be implemented without any compiler modifications by having a utility that detects the used syntax and sets the compiler's options accordingly.
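A rough sketch of such a utility, under stated assumptions: `detectEmphasisMarker` and the `SketchNode` type are hypothetical names, the tree is written by hand instead of coming from a parser, and the source is passed as a plain string. It counts which character opens each emphasis node and picks the most common one. The resulting `emphasis` key is meant to match the option of the same name accepted by remark-stringify and mdast-util-to-markdown; passing it to a compiler is left out here.

```typescript
// Hand-written stand-in for an mdast-like tree (assumption: start offsets are present).
interface SketchNode {
  type: string
  position?: { start: { offset?: number } }
  children?: SketchNode[]
}

type EmphasisMarker = '*' | '_'

// Hypothetical utility: find the emphasis marker used most often in the source.
function detectEmphasisMarker(tree: SketchNode, source: string): EmphasisMarker | undefined {
  const counts: Record<EmphasisMarker, number> = { '*': 0, '_': 0 }

  const visit = (node: SketchNode): void => {
    const offset = node.position?.start.offset
    if (node.type === 'emphasis' && offset !== undefined) {
      const char = source.charAt(offset)
      if (char === '*' || char === '_') counts[char] += 1
    }
    for (const child of node.children ?? []) visit(child)
  }

  visit(tree)

  // No emphasis found: leave the choice to the compiler's default.
  if (counts['*'] === 0 && counts['_'] === 0) return undefined
  // Ties fall back to `*`.
  return counts['_'] > counts['*'] ? '_' : '*'
}

const input = '_alpha_ _bravo_ *charlie*'

const tree: SketchNode = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      position: { start: { offset: 0 } },
      children: [
        { type: 'emphasis', position: { start: { offset: 0 } } },
        { type: 'emphasis', position: { start: { offset: 8 } } },
        { type: 'emphasis', position: { start: { offset: 16 } } }
      ]
    }
  ]
}

const marker = detectEmphasisMarker(tree, input)
// Options object to hand to the compiler; empty when nothing was detected.
const stringifyOptions = marker ? { emphasis: marker } : {}
```

This picks one style per document, so a file that mixes markers would still be normalized to the majority choice; preserving each node's marker individually is the part that would need the CST discussed in the thread.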