Closed atakiel closed 4 years ago
Personally I would prefer the system that has been used for quite a while now in pandoc, which doesn't use XML elements. This would use something like
::: sidenote
Here's the side note.
:::
More fully:
[Inline text with arbitrary attributes]{#identifier .class key=value}
::: {#identifier .class key=value}
Block-level text with arbitrary attributes.
1. one
2. two
:::
or if you just want a class:
::: sidenote
Hi there
:::
A filter can then be used to define a special meaning for a native Span or Div with certain attributes. (For custom elements containing verbatim text, code spans and code blocks with attributes can be used, with a similar syntax.)
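As a rough sketch of what such a filter might do: the node shape below (a dict with `classes` and `text`) and the function name `render_div` are assumptions made for illustration, not pandoc's actual AST or filter API.

```python
from html import escape


def render_div(node):
    """Render a simplified, hypothetical Div node to HTML.

    `node` is a dict with "classes" (list of str) and "text" (str).
    """
    classes = node.get("classes", [])
    body = escape(node["text"])
    if "sidenote" in classes:
        # Special meaning: a Div with class "sidenote" becomes an <aside>.
        return '<aside class="sidenote">' + body + "</aside>"
    # Any other Div is passed through as a plain <div>.
    class_attr = ' class="' + escape(" ".join(classes)) + '"' if classes else ""
    return "<div" + class_attr + ">" + body + "</div>"


print(render_div({"classes": ["sidenote"], "text": "Hi there"}))
```

The document itself only carries the class name; what "sidenote" means is decided entirely by the filter.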
This is more "markdownish," in that it looks less marked-up.
My commonmark-hs project also includes an extension to attach attributes to arbitrary block-level content, so you can just do this:
{.sidenote}
Here is my side
note.
I totally forgot the main reasons why I personally would prefer mdx or other xml (like) syntax over the competition.
Core markdown is a relatively simple syntax to learn and use, even for non-technical people.
I've always considered this to be one of the major reasons markdown has become as popular as it has, even though the competition would have provided more firepower for more experienced users.
I think it's a strength of markdown, one that should also remain a priority in future markdown development: the core should stay as simple and easy to use for as wide an audience as possible.
HTML in markdown has always felt like an advanced use case. Back when I was learning to code and to use markdown, when I saw html in a markdown document not authored by me, it felt like something it would be wise not to touch.
Similarly, when I now see html inside a markdown document, it's immediately clear that the part containing html is doing something tricky, and that I should be wary that something unexpected might happen with that part.
I think this distinction works in favor of using mdx-like syntax for extensions, since those are also advanced use cases.
With mdx style syntax for extensions, the core markdown syntax would stay simple.
Xml style syntax would also provide the benefit that anyone familiar with xml would already be familiar with the syntax used for custom extensions. If they have learned to read xml/jsx/html, they can probably read it easily inside markdown as well.
The number of new things to learn would be smaller when an already familiar syntax is reused.
The boundaries are very clear in xml syntax. Also, a nesting hierarchy for extensions is built into xml syntax.
You can easily embed more xml elements inside other xml elements and see that they are inside each other, and not, say, sibling elements:
<parent>
<child></child>
</parent>
vs
<parent></parent>
<child></child>
Another thing I forgot:
Fallback action for an extension element that cannot be resolved to a component implementation should be to not render the extension element or its contents.
Let's say an extension component is used in a news industry tool, e.g. in a use case similar to comments or deletions in CriticMarkup.
Then the newspaper's chief editor, using the tool, writes a comment stating that part of the surrounding text should be redacted. This could happen e.g. for privacy reasons. Maybe the extended comment element contains a person's name.
It would be super important that the information inside that extended comment element never reaches the public.
e.g.
President Trump did X while firing his Y assistant.
<EditorialComment>
We can't say the previous phrase like this,
because it would imply the name of John Smith.
</EditorialComment>
Previously Trump has ...
Then again, maybe the fallback for a missing extension could be to use a provided missing extension component.
This could be an implementation detail.
Note that
<EditorialComment>
We can't say the previous phrase like this,
because it would imply the name of John Smith.
</EditorialComment>
currently gets parsed as a raw HTML block by conforming commonmark parsers. If you use a parser (like commonmark.js or cmark or pandoc) that creates an AST, then you can walk the AST and transform the block as you see fit. So this kind of customization is already possible. Note, however, that the interior of the block will not be parsed as commonmark, but as literal text. If you want it parsed as commonmark, then put blank lines between the opening tag and the content, and between the content and the closing tag, and have your filter intercept the opening and closing tags.
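A rough sketch of such an AST walk, combined with the "do not render unresolved extensions" fallback discussed above. The node shape (dicts with `type` and `literal`) and the function name `drop_unresolved` are assumptions for illustration, not the API of commonmark.js, cmark, or pandoc.

```python
import re

# A capitalized tag name at the very start of a raw HTML block,
# e.g. "<EditorialComment>".
COMPONENT_OPEN = re.compile(r"<([A-Z][A-Za-z0-9]*)\b")


def drop_unresolved(nodes, known_components):
    """Return the nodes with unresolved component blocks removed.

    `nodes` is a hypothetical flat list of block nodes; `known_components`
    is the set of component names that have an implementation.
    """
    kept = []
    for node in nodes:
        if node["type"] == "html_block":
            match = COMPONENT_OPEN.match(node["literal"])
            if match and match.group(1) not in known_components:
                continue  # fail closed: drop the element and its contents
        kept.append(node)
    return kept


doc = [
    {"type": "paragraph", "literal": "President Trump did X ..."},
    {"type": "html_block",
     "literal": "<EditorialComment>\nWe can't say the previous phrase like this,\n"
                "because it would imply the name of John Smith.\n</EditorialComment>"},
    {"type": "paragraph", "literal": "Previously Trump has ..."},
]
public = drop_unresolved(doc, known_components=set())
```

This only covers the case where the whole element is a single raw HTML block. If blank lines separate the tags from the content, the content becomes ordinary paragraph nodes, and the filter would have to track the opening and closing tags instead.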
Hey folks, lovely discussion!
I’ll reply to your MDX issue as well, but in short, a couple clarifications. Disclaimer: I help maintain (but don’t develop) MDX (it’s one of the ways I get funded, more here):
@atakiel:
I think there would be great synergy found merging some of the work done in these two projects
I’m up for working together. I believe John Otander is as well. However, similar to how GFM sits on CM, MDX also does that. I’m 👍 on links but not sure about merging
E.g. in a wysiwyg editor
— You should check out the work John Otander is doing with Blocks and also check out Chris Biscardi’s Sector Tools
File extension would still be .md.
I personally don’t like that idea
@jgm:
The generic blocks can represent everything that can be represented in the XML MDX syntax. Nothing prevents you from creating a generic block that does something "complex and interactive." The complex and interactive thing is going to be inserted by a tool that consumes the parsed md document and replaces the generic block with something else. (Just as with MDX.)
If you’re going to do AST transforms / injecting complex interactive things, then you can do that with HTML (<iframe ...>), generic directives (:::youtube ...), and MDX (<Youtube ...>), all the same. Whichever you prefer. But I don’t think generic extensions replace all of HTML, right? They’d live together.
MDX replaces both what HTML and generic directives do and has some JS programming.
MD is relatively easy*, HTML/generic directives/MDX are hard, but sometimes needed. MD is nice because “make the easy things easy, and the hard things possible”. JSX is powerful & good at hard things. HTML/XML are clearly non-markdown, it looks hard and is hard. Directives look a bit like Markdown, but have a complex syntax, they look simple but aren’t.
But, again: I see both existing, they’re tools: they all have up- and downsides.
* easy/hard are subjective of course, but here I’m trying to express what someone experiences who may get link brackets and braces confused ((asd)[url])
No matter how you notate generic directives, you need to specify somewhere what to do with them, and that's going to involve programming. MDX is just a particular syntax for generic directives, processed by JavaScript. Certainly nothing JavaScript-specific should be part of the commonmark spec. One can argue about syntax for generic directives, but I think that goes elsewhere, so I'd recommend closing this issue.
Sorry it took me a while to reply. I got an acute case of shyness.
Some of my thoughts on points made above:
This is more "markdownish," in that it looks less marked-up.
One point that I'm trying to push with this issue (although only implicitly in the original issue text) is that markdown is used by a lot of non-programmers. My own empirical take on this is that the popularity is mainly because the core is very simple, yet very powerful. You can do most of the things you need to do in a typical post, comment, or other simple everyday text use case with markdown. Adding a syntax for extensions that looks "markdownish", but is actually highly complicated, is in stark contrast to this feature of markdown. IMHO this is the most important feature of markdown / commonmark.
Instead, a non-"markdownish" syntax that extends markdown gives the lay user the idea that they are looking at a far more complex use case, and that it is totally okay for them not to understand it or use it. But it also says that they can keep using the core markdown syntax, even if they don't know the advanced syntax.
If the advanced syntax looked like the basic syntax, the lay user would find it hard to tell which parts of the language are safe.
So there is essentially a UX question at its core.
I'm not saying there should be only one syntax for extending commonmark, but if there were, it shouldn't damage markdown's ease of use. Because if things can be done in a hard way, they will be done in a hard way. And when you have more options than you need, and the simple thing becomes complicated, it can lead to burnout, and that in turn to not doing or using something.
I’m up for working together. I believe John Otander is as well. However, similar to how GFM sits on CM, MDX also does that, I’m 👍 on links but not sure about merging
Merging is probably the totally wrong word in this context. What I was trying to imply was that commonmark's general directives could take inspiration from the syntax of mdx (sans js), and mdx could work its way away from js into being language agnostic. There could be a kind of shared target, towards which both projects could move at their own pace.
MDX in Markdown replaces HTML in Markdown: You can’t have both. Adding this would break all current Markdown that has HTML.
This is a good point. I take it you mean that what is valid html might not always be valid jsx, or vice versa?
The generic blocks can represent everything that can be represented in the XML MDX syntax. Nothing prevents you from creating a generic block that does something "complex and interactive." The complex and interactive thing is going to be inserted by a tool that consumes the parsed md document and replaces the generic block with something else. (Just as with MDX.)
This is also a good point.
MDX is just a particular syntax for generic directives, processed by JavaScript.
One point I'm trying to push here is that, even though mdx is currently tied to javascript, that wouldn't have to be the case in the future.
The power of jsx comes from the general idea that an element declaration in jsx is equivalent to a function call with named parameters.
The language in which the function is declared doesn't have to be javascript, even though current implementations are in javascript. It should be enough that the language allows use of dictionaries or supports key value pairs in some other way as arguments.
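To illustrate the idea outside javascript, here is a rough Python sketch in which an element such as `<Alert level="warning">Disk almost full</Alert>` is treated as a function call with keyword arguments. The `Alert` element, the `alert` function, the `COMPONENTS` table and `render_element` are all hypothetical names, not part of MDX.

```python
def alert(children, level="info"):
    """Hypothetical component: render an alert as plain text."""
    return "[" + level.upper() + "] " + children


# Maps element names to the functions implementing them.
COMPONENTS = {"Alert": alert}


def render_element(name, attrs, children):
    """Treat an element as a call: the name selects the function,
    the attributes become keyword arguments."""
    component = COMPONENTS[name]
    return component(children=children, **attrs)


# Equivalent of <Alert level="warning">Disk almost full</Alert>
print(render_element("Alert", {"level": "warning"}, "Disk almost full"))
```

All that is needed from the host language here is a dictionary of attributes and some way to pass it as named arguments.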
Certainly nothing JavaScript-specific should be part of the commonmark spec.
True. Commonmark should be language agnostic. Didn't try to imply that any javascript should be used inside commonmark.
One can argue about syntax for generic directives, but I think that goes elsewhere, so I'd recommend closing this issue.
If this is the wrong place, then it's fine by me to close the issue.
I think you make an interesting point in saying that there is an advantage if the extensions are marked up in a way that is obviously NOT markdownish. Haven't heard that argument before.
an advantage if the extensions are marked up in a way that is obviously NOT markdownish
I'm not sure I buy that there is an advantage. @atakiel, can you provide a real world example of some extension that
gives the lay user the idea that they are looking at a far more complex use case, and that it is totally okay for them not to understand it or use it. But it also says that they can keep using the core markdown syntax, even if they don't know the advanced syntax.
Is it necessary or desirable for Markdown to be "Turing complete", so-to-speak? Is the point of Markdown to be a higher level "language" that compiles down to HTML so that you can use it instead of HTML, necessitating it to be "HTML complete", i.e. able to express everything HTML can express?
I very much don't think so. I think Markdown is successful precisely because it is a great application of the Pareto Principle. It covers at least 80% of use cases (by frequency, not problem space) with 20% of the complexity. Yes, we can extend it to cover 100%, but then it turns into HTML, which we already have. The only reason to pursue that, in my opinion, is if people believe HTML is a bad solution begging for replacement.
I do believe it is worth adding new syntax to Markdown, but it should remain Markdown-like, in the ways that @jgm describes. I think that can take Markdown from 80% to 90 or 95%. But the goal should not be "% HTML Complete", but how much expressivity can we add without undermining what Markdown is and what made it so successful.
The proposal to extend Markdown with MDX syntax boils down to embedding MDX or MDX-like complex content in a Markdown document. But by embedding the complex into the simple, the simple becomes complex.
Wouldn't it make more sense to invert the relationship? Shouldn't MDX or HTML support embedding Markdown?(1) The simple gets to remain simple, and the complex gets a little simpler, at least in the regions of Markdown within it.
In fact, MDX embedding Markdown will look almost exactly like Markdown embedding MDX. Except in the former case Markdown gets to stay simple.
Isn't this the best way not to confuse "the lay user"?
(1) That's a bit of a rhetorical proposal, as HTML, via Javascript, already supports embedding Markdown. Though conceivably we could standardize on a <Markdown> HTML tag that every browser understands without needing any Javascript.
Anyway, this kind of discussion belongs on the forum (talk.commonmark.org) rather than here. So, closing.
Markdown competitors provide a system for extending markdown documents with user provided custom extensions.
Markdown could have a syntax for custom user provided extensions as well.
Proposal
Add a syntax for custom user provided extensions in CommonMark. A good candidate for such syntax would be MDX.
Currently, in the js world, MDX-JS is doing work that could show a path for user provided extensions usable in the greater context of markdown, in a future-safe manner (that could not result in conflicts between extensions and future specification based markdown features).
MDX uses JSX elements to mark custom components, to be rendered with a jsx compatible framework (using e.g. react or vue) alongside the rest of the markdown document.
As they currently work, MDX elements act as user made extensions in a markdown document. The mdx elements are rendered using a specific component implementation matching those MDX elements. The problem is, MDX-JS only works in the js environment.
Because mdx-js is currently very much tied to the js world, there would be a need to split mdx-js into two parts: a universal, language-agnostic specification for markdown extensions (MDX), and a js-specific specification for augmenting mdx documents with js (MDX-JS).
I've created an issue describing this proposal from MDX-JS's point of view in MDX-JS's specification repository (https://github.com/mdx-js/specification/issues/25).
I think there would be great synergy to be found in merging some of the work done in these two projects (Commonmark and MDX) together.
How it would work
In a markdown document you would add a custom extension element just like you would use html elements:
Perhaps in a simple js implementation targeting html, it would result in the following html:
In another render target, say pdf, the same could be rendered into a red block with a fitting icon and the contents of the mdx element inside it.
Perhaps yet another implementation for the mdx element could be used, e.g. in a text processing subsystem, to do some processing to its contents, before the final rendering.
Declaration of used MDX element implementation
There would need to be some way of declaring the components implementing the MDX elements, but this could probably be left for implementations to decide.
As it is, MDX-JS currently uses inlined js to define the implementations inside the markdown document, but this becomes problematic even in the js world, if there are multiple different render targets used for the same document. E.g. in a wysiwyg editor, the editor view and the preview view could use different implementations for the same MDX element, so the implementation should be defined outside the markdown document.
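A rough sketch of keeping the implementations outside the document, with one table per render target. The `Note` element, the target names and the functions below are all hypothetical.

```python
def note_editor(children):
    """Hypothetical editor-view implementation: a compact inline marker."""
    return "[note: " + children + "]"


def note_preview(children):
    """Hypothetical preview implementation: full HTML output."""
    return '<aside class="note">' + children + "</aside>"


# One implementation table per render target, defined outside the document.
IMPLEMENTATIONS = {
    "editor": {"Note": note_editor},
    "preview": {"Note": note_preview},
}


def render(target, name, children):
    """Render the same element with the implementation chosen by the target."""
    return IMPLEMENTATIONS[target][name](children)


print(render("editor", "Note", "check this"))
print(render("preview", "Note", "check this"))
```

The markdown source stays identical for both views; only the table selected by the tool differs.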
Use of an inlined target language is not bad in MDX-JS's current use case, where it is very much beneficial in many cases (e.g. in gatsby or mdx-deck), but the commonmark spec probably shouldn't explicitly allow it. It should probably be the implementation's decision, or there could be a distinct syntax for adding inlined evaluated code in a specific language.
File extension and fallback default processing
File extension would still be .md.
Default fallback for finding an MDX element without a declared matching implementation would be to either disregard it or act like it would be a regular html element. Wouldn't this be what would happen in most of the markdown implementations as it is?
Benefits of MDX style markdown extensions
Future-safe syntax for extensions
Using JSX syntax for custom extensions ensures that there will be no future syntax clashes between custom extensions and future commonmark feature syntax.
Syntax similar to the already allowed HTML
JSX-like syntax also fits nicely with the existing use of HTML syntax inside markdown documents.
In JSX syntax, HTML elements start with a lower case letter, while custom elements start with a capital letter. This would also allow HTML custom elements to co-exist alongside MDX elements.
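A rough sketch of how a processor might tell the three kinds of element apart by name alone. The function name `classify_tag` and the returned labels are hypothetical. The hyphen check relies on the rule that HTML custom element names must contain a hyphen.

```python
def classify_tag(name):
    """Classify a tag name as an MDX element, an HTML custom element,
    or a regular HTML element, based only on its spelling."""
    if name[:1].isupper():
        return "mdx"          # capitalized: custom extension element
    if "-" in name:
        return "html-custom"  # lower case with a hyphen: HTML custom element
    return "html"             # plain lower case: regular HTML element


for tag in ["Youtube", "my-widget", "iframe"]:
    print(tag, classify_tag(tag))
```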
Name - markdown extensions (MDX)
As an added bonus, even the name would be great, as it could be read as "markdown extensions".