Footnotes not sequenced / grouped correctly within a plugin

coryschires commented 3 years ago

I'll start by saying this problem is sorta complicated, and I don't expect you to solve it for me. This is really more of a "what do you think is the best approach" type question.

Background

I have a private markdown-it plugin which renders figures (e.g. tables, images, etc) based on JSON data which I pass into the plugin as an argument. Here's a simplified example of what my JSON looks like for a given figure:

```
let attachments = [
  {
    id: 123,
    kind: 'image',
    url: "https://example.com/image.png",
    title: "Figure 1: Example Title",
  }
]
```

Separately, I am also using the markdown-it-footnote plugin.

Problem

I encounter problems when trying to add a footnote to the figure title metadata. For example:

```
let attachments = [
  {
    id: 123,
    kind: 'image',
    url: "https://example.com/image.png",
    title: "Figure 1: Example Title^[Short example footnote]",
  }
]
```

I would like this footnote to be included alongside the other footnotes in the document. For example, if 3 footnotes appear before this figure, then I would like this footnote to render as number 4. Additionally, I would like this footnote to be listed alongside other footnotes in the footnote list at the end of the document.

Instead, unfortunately, my figure title footnote is rendered in isolation (i.e. it starts at 1 and generates its own separate footnote list). In other words, my figure title footnote seems to have no knowledge about the other footnotes within the document.

Within my extension, I am using the internal renderer to render the figure title:

this.env.md.render(figure.title)

This works perfectly for other use cases (e.g. bold, italic, and even equations via markdown-it-mathjax). But it does not work as expected / desired for footnotes.

Solution?

I can imagine ways to solve this problem (e.g. post-processing outside markdown-it, where I gather all the footnotes and adjust the HTML as desired).

But before charging ahead with this (or some other) crude solution, is there a better way? Perhaps there's some feature of markdown-it-footnote or the markdown-it internals that would provide a more elegant solution?

Any advice is much appreciated! markdown-it is a great tool!

GerHobbelt commented 3 years ago

Just a thought: I would start by looking into other plugins and how they do their thing. The markdown-it parse() or parseInline() calls, etc. are meant to be used by userland code that takes markdown as a given and a blackbox; you on the other hand are writing a plugin, so there's other stuff to look at instead.

The key then becomes to parse those figure titles while using the very same env environment as used for the main text. Reason being: the footnote plugin tracks the list of footnotes in env.footnotes.list[] and thus SHOULD assign the next id to your title footnote, depending on when you invoke any parse() API in your own plugin.

BTW: You can pass an environment object reference as a parameter to these calls: see the documentation and source code (RTFC 😄)

Just an idea, not tested, so it's into the rabbit hole after this. 🤯

PS: the key is to separate the parse() calls from the render() calls in your mind: first you collect the entire AST (token stream in the case of markdown-it), then only when all is done, you invoke render() with the thusly constructed token stream.

See the code and docu: Renderer.render(tokens, options, env) -> String

Also note how the footnote plugin is organized itself: there's a bunch of handlers for the parse phase, and then there's another bunch for rendering the Tokens produced by the parse phase calls. The whole shebang is registered (hooked up) by a set of md.block.ruler... and md.inline.ruler... calls.

PPS: I'm working on an augmented version of this plugin, where I discovered I would only get my way if I changed the footnote parse behaviour a little by making sure all footnotes are completely decoded (including their 'mode' metadata which is specific to my derivative). The key bits of knowledge gained there were that:

footnote's internal footnote_tail call only executes once all (block+inline) parsing of the input has completed: markdown-it first runs through the input text, chopping it up into paragraph-level blocks, and only invokes the inline parse calls once all blocks have been detected. This makes markdown-it a multipass parser in my book; semantics aside, this impacts the order in which footnote assigns id sequence numbers to your footnotes. (If you want the details, see the markdown-it source code: look for the class Core definition and the line [ [ "normalize", normalize ], [ "block", block ], [ "inline", inline ], [ "replacements", replacements ], [ "smartquotes", smartquotes ] ], which, together with the core code, specifies this behaviour.)

Key realization there is that anything you attach to a phase in that list (e.g. "replacements") is only executed once everybody has had their previous phases' registered callbacks executed, i.e. markdown-it does a full "normalize" pass, then a "block" pass, then an "inline" pass (on the tokens produced by the "block" pass), and so on: 5 parse passes in all, before the renderer gets invoked.

Important corollary for you then is this: if you want anything footnote-y to happen to any content you provide through your custom plugin, then you'd better make sure the footnote plugin gets to see those bits of content in either the block or the inline parse phase, or both. 😉 It might therefore be very interesting to observe how markdown-it processes ![] image tags, as those also may come with title texts, which MAY contain/reference footnotes and other markdown-y stuff.

markdown-it's core inline handler, responsible for executing the inline chunk parse phase after that block parse action, only looks for inline tokens to process. 🐉 Drat! 👿 The rest is reached only when addressed explicitly by the block-level "plugin" callbacks themselves. That was a bit of a nasty surprise for me, as I expected different behaviour before I dived in.
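To make the pass ordering concrete, here is a toy model of the idea. It is not markdown-it's code: runCoreToy, the two-entry chain, and the regex-based footnote matching are all made up for illustration, and only the "block" and "inline" phases are modelled.

```javascript
// Toy model only: NOT markdown-it's real implementation.
// It mimics a core chain in which each phase runs to completion over the
// whole document before the next phase starts.
function runCoreToy(src) {
  const log = [];
  const state = { src, tokens: [], env: { footnotes: [] } };

  const chain = [
    // "block" pass: chop the input into paragraph-level chunks.
    ['block', (s) => {
      for (const chunk of s.src.split(/\n{2,}/)) {
        s.tokens.push({ type: 'inline', content: chunk, children: [] });
        log.push('block');
      }
    }],
    // "inline" pass: only now are ^[...] footnotes seen and numbered.
    ['inline', (s) => {
      for (const token of s.tokens) {
        for (const match of token.content.matchAll(/\^\[([^\]]*)\]/g)) {
          s.env.footnotes.push(match[1]);
          token.children.push({ type: 'footnote_ref', id: s.env.footnotes.length });
        }
        log.push('inline');
      }
    }],
  ];

  for (const [, rule] of chain) rule(state);
  return { state, log };
}

const toyResult = runCoreToy('First^[a] paragraph\n\nSecond^[b] paragraph');
console.log(toyResult.log.join(' > ')); // block > block > inline > inline
console.log(toyResult.state.env.footnotes); // [ 'a', 'b' ]
```

In this model every chunk is detected before any footnote gets a number, which is the ordering effect described above.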

Anyway, check the markdown-it code for how the image handler (look for // Process ![image](<src> "title") in the codebase) does this, as that comes pretty close to what you might want to do yourself. Do note that the activity in there includes a call to state.md.inline.parse(...), which is absolutely critical to make all the registered plugins work on the text the image parser extracts (as far as I can tell from the source, that is the bracketed label, i.e. the alt text, rather than the quoted title) like they would on any other markdown text. That's how footnotes in that text automatically get picked up, as will anything else from any other registered plugin. Technically, this is the image handler quickly executing a local inline-level parse on text (the label in this case) that the "official" core inline parse phase would not otherwise tokenize separately (unless your plugin happens to spit out "inline" Tokens itself, but that would be truly hacky!)

Key take-away here is: your plugin must have a block level handler a la image processor and that one can then grab your JSON, extract the title and what-not from there and call the inline parser for that one by way of state.md.inline.parse(...) just like the image component does.

In other words: I'd expect a line with md.block.ruler.before(...) or similar API usage to register a handler of yours in your custom plugin code, and the function you register there should execute that state.md.inline.parse(title, ...) sometime somehow.

❗ Oh, do NOT forget to pass the environment to that inline parse call, or footnote won't have access to the active environment like you'd want it to.

And heed the comment/warning in the footnote code if you want to use the env environment for your own purposes 😉 :
```
  // inline blocks have their own *child* environment in markdown-it v10+.
  // As the footnotes must live beyond the lifetime of the inline block env,
  // we must patch them into the `parentState.env` for the footnote_tail
  // handler to be able to access them afterwards!
```
If your only worry is footnotes (and other plugins) working in there, then that code comment blurb is not important, but whatever you do, this bit is: here's how to properly invoke an inline parser in your own block level element plugin:
```
  // Collect the title's inline tokens, sharing the document's env.
  const tokens = [];
  state.md.inline.parse(content, state.md, state.env, tokens);
  // nesting 0: self-closing, i.e. no start+end Tokens, just this one
  const token = state.push("custom", "HTMLTAG", 0);
  token.children = tokens;
  // ^^^ or whatever you want to do so your own plugin render function can do its job
  token.content = content;
  // ^^^ for diagnostics and general niceness. Not MANDATORY AFAICT...
```
(code ripped and tweaked straight from markdown_it itself; how you collect and store your title's tokens is up to you and your render function, but I would generally follow the same steps as the image processor and the footnote plugin.)
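Fleshing that out a little, a block-level rule along these lines might look like the sketch below. Treat it as untested against the real libraries. The {{figure:ID}} marker line, the 'figure' rule and token names, the use of token.meta, and the HTML shape are all hypothetical choices made for this example; only md.block.ruler.before, state.push, state.md.inline.parse, md.renderer.rules, renderInline and md.utils.escapeHtml are taken from markdown-it itself.

```javascript
// Sketch only. Hypothetical: the {{figure:ID}} marker syntax, the 'figure'
// rule/token names, and the rendered HTML are assumptions for illustration.
function figurePlugin(md, attachments) {
  const byId = new Map(attachments.map((a) => [String(a.id), a]));

  function figureRule(state, startLine, endLine, silent) {
    const start = state.bMarks[startLine] + state.tShift[startLine];
    const line = state.src.slice(start, state.eMarks[startLine]);
    const match = /^\{\{figure:(\d+)\}\}\s*$/.exec(line);
    if (!match) return false;

    const figure = byId.get(match[1]);
    if (!figure) return false;
    if (silent) return true; // validation mode: report a match, emit nothing

    // Parse the title as inline markdown, passing the document's own env so
    // other plugins (e.g. footnotes) see the same state as the main text.
    const children = [];
    state.md.inline.parse(figure.title, state.md, state.env, children);

    const token = state.push('figure', 'figure', 0);
    token.children = children;
    token.content = figure.title;
    token.meta = { figure };
    token.map = [startLine, startLine + 1];

    state.line = startLine + 1; // consume the marker line
    return true;
  }

  md.block.ruler.before('paragraph', 'figure', figureRule);

  // Render the already-parsed title tokens instead of calling md.render().
  md.renderer.rules.figure = (tokens, idx, options, env, self) => {
    const token = tokens[idx];
    const src = md.utils.escapeHtml(token.meta.figure.url);
    const caption = self.renderInline(token.children, options, env);
    return `<figure><img src="${src}" alt=""><figcaption>${caption}</figcaption></figure>\n`;
  };
}
```

Assuming both packages are installed, this would presumably be wired up with require('markdown-it')().use(require('markdown-it-footnote')).use(figurePlugin, attachments). One caveat I am not sure about: because the title is parsed during the block pass, its footnote may well be assigned its id before the footnotes in ordinary paragraphs, which are only parsed in the later inline pass, so the numbering is worth checking.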
I discovered that the quickest way for me to uncover the code flow in markdown-it, when you need to know the internals rather intimately, is to dump a couple of stack traces along the way. That gives a feel for what is happening when & where exactly. I use a dumb bit of code like this, inserted at various spots of interest in the markdown-it code (just edit the files in your node_modules/markdown-it/... tree; you can always rm -rf node_modules/ && npm i to reinstall everything once you're done):
```
try {
  throw new Error('x'); // quick & dumb: produce a stacktrace for anyone
} catch (ex) {
  console.error(ex);
}
```

Ok, that's my 50 cents. Hope it helps you on your way and have fun!

rlidwka commented 3 years ago

As described above, use state.md.inline.parse.

If you use this.env.md.render(figure.title), you run the markdown parser as if figure.title were a completely new document. But state.md.inline.parse continues parsing the existing document with the old context; that's the difference.
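A toy illustration of that difference, not using markdown-it at all: parseInlineToy is a made-up stand-in that numbers ^[...] footnotes from a list kept in whatever env object it is given, loosely mirroring the env.footnotes.list idea mentioned earlier in the thread.

```javascript
// Toy stand-in for an inline parser (hypothetical, NOT markdown-it's API):
// every ^[...] is numbered according to the footnote list stored in `env`.
function parseInlineToy(src, env) {
  env.footnotes = env.footnotes || { list: [] };
  return src.replace(/\^\[([^\]]*)\]/g, (whole, text) => {
    env.footnotes.list.push(text);
    return `[${env.footnotes.list.length}]`;
  });
}

const docEnv = {};
const bodyText = parseInlineToy('A^[one] B^[two] C^[three]', docEnv);
const figureTitle = 'Figure 1: Example Title^[Short example footnote]';

// Fresh env: behaves like rendering the title as a brand-new document.
const isolatedTitle = parseInlineToy(figureTitle, {});
// Shared env: continues the numbering of the surrounding document.
const sharedTitle = parseInlineToy(figureTitle, docEnv);

console.log(bodyText);      // A[1] B[2] C[3]
console.log(isolatedTitle); // Figure 1: Example Title[1]
console.log(sharedTitle);   // Figure 1: Example Title[4]
```

With the shared env the title's footnote also ends up in the same list as the other three, which is what lets a single footnote list be generated at the end.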

coryschires commented 3 years ago

@GerHobbelt @rlidwka Thanks! This is all very helpful! I'm gonna dig in and see what I can figure out. Based on these comments, I'm feeling confident my problem is solvable without having to do post-processing outside markdown-it.

But it's complex stuff and may take a while to find the best method. I'll follow up on this thread with whatever I learn.
