markedjs / marked

A markdown parser and compiler. Built for speed.
https://marked.js.org

Support for chunked rendering #3315

Closed calculuschild closed 1 month ago

calculuschild commented 3 months ago

What pain point are you perceiving? Following the PR proposed in Marked-GFM-Headings (https://github.com/markedjs/marked-gfm-heading-id/pull/543), there is a notion of Marked sometimes being used in a "chunked" manner, i.e. parsing individual pages out of a longer document. This can be handy for live-editing services, for example, where realtime updating of the rendered output may become slow for very large documents; re-parsing only the current page keeps things fast.

Chunking can be done external to Marked by manually handling, say, an array of markdown source chunks and an array of output HTML chunks, updating each array entry as needed. However, difficulties arise when knowledge of the complete document is needed, for example for reflinks, where a link defined in a later chunk may be referenced in an earlier one. The PR in Marked-GFM-Headings has a similar intent of ensuring globally unique header IDs across chunks.
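As a rough sketch of that external-chunking approach (the `renderChunk` function here is a stand-in for `marked.parse`, since the point is the per-chunk bookkeeping, not the markdown conversion itself):

```javascript
// Minimal sketch of chunked rendering done outside of Marked.
// `renderChunk` is a placeholder renderer standing in for marked.parse.
const renderChunk = (src) => `<p>${src}</p>`;

const sourceChunks = ['# Page 1', '# Page 2', '# Page 3'];
const htmlChunks = sourceChunks.map(renderChunk);

// When the user edits one page, only that chunk is re-rendered.
function updateChunk(index, newSource) {
  sourceChunks[index] = newSource;
  htmlChunks[index] = renderChunk(newSource);
}

updateChunk(1, '# Page 2 (edited)');
const fullDocument = htmlChunks.join('\n');
```

This works until a chunk needs information from another chunk, which is exactly the reflink problem described above.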

Describe the solution you'd like: I don't know exactly how this would work (or whether it is already possible), but I'm picturing some way of marking certain data as persistent across chunks, with knowledge of which chunk it came from so it can be updated as needed. Marked would also need some way to communicate which chunk is currently being rendered.

For example, the lexer.tokens.links object would need an option to be made persistent across chunks, and perhaps each link entry could be labeled with a chunk ID so that if that chunk is re-rendered (perhaps a reflink was removed), the global list can be updated properly.
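One possible shape for that bookkeeping, sketched with a hypothetical `LinkRegistry` (the name and API are illustrative, not part of Marked): each chunk's link definitions are stored under its chunk ID, so re-rendering a chunk replaces only its own entries before the per-chunk maps are merged into the global view.

```javascript
// Hypothetical cross-chunk registry for reflink definitions.
// Not part of Marked; just one way the persistence could work.
class LinkRegistry {
  constructor() {
    this.byChunk = new Map(); // chunkId -> { label: { href, title } }
  }

  // Replace all link definitions contributed by one chunk.
  setChunkLinks(chunkId, links) {
    this.byChunk.set(chunkId, links);
  }

  // Merge every chunk's definitions into one global links object,
  // i.e. the shape a lexer would expect in lexer.tokens.links.
  globalLinks() {
    const merged = {};
    for (const links of this.byChunk.values()) {
      Object.assign(merged, links);
    }
    return merged;
  }
}

const registry = new LinkRegistry();
// Chunk 2 defines [docs]; chunk 1 can still resolve it globally.
registry.setChunkLinks(2, { docs: { href: 'https://marked.js.org', title: null } });
const links = registry.globalLinks();
// Re-rendering chunk 2 after the reflink was removed drops only its entries.
registry.setChunkLinks(2, {});
const linksAfter = registry.globalLinks();
```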

It would be nice to have a standard template for this type of object so that extensions, etc., have a common interface for cross-chunk support.

UziTech commented 3 months ago

Maybe we could add a hook that can change the Lexer before it is used.

```javascript
let links;

// Proposed hook: adjust the lexer before a chunk is lexed,
// e.g. seed it with links saved from previous chunks.
processLexer(lexer) {
  lexer.tokens.links = links;
  return lexer;
}
// Existing hook: runs after lexing; save the accumulated links.
processAllTokens(tokens) {
  links = tokens.links;
}
```