Python-Markdown / markdown

A Python implementation of John Gruber’s Markdown with Extension support.
https://python-markdown.github.io/
BSD 3-Clause "New" or "Revised" License

Support issue: annotating output tags with data attributes specifying the filename/line(s) responsible for the html output #1235

Closed eddiezab closed 2 years ago

eddiezab commented 2 years ago

I'm looking to understand if there is a current option or approach for referencing back to the source markdown from the output HTML.

For example, a source document like:

# Heading 1

> To be or not to be
> that is the question

might render out to something like this:

<h1 data-source-file="/tmp/demo.md" data-source-line="1">Heading 1</h1>
<blockquote data-source-file="/tmp/demo.md" data-source-line="3">
<p data-source-file="/tmp/demo.md" data-source-line="3">To be or not to be
that is the question</p>
</blockquote>

facelessuser commented 2 years ago

I don't think the parser allows for accurate line tracking in any real way. On top of that, Python Markdown is file-agnostic: it operates on buffers, not files. You can certainly use it from the command line and give it a file, but that just reads the file's content and runs the resulting buffer through the API. The API doesn't care where the buffer came from.
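As a rough illustration of the file-agnostic point, the sketch below converts the example text once from an in-memory string and once after a round trip through a temporary file. It uses only `markdown.markdown()`; the file name `demo.md` and the variable names are illustrative assumptions, not anything from the issue.

```python
import tempfile
from pathlib import Path

import markdown

# The example document from the question, held as a plain string (a "buffer").
text = "# Heading 1\n\n> To be or not to be\n> that is the question\n"

# Convert the in-memory string directly.
html_from_string = markdown.markdown(text)

# Round-trip the same text through a file, then convert what was read back.
# The converter only ever receives the string; the path is never passed to it.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.md"  # hypothetical file name
    path.write_text(text, encoding="utf-8")
    html_from_file = markdown.markdown(path.read_text(encoding="utf-8"))

print(html_from_string)
```

Both calls receive the same string, so neither result can carry a file name unless the caller adds one afterwards.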

waylan commented 2 years ago

This is absolutely not possible. In addition to the issues mentioned by @facelessuser, there are the preprocessors, which remove various types of content (such as link references) from the buffer before passing it to the block parser. Therefore, by the time the block parser receives the buffer, the lines would not match the lines from the source file. Finally, the block parser steps through the lines by popping them off the top until the buffer is empty. Therefore, as far as the parser is concerned, it is always operating on line 1.

In short, tracking the source lines was never a concern when the parser was designed. Therefore, it would be impossible to add support without completely replacing the parser with a different one.
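
The two effects described above can be sketched with a toy model. This is not Python-Markdown's actual code: `strip_references`, `parse_blocks`, and `REFERENCE_RE` are hypothetical names made up for illustration, and the real parser is considerably more involved. The sketch only assumes what the comment states: a preprocessing step removes some lines, and the parser consumes the buffer from the top.

```python
import re

# Hypothetical, simplified pattern for link reference definitions,
# e.g. "[home]: https://example.com".
REFERENCE_RE = re.compile(r"^\[[^\]]+\]:\s+\S+")


def strip_references(lines):
    """Toy preprocessor: drop link reference definitions from the buffer."""
    return [line for line in lines if not REFERENCE_RE.match(line)]


def parse_blocks(lines):
    """Toy block parser: consume the buffer by popping lines off the top."""
    blocks = []
    while lines:
        # Only index 0 is ever inspected; nothing records how many
        # lines have already been consumed.
        line = lines.pop(0)
        if line.strip():
            blocks.append(line)
    return blocks


source = [
    "[home]: https://example.com",
    "",
    "# Heading 1",
    "",
    "> To be or not to be",
]

buffer = strip_references(source)

# The heading sits on line 3 of the source but line 2 of the buffer
# that the block parser actually receives.
heading_line_in_source = source.index("# Heading 1") + 1
heading_line_in_buffer = buffer.index("# Heading 1") + 1

blocks = parse_blocks(list(buffer))
print(heading_line_in_source, heading_line_in_buffer, blocks)
```

In this model the offset is lost before parsing even starts, and the parsing loop itself has no notion of a current line number, which is the shape of the problem described in the comment.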