sras closed this issue 6 years ago
No, the parser does not keep track of this under the hood. Once a section of text is parsed, the original source is discarded. Supporting such a feature would require a complete rewrite of the internals.
As a reminder, Python-Markdown is a relatively old parser. It was developed back when system resources were much scarcer than they are today. Not storing the entire Markdown source was considered a "feature" because it ensured that the parser used less memory. Of course, today that is much less of an issue, but a complete rewrite would be a lot of work for very little gain. I'm not certain, but some of the newer parsers out there may support such a feature.
I would welcome this feature as well. In ReText, there is synchronized scrolling for source and HTML pages, and we currently implement that using a hack. Having that implemented properly in Python-Markdown would be awesome.
From my previous experiments, it would be quite easy for the parser to add line information to the tree, but the preprocessors (especially third-party ones) are the tricky part.
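To illustrate why preprocessors are the tricky part, here is a minimal sketch. Python-Markdown preprocessors receive the source as a list of lines and return a new list, so any step that adds or removes lines silently shifts every later line number. The two functions below are hypothetical stand-ins written for this example, not classes from the library, and the `%%` comment syntax is invented. The second variant assumes each line carries its original number through the step, which is the kind of contract every third-party preprocessor would have to honour.

```python
from typing import List, Tuple


def strip_comment_lines(lines: List[str]) -> List[str]:
    """Hypothetical preprocessor step: drop lines that start with '%%'."""
    return [line for line in lines if not line.startswith("%%")]


def strip_comment_lines_tracked(
    numbered: List[Tuple[int, str]]
) -> List[Tuple[int, str]]:
    """Same step, but each line keeps its original 1-based line number."""
    return [(n, line) for n, line in numbered if not line.startswith("%%")]


source = ["%% draft note", "# Title", "", "Body text."]

# Plain variant: the heading now appears to be on line 1, not line 2.
plain = strip_comment_lines(source)
apparent_line = plain.index("# Title") + 1

# Tracked variant: the original line number survives the step.
tracked = strip_comment_lines_tracked(list(enumerate(source, start=1)))
original_line = tracked[0][0]
```

Here `apparent_line` and `original_line` disagree, which is exactly the drift a line-aware parser would inherit from any preprocessor that does not track positions.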
@waylan
In my case, I don't need the mapping to be as granular as possible. If there is a ul in the generated HTML, I only want the ul to be mapped to the entire corresponding block in the Markdown source. I don't have to map each li to its source.
So just mapping top-level elements to sections of the Markdown source would work.
I tried splitting the source Markdown on "\n\n" and feeding each of the pieces to markdown, thus obtaining a mapping. But I ran into cases such as double newlines inside embedded HTML, so I am looking for a more reliable method...
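The blank-line splitting approach can be sketched roughly as follows, using only the standard library. The names `Chunk`, `map_chunks`, and `fake_render` are made up for this example; in real use the `render` argument would be something like `markdown.markdown`. The sample input also reproduces the failure mode: the embedded `<div>` contains blank lines, so it is cut into three separate chunks.

```python
from typing import Callable, List, NamedTuple


class Chunk(NamedTuple):
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    source: str
    html: str


def map_chunks(text: str, render: Callable[[str], str]) -> List[Chunk]:
    """Split on blank lines, render each chunk, and record its line span."""
    chunks: List[Chunk] = []
    buf: List[str] = []
    start = 0
    # The trailing "" is a sentinel that flushes the last chunk.
    for number, line in enumerate(text.split("\n") + [""], start=1):
        if line.strip():
            if not buf:
                start = number
            buf.append(line)
        elif buf:
            source = "\n".join(buf)
            chunks.append(Chunk(start, number - 1, source, render(source)))
            buf = []
    return chunks


def fake_render(source: str) -> str:
    """Stand-in renderer for the example; swap in markdown.markdown."""
    return "<rendered>" + source + "</rendered>"


sample = "# Title\n\n- a\n- b\n\n<div>\n\nx\n\n</div>\n"
chunks = map_chunks(sample, fake_render)
spans = [(c.start_line, c.end_line) for c in chunks]
```

With this input, `spans` is `[(1, 1), (3, 4), (6, 6), (8, 8), (10, 10)]`: the list maps cleanly to lines 3 to 4, but the `<div>` block is split at lines 6, 8, and 10 instead of being treated as one unit.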
While doing markdown to html conversion, is there any way to attach reference to the corresponding section of original markdown source in every node of generated html?