miyuchina / mistletoe

A fast, extensible and spec-compliant Markdown parser in pure Python.
MIT License

Record line numbers on tokens #144

Open djmattyg007 opened 2 years ago

djmattyg007 commented 2 years ago

I'm writing a tool to parse markdown files and verify that links are valid. To be able to provide the most valuable, accurate feedback, I need to be able to display line numbers to the user, so that they know exactly where in a file the broken link is.

Right now this isn't possible, because line numbers aren't recorded on token objects.

pbodnar commented 2 years ago

@djmattyg007, I think that what you propose would be a pretty cool feature. In principle, it shouldn't be hard to implement, yet it would probably require some broader refactoring in order to track and store the context information (line number, and the column would be nice too). I would also watch out for a possible impact on performance, but I don't expect any human-noticeable difference...

I would like to look at this one day, but I don't have much time these days, so, as usual, PRs are welcome in the meantime. :)

anderskaplan commented 1 year ago

Hi, I have a somewhat similar use case, where I'd like to know the position of translatable content in the Markdown document.

I've started on an implementation. It turns out that block tokens are fairly straightforward, but span tokens are more difficult.

@djmattyg007: If you could get the starting line number of the paragraph containing the link, would that be of any use to you? Or do you need the line number of the link?

djmattyg007 commented 1 year ago

I need the line number (and ideally the offset on the given line).
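Until tokens carry positions, one possible workaround is to scan the raw text and convert character offsets into line and column numbers. The sketch below uses only the standard library and does not use mistletoe at all. The names `INLINE_LINK`, `line_starts` and `find_links` are made up for illustration, and the regular expression is a deliberately simplified assumption that does not cover reference links, autolinks, or links inside code spans.

```python
import bisect
import re

# Simplified pattern for inline links like [text](target); not spec-compliant.
INLINE_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def line_starts(text):
    """Return the character offset at which each line of text begins."""
    starts = [0]
    for match in re.finditer(r"\n", text):
        starts.append(match.end())
    return starts


def find_links(text):
    """Yield (line, column, target) for each inline link, both 1-based."""
    starts = line_starts(text)
    for match in INLINE_LINK.finditer(text):
        # Index of the last line start that is <= the match offset.
        index = bisect.bisect_right(starts, match.start()) - 1
        yield index + 1, match.start() - starts[index] + 1, match.group(2)
```

Calling `list(find_links(source))` on a file's contents gives one tuple per link, which a checker could use to report a broken target as `file.md:LINE:COLUMN`. Because it bypasses the parser, this approach can report false positives that a real span token would not.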

anderskaplan commented 1 year ago

Ok, I have a PR in the works for line numbers on block tokens which may or may not be useful for you. But it depends on #172, so it will have to wait until that one is merged.

pbodnar commented 1 year ago

@anderskaplan, while implementing the MarkdownRenderer recently, did you happen to investigate this feature request any further? I think it would be pretty cool, and possibly another distinguishing trait of mistletoe. :)

pbodnar commented 9 months ago

Thanks to @anderskaplan, we've got this implemented for block tokens in #188. It still doesn't fully cover the original request though (finding the line number for any token, notably a span token), so I will leave this issue open for now.
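With a starting line available on block tokens, a span's position can be approximated by searching within the enclosing block. This is only a rough sketch under stated assumptions: `Block` is a hypothetical stand-in, not mistletoe's token API, and it assumes the block's unmodified source text is available, which a parsed token may not retain.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a parsed block token (an assumption)."""
    line_number: int  # 1-based line on which the block starts
    raw: str          # the block's source text, newlines included


def locate_in_block(block, needle):
    """Return 1-based (line, column) of needle's first occurrence, or None."""
    offset = block.raw.find(needle)
    if offset < 0:
        return None
    before = block.raw[:offset]
    # Each newline before the match moves it one line below the block start.
    line = block.line_number + before.count("\n")
    # rfind returns -1 when the match is on the block's first line.
    column = offset - (before.rfind("\n") + 1) + 1
    return line, column
```

The result is only as good as the `raw` text: inside block quotes or list items, where the parser strips prefixes before handing content to span parsing, the column (and the match itself) may be off.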