Closed Lordfirespeed closed 1 year ago
@Lordfirespeed, good catch, thank you. Would you mind also adding a unit test for this? I'm struggling a bit to find a use case where a Fragment would be created with text including \n, i.e. when the problematic part of the code would be called "in real life". (Or maybe @anderskaplan, as the author of MarkdownRenderer, will know? :))
My case is a bit of an odd one, apologies!
I wanted to parse LaTeX so that I could use regex inside LaTeX blocks to replace command sequences, e.g.
\R -> \mathbb{R}
\rArr -> \Rightarrow
I was trying to migrate my notes from KaTeX to MathJax.
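For illustration, the kind of replacement described here could be sketched as follows. This is a minimal sketch, not the code actually used: the mapping holds only the two commands mentioned above, and the names REPLACEMENTS, PATTERN and migrate_commands are hypothetical.

```python
import re

# Hypothetical mapping from KaTeX shorthands to MathJax-friendly commands
# (only the two examples mentioned in the comment above).
REPLACEMENTS = {
    r"\R": r"\mathbb{R}",
    r"\rArr": r"\Rightarrow",
}

# Match a listed command only when no letter follows it, so that \R does not
# match the beginning of a longer command such as \Rightarrow.
PATTERN = re.compile(
    "(?:" + "|".join(re.escape(cmd) for cmd in REPLACEMENTS) + ")(?![A-Za-z])"
)


def migrate_commands(latex: str) -> str:
    """Replace each known shorthand in `latex` with its mapped command."""
    # A function is used as the replacement so that backslashes in the
    # mapped commands are inserted literally.
    return PATTERN.sub(lambda match: REPLACEMENTS[match.group(0)], latex)


print(migrate_commands(r"f: \R \rArr \R"))
```

The sketch does not handle edge cases such as a LaTeX line break (\\) immediately followed by R.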
I 'wrote' this very basic class:
from itertools import chain

from mistletoe.latex_token import Math
from mistletoe.markdown_renderer import MarkdownRenderer


class MathMarkdownRenderer(MarkdownRenderer):
    def __init__(self, *extras, **kwargs):
        # Register the Math token ahead of any extra tokens passed by the caller.
        super().__init__(
            *chain(
                (Math,),
                extras,
            ),
            **kwargs
        )

    def render_math(self, token):
        # Emit the math block unchanged, as raw text.
        return self.render_raw_text(token)
And used it like so:
from mistletoe import Document as MarkdownDocument
from math_markdown_renderer import MathMarkdownRenderer

with MathMarkdownRenderer() as renderer:
    with open(file, "r") as infile:
        document = MarkdownDocument(infile)
    # I made changes to the parsed document here
    render = renderer.render(document)
I found that multiline LaTeX would almost always contain newline characters, e.g. the following:
$$
a=b
$$
would yield a Math token with the following content:
"$$\na=b\n$$"
So, in short, yes, I can add a unit test; but not without also adding functionality to the library 😅
Good catch!
The easiest way to trigger the bug in a unit test would be to modify one of the existing tests. For example:
def test_images_and_links(self):
    input = [
        "[a link](#url (title))\n",
        "[another link](<url-in-angle-brackets> '*emphasized\n",
        "multi-line\n",
        "title*')\n",
        '![an \\[*image*\\], escapes and emphasis](#url "title")\n',
        "<http://auto.link>\n",
    ]
    output = self.roundtrip(input)
    self.assertEqual(output, "".join(input))
(The only change is that the line "multi-line" has been added. The bug will only trigger when there are more than two rows.)
@anderskaplan, thanks a lot, great to have you back! :)
@Lordfirespeed, so I guess you can just add "multi-line\n", to the existing unit test as Anders suggests, and that's it. I would merge the PR afterwards.
I'm not sure that particular example would cause the bug to happen, as [0] and [-1] are both handled specifically, so the multiline fragment needs to contain at least two newlines - but I get what you mean! I'll get on that.
Edit: Oh! sorry, never mind. I was confused.
All done! I have not confirmed the unit test functions as intended as I don't have a codespace open at the moment.
Thank you!
When using the Markdown renderer, if a fragment containing newline characters is encountered, it is split by \n and each line is yielded in turn. However, with a quick check in the Python REPL we find:
Notably, d, the penultimate element, is being skipped. This can be fixed by changing the bounds of the 'inner' slice from [1:-2] to [1:-1].
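The off-by-one described above can be reproduced outside the library with a minimal sketch. The helper names split_lines_buggy and split_lines_fixed and the five-line sample are assumptions made for illustration; this is not mistletoe's actual code, only the same first / inner slice / last pattern.

```python
def split_lines_buggy(text: str) -> list[str]:
    """Split on newlines, handling first and last lines separately (buggy)."""
    lines = text.split("\n")
    # The inner slice stops one element too early and drops the
    # penultimate line.
    return [lines[0], *lines[1:-2], lines[-1]]


def split_lines_fixed(text: str) -> list[str]:
    """Same pattern with the corrected inner slice."""
    lines = text.split("\n")
    # [1:-1] covers everything between the first and the last line.
    return [lines[0], *lines[1:-1], lines[-1]]


# Hypothetical sample; assumes the text contains at least one newline.
sample = "a\nb\nc\nd\ne"
print(split_lines_buggy(sample))
print(split_lines_fixed(sample))
```

With this sample the buggy variant loses d, and with a three-line input it loses the middle line, which is why the modified unit test (one extra line inside the multi-line title) is enough to trigger it.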