Closed jsm28 closed 8 months ago
Thats a great contribution, thank you! I only removed the looping over the text with a re.sub, hope I did not introduce more complexity with that.
This is a bad change and bad code that breaks a variety of things; it feels a lot like over-escaping.
E.g. even regular links like this https://admin.microsoft.com/AdminPortal/Home#/partners/invitation/granularAdminRelationships/ffffffffff-4440-40e3-a609-ffffffff-ffffffffff-ef31-4ef0-a6b8-ffffffffff
now become completely unusable – though no markdown rendering engine would interpret any of the characters of markdown-significance https://admin.microsoft.com/AdminPortal/Home\\#/partners/invitation/granularAdminRelationships/ffffffffff\\-4440\\-40e3\\-a609\\-ffffffff\\-ffffffffff\\-ef31\\-4ef0\\-a6b8\\-ffffffffff\\
Especially setting escape_misc
to True
by default(!) and breaking already-implemented production usages of this library is an absolute faux-pas.
It is so bad that just 13 days after this PR was handed in, a new PR https://github.com/matthewwithanm/python-markdownify/pull/122 was submitted to fix what this one so heavy handedly "implemented".
Agree with @alfonsrv, the default should be set to False.
There are many punctuation characters that sometimes have significance in Markdown; more systematically escape them all (based on a new escape_misc configuration option).
A limited attempt is made to limit the escaping of '.' and ')' to the context where they might have Markdown significance (after a number, where they can indicate an ordered list item); no such attempt is made for the other characters (and even that limiting of '.' and ')' may not be entirely safe in all cases, as it's possible the HTML could have the number outside the block being escaped in one go, e.g.
<span>1</span>.
.Fixes #99