utsingh / pagedown

Automatically exported from code.google.com/p/pagedown

In-word em/strong not parsed correctly #57

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
> c.makeHtml( 'f**o**o' )
actual: '<p>f*<em>o</em>*o</p>'
expected: '<p>f<strong>o</strong>o</p>'

> c.makeHtml( 'f*o*o' )
actual: '<p>f*o*o</p>'
expected: '<p>f<em>o</em>o</p>'

Original issue reported on code.google.com by pkoszulinski on 13 Apr 2013 at 7:24

GoogleCodeExporter commented 8 years ago
That's intentional. See 
http://blog.stackoverflow.com/2008/06/three-markdown-gotcha/ and 
http://blog.stackoverflow.com/2009/10/markdown-one-year-later/

Original comment by b...@stackoverflow.com on 13 Apr 2013 at 7:28

GoogleCodeExporter commented 8 years ago
I was afraid that this was intentional ;( I came across Jeff's post about it 
while looking for others who agree with me that MD is f***ed up.

A few weeks ago I started a small project: a Markdown plugin for CKEditor that 
will produce MD output and parse MD input. So basically I need to implement an 
HTML->MD converter that cooperates with some of the existing MD parsers. I 
thought this would be easy, because thanks to CKE's tools I've got a beautiful 
lightweight DOM representation of the data I need to convert and, being a CKE 
core dev, a lot of experience with processing HTML.

The only assumption I made is that I won't mix HTML with MD, because otherwise 
I could just produce HTML and say it is MD... :D The only reason I see MD as 
useful in CKE's case is that the editor's output couldn't contain malicious 
HTML (given the assumption I made).

I hadn't even started properly when I realised that, because of MD's poor 
specification, the number of quirks, the number of versions of every format (I 
wouldn't be surprised if a link can be formatted in more than 10 ways), 
dialects, varying parsers and so on, this will be a nightmare :) And it is...

Regarding this case - what's the MD equivalent of <p>f<em>o</em>o</p>? Maybe 
I'm missing something, but I can't find one.

I have to rethink whether it makes any sense to create such a plugin for 
CKEditor. But I'm afraid it does not, because I would first have to make my own 
MD fork (standardized, simplified and HTML-compliant), which is ridiculous.

Original comment by pkoszulinski on 13 Apr 2013 at 8:15
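
To make the question above concrete, here is a minimal sketch of how an HTML->MD serializer could at least detect emphasis it cannot express without raw HTML. The node shape, `MARKERS`, `plainText` and `inlineToMarkdown` are all hypothetical names invented for illustration; this is not CKEditor's API, and it does no escaping of literal `*` characters in text.

```javascript
// Hypothetical node shape (an assumption, not CKEditor's API):
//   { type: "text", value: "..." }
//   { type: "em" | "strong", children: [ ...nodes ] }
const MARKERS = { em: "*", strong: "**" };

// Flatten a node to its text content.
function plainText(node) {
  return node.type === "text"
    ? node.value
    : node.children.map(plainText).join("");
}

// Serialize inline nodes to Markdown. Emphasis nodes that touch a word
// character on either side are collected in `problems`, because a parser
// that requires word boundaries around the markers would not read them
// back as emphasis.
function inlineToMarkdown(nodes) {
  const problems = [];
  const parts = nodes.map((node, i) => {
    if (node.type === "text") return node.value;
    const before = i > 0 ? plainText(nodes[i - 1]).slice(-1) : "";
    const after = i < nodes.length - 1 ? plainText(nodes[i + 1]).charAt(0) : "";
    if (/\w/.test(before) || /\w/.test(after)) problems.push(plainText(node));
    const inner = inlineToMarkdown(node.children);
    problems.push(...inner.problems);
    const marker = MARKERS[node.type];
    return marker + inner.markdown + marker;
  });
  return { markdown: parts.join(""), problems };
}
```

For `f<em>o</em>o` this yields the string `f*o*o` and flags the `o` as a problem, which is exactly the case this issue is about: the serialized form exists, but a boundary-strict parser will not turn it back into `<em>`.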

GoogleCodeExporter commented 8 years ago
Since Markdown explicitly allows intermixed HTML, f<em>o</em>o *is* legal 
Markdown -- and if you want in-word emphasis with our flavor of Markdown 
(GitHub does the same thing, by the way), that's the only solution.

If you want to patch pagedown to change this behavior, the relevant regexes are 
in _DoItalicsAndBold: 
https://code.google.com/p/pagedown/source/browse/Markdown.Converter.js#1087

Original comment by b...@stackoverflow.com on 13 Apr 2013 at 8:23
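
As a rough illustration of the kind of change being discussed, the sketch below contrasts a boundary-strict emphasis pass with a permissive one. These regexes are written from scratch for this example and are not the ones in `_DoItalicsAndBold`; `emphasisStrict` and `emphasisPermissive` are hypothetical names. The strict variant simply leaves in-word markers untouched, so it does not reproduce pagedown's exact output for `f**o**o`.

```javascript
// Illustrative only: these are NOT pagedown's actual regexes.

// Strict variant: the opening marker must not follow a word character and
// the closing marker must not precede one, so in-word markers stay literal.
function emphasisStrict(text) {
  return text
    .replace(/(^|[^\w*])\*\*(?=\S)([^*]+?)\*\*(?![\w*])/g, "$1<strong>$2</strong>")
    .replace(/(^|[^\w*])\*(?=\S)([^*]+?)\*(?![\w*])/g, "$1<em>$2</em>");
}

// Permissive variant: markers are matched wherever they appear, including
// inside a word. Strong is handled first so that ** is not read as two *.
function emphasisPermissive(text) {
  return text
    .replace(/\*\*(?=\S)([^*]+?)\*\*/g, "<strong>$1</strong>")
    .replace(/\*(?=\S)([^*]+?)\*/g, "<em>$1</em>");
}
```

With the permissive variant, `f*o*o` becomes `f<em>o</em>o` and `f**o**o` becomes `f<strong>o</strong>o`, matching the "expected" lines in the report. The trade-off described in the linked blog posts still applies: text containing asterisks inside words would also be picked up as emphasis.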

GoogleCodeExporter commented 8 years ago
I know that f<em>o</em>o is legal Markdown. Unfortunately, in my case MD makes 
sense only if HTML is not allowed, because then the backend guys can take that 
MD, run it through some backend MD2HTML converter and safely use the content on 
their page.

Anyway, I didn't want to argue about what makes sense and what does not, 
because it's too late. And I can't fork MD either, because then none of the 
other parsers (available for the languages used on the backend) would be 
compliant with mine :D I just needed to complain a little. Sorry :)

Original comment by pkoszulinski on 13 Apr 2013 at 8:36