michelf / php-markdown

Parser for Markdown and Markdown Extra derived from the original Markdown.pl by John Gruber.
http://michelf.ca/projects/php-markdown/

Encoding problems and no_entities #173

Open markseuffert opened 10 years ago

markseuffert commented 10 years ago

Hi,

I was testing how to prevent HTML code in generated output, together with @wunderfeyd. It looks like the output is double-encoded when both no_markup = true and no_entities = true are set.

Input: **<script>**
Output: <p><strong>&amp;lt;script></strong></p>
Input: [text<text](link)
Output: <p><a href="link">text&amp;lt;text</a></p>

It doesn't happen when using no_markup = true only.
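The outputs above are consistent with text being entity-escaped twice. As a rough illustration of that general mechanism, here is a minimal sketch in plain PHP. It does not use PHP Markdown's internal code path (which this thread does not describe) and only shows how a second escaping pass turns `&lt;` into `&amp;lt;`, and how the `double_encode` argument of `htmlspecialchars` avoids that:

```php
<?php
// Illustration only: plain PHP, not PHP Markdown's internal code path.
$input = '<script>';

// First pass: escape HTML special characters once.
$once = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Second pass over already-escaped text: the "&" of "&lt;" is escaped again.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

// Same second pass with double_encode disabled: existing entities are left alone.
$guarded = htmlspecialchars($once, ENT_QUOTES, 'UTF-8', false);

echo $once, "\n";    // &lt;script&gt;
echo $twice, "\n";   // &amp;lt;script&amp;gt;
echo $guarded, "\n"; // &lt;script&gt;
```

Note that this sketch escapes `>` as well, which the output reported above does not, so it should be read as an analogy for the symptom rather than a reproduction of the bug.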

jdufresne commented 9 years ago

I recently hit this same issue, using the following test script:

$my_text = '**foo & bar**';
$parser = new \Michelf\Markdown();
$parser->no_markup = true;
$parser->no_entities = true;
$my_html = $parser->transform($my_text);
echo $my_html;

This outputs:

<p><strong>foo &amp;amp; bar</strong></p>

However I would expect:

<p><strong>foo &amp; bar</strong></p>

michelf commented 9 years ago

Indeed, that shouldn't happen. I'll get down to it eventually, but I'd also be happy to accept a pull request.

Maintaining those two modes is somewhat cumbersome given there's no way to test the non-default mode in MDTest.
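Until the non-default mode can be covered there, one possible stopgap is a standalone check on the generated HTML. The sketch below is only an assumption about how such a check might look: `has_double_encoded_entity` is a hypothetical helper, not part of PHP Markdown or MDTest, and it simply looks for an entity whose `&` has itself been escaped.

```php
<?php
// Hypothetical helper, not part of PHP Markdown or MDTest.
// Returns true when the HTML contains an entity whose "&" was itself escaped,
// such as "&amp;amp;" or "&amp;lt;".
function has_double_encoded_entity(string $html): bool
{
    return preg_match('/&amp;(?:[a-z][a-z0-9]*|#[0-9]+|#x[0-9a-f]+);/i', $html) === 1;
}

var_dump(has_double_encoded_entity('<p><strong>foo &amp;amp; bar</strong></p>')); // bool(true)
var_dump(has_double_encoded_entity('<p><strong>foo &amp; bar</strong></p>'));     // bool(false)
```

A check like this would also flag input that deliberately displays a literal entity (for example a document explaining `&lt;`), so it is better suited to targeted test inputs than to arbitrary text.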

markseuffert commented 9 years ago

We don't use no_entities = true anymore; there are no encoding problems without it.

michelf commented 9 years ago

I guess it's fixed in Yellow, but this issue about double-encoding is still present in PHP Markdown when no_entities = true, and is still worth fixing. So I'll keep it open.