evilstreak / markdown-js

A Markdown parser for javascript

Better support for fenced code blocks #220

Closed pdehaan closed 9 years ago

pdehaan commented 9 years ago

I'm seeing some weird behavior when trying to parse code blocks (with and without specified languages).

Here's my input:

```` markdown
# Home

The quick brown paragraph jumps over the lazy lists.

> "Hooray for block quotes!"  
> &mdash; Peter

**Bold and a bulleted list:**

- One fish
- Two fish

_Italics and an ordered list:_

1. Red fish
2. Blue fish

---

## Indented code block:

    var name = 'Princess Bubblegum';

## Generic code block (fenced):

```
var fs = require('fs');
fs.readFileSync('home.md', 'utf8');
```

## JavaScript code block (fenced):

```js
var path = require('path');
path.join(__dirname, 'home.md');
```

---

## Tables?

| Name     | Value |
|----------|-------|
| Jake     | true  |
| Finn     | true  |
| Ice King | false |
````

And my generated output is:

``` html
<h1>Home</h1>

<p>The quick brown paragraph jumps over the lazy lists.</p>

<blockquote><p>&quot;Hooray for block quotes!&quot;<br/>&amp;mdash; Peter</p></blockquote>

<p><strong>Bold and a bulleted list:</strong></p>

<ul><li>One fish</li><li>Two fish</li></ul>

<p><em>Italics and an ordered list:</em></p>

<ol><li>Red fish</li><li>Blue fish</li></ol>

<hr/>

<h2>Indented code block:</h2>

<pre><code>var name = &#39;Princess Bubblegum&#39;;</code></pre>

<h2>Generic code block (fenced):</h2>

<p><code>
var fs = require(&#39;fs&#39;);
fs.readFileSync(&#39;home.md&#39;, &#39;utf8&#39;);
</code></p>

<h2>JavaScript code block (fenced):</h2>

<p><code>js
var path = require(&#39;path&#39;);
path.join(__dirname, &#39;home.md&#39;);
</code></p>

<hr/>

<h2>Tables?</h2>

<table><thead><tr><th>Name</th><th>Value</th></tr></thead><tbody><tr><td>Jake</td><td>true</td></tr><tr><td>Finn</td><td>true</td></tr><tr><td>Ice King</td><td>false</td></tr></tbody></table>
```

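It's a bit tricky to read, but you can see that the fenced code blocks come out as `<p><code>` instead of `<pre><code>` (the indented block is fine), and that the `js` language hint ends up inside the code text.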

pdehaan commented 9 years ago

You can also see that `&mdash; Peter` gets converted to `&amp;mdash; Peter`, which doesn't seem right.

pdehaan commented 9 years ago

Oh, and here was my test case:

    'use strict';

    var fs = require('fs');
    var markdown = require('markdown').markdown;

    // Render test.md with the Maruku dialect and print the HTML.
    var output = markdown.toHTML(getFile('./test.md'), 'Maruku');
    console.log(output);

    // Read a file synchronously as UTF-8 text.
    function getFile(src) {
      return fs.readFileSync(src, 'utf-8');
    }

codingisacopingstrategy commented 9 years ago

I’m not sure the Maruku dialect actually supports fenced code blocks? From what I found on the internet, it doesn’t, at least not by default. Since the parser doesn’t recognise the fences, it falls back to the default behaviour defined by Gruber and treats the backticks as an inline code span, which is why you get `<p><code>`; cf. https://github.com/evilstreak/markdown-js/issues/223

ashb commented 9 years ago

Fenced code blocks come from GitHub Flavoured Markdown (and probably a few other dialects). Support for that dialect has been started in #41 but needs more work.
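
Until that lands, one possible workaround is to rewrite the fences as indented code blocks before handing the text to the parser, since indented blocks already render as `<pre><code>`. This is only a minimal sketch: `fencesToIndented` is a hypothetical helper, not part of markdown-js, and it assumes fences always start at the beginning of a line and are never nested. The language hint (`js`) is simply dropped.

```javascript
// Hypothetical helper, not part of markdown-js: turn ``` fenced blocks
// into four-space indented code blocks.
function fencesToIndented(src) {
  var out = [];
  var inFence = false;
  src.split('\n').forEach(function (line) {
    if (/^```/.test(line)) {
      // Opening or closing fence: drop it (and any language hint) and
      // leave a blank line so the code stays separated from the text.
      inFence = !inFence;
      out.push('');
    } else {
      // Lines inside a fence are indented; everything else is untouched.
      out.push(inFence ? '    ' + line : line);
    }
  });
  return out.join('\n');
}
```

With the test case above it would be used as `markdown.toHTML(fencesToIndented(getFile('./test.md')), 'Maruku')`. Code that contains blank lines may still need extra care.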

The `&amp;mdash;` is related to #16: we don't currently support HTML (either tags or entities), so we escape everything.
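
If you only need a handful of entities, a rough workaround is to replace them with the characters they stand for before parsing, so there is no `&` left to escape. The sketch below assumes a small hand-picked table; `ENTITY_CHARS` and `decodeKnownEntities` are hypothetical names, not part of markdown-js. Note that it also rewrites entities that appear inside code blocks, which may not be what you want.

```javascript
// Hypothetical workaround, not part of markdown-js: swap a few named
// entities for their Unicode characters before the text is parsed.
var ENTITY_CHARS = {
  mdash: '\u2014',
  ndash: '\u2013',
  hellip: '\u2026',
  copy: '\u00A9'
};

function decodeKnownEntities(src) {
  return src.replace(/&([a-z]+);/g, function (match, name) {
    // Entity names missing from the table are left exactly as written.
    return Object.prototype.hasOwnProperty.call(ENTITY_CHARS, name)
      ? ENTITY_CHARS[name]
      : match;
  });
}
```

It would be applied to the Markdown source first, for example `markdown.toHTML(decodeKnownEntities(src), 'Maruku')`.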