erusev / parsedown

Better Markdown Parser in PHP
https://parsedown.org
MIT License

[Bug] Code block split (1.7.4, 2.0.0 Beta 1) #836

Open Tooa opened 2 years ago

Tooa commented 2 years ago

Description

Parsedown splits a code block into multiple parts when given a Markdown file like the following:

````markdown
* Source
```bash
$ ls

# Comment in code
$ ls
```
````

The issue is present in the latest stable release and the latest public beta.

Expected Behavior

```html
<ul>
<li>Source</li>
</ul>
<pre><code class="language-bash">$ ls

# Comment in code
$ ls
</code></pre>
```

Actual Behavior

```html
<ul>
<li>Source
<pre><code class="language-bash">$ ls
</code></pre>
</li>
</ul>
<h1>Comment in code</h1>
<p>$ ls</p>
<pre><code></code></pre>
```

Steps to reproduce

Reproduce with Parsedown 1.7.4

Reproduce with Parsedown 2.0.0 Beta 1

Setup

```bash
$ sudo apt install php8.1
$ php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
$ php composer-setup.php
$ php -r "unlink('composer-setup.php');"
# Dependencies
$ sudo apt-get install php8.1-mbstring
$ php ../composer.phar require erusev/parsedown:v2.0.0-beta-1
$ php demo.php
```

demo.php

````php
<?php

require __DIR__ . '/vendor/autoload.php';

use Erusev\Parsedown\Configurables\Breaks;
use Erusev\Parsedown\Configurables\SafeMode;
use Erusev\Parsedown\Configurables\StrictMode;
use Erusev\Parsedown\State;
use Erusev\Parsedown\Parsedown;

// Markdown from the report: a fenced code block directly after a list item.
// A nowdoc is used so that "$" is never treated as variable interpolation.
$markdown = <<<'EOD'
* Source
```bash
$ ls

# Comment in code
$ ls
```
EOD;

$state = new State([
    new Breaks(true),
    new SafeMode(true),
    new StrictMode(false),
]);

$Parsedown = new Parsedown($state);
echo $Parsedown->toHtml($markdown);
````

aidantwoods commented 2 years ago

Thanks for reporting this! This one in particular is a known issue, and has the same root cause as your other issue: when parsing a list continuation, we should first check to see if the line can start a new block, and if so, allow that to interrupt the list (provided that the indentation is not sufficient to contain the block in the list).
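
To make the rule above concrete, here is a minimal PHP sketch of the decision it describes. It is not Parsedown's code: `startsFencedCode` and `classifyListContinuation` are hypothetical helpers, fenced code is the only interrupting block it checks, and `$contentIndent` is assumed to be the column where the list item's text begins (2 for `* Source`).

```php
<?php

/**
 * Hypothetical helper: does $line open a fenced code block?
 * Up to three leading spaces are allowed before the fence.
 */
function startsFencedCode(string $line): bool
{
    return preg_match('/^ {0,3}(`{3,}|~{3,})/', $line) === 1;
}

/**
 * Hypothetical helper: classify a line that follows a list item.
 * $contentIndent is the assumed column of the list item's content.
 */
function classifyListContinuation(string $line, int $contentIndent): string
{
    if (trim($line) === '') {
        return 'blank';
    }

    // Count leading spaces to see whether the line sits inside the item.
    $indent = strspn($line, ' ');
    if ($indent >= $contentIndent) {
        return 'inside-list-item';
    }

    // Not indented enough to belong to the item: check whether the line
    // can start a new block before treating it as lazy continuation text.
    if (startsFencedCode($line)) {
        return 'interrupts-list';
    }

    return 'lazy-continuation';
}

// Build the fence with str_repeat to keep literal fences out of this example.
$fence = str_repeat('`', 3) . 'bash';

echo classifyListContinuation($fence, 2), "\n";        // interrupts-list
echo classifyListContinuation('  ' . $fence, 2), "\n"; // inside-list-item
echo classifyListContinuation('more text', 2), "\n";   // lazy-continuation
```

With the Markdown from this issue, the unindented fence line falls into the `interrupts-list` case, which matches the expected output where the `<pre>` block follows the closed `<ul>`.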

Worth noting that we can't support all of CommonMark's precedence rules (particularly for inlines) with Parsedown's parsing technique, and that Parsedown's parsing method is going to need to be special cased a little to deal with this. I think that on balance it is probably worth doing for resolving this ambiguity in block precedence, and so I'll aim to fix this in the v2 branch.

aidantwoods commented 2 years ago

See #707 for the more general issue. I'd originally closed this as a won't fix, but I have reconsidered and think that this can be implemented without introducing an unreasonable amount of complexity.