erusev / parsedown

Better Markdown Parser in PHP
https://parsedown.org
MIT License

Regression in 1.8.0: Line with formatting immediately following line with HTML doesn't render #721

Closed lukasbestle closed 4 years ago

lukasbestle commented 5 years ago

Hi,

One of our Kirby customers reported a bug in Parsedown 1.8.0-beta7 (see https://github.com/getkirby/kirby/issues/1943):

Markdown input:

<p>Simple HTML</p>
# Test

Output from Parsedown 1.7.3:

<p>Simple HTML</p>
<h1>Test</h1>

Output from Parsedown 1.8.0-beta7:

<p>Simple HTML</p>
# Test

Code example:

$markdownWithHtml = <<<MARKDOWN
<p>Simple HTML</p>
# Test
MARKDOWN;

$parsedown = new Parsedown();
echo $parsedown->text($markdownWithHtml);

aidantwoods commented 5 years ago

Hi there, this is intended behaviour and replicated in the reference parser.

The reasoning is that the line <p>Simple HTML</p> starts an HTML block (of type 6 in this case), which requires a blank line to end it per the CommonMark spec. This seems a bit weird at first, but (AFAIK) it is because CommonMark tries to avoid having to parse potentially complicated HTML correctly. Instead, it roughly says that if something looks like it starts a block of HTML, then treat it as HTML until the next blank line (there are a few special cases where this differs, but this is the gist). This generally works well, but can sometimes be surprising (as you point out here).
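
To make the rule concrete, here is a minimal sketch of the "treat it as HTML until the next blank line" behaviour. It is not Parsedown's implementation: splitHtmlBlock is a hypothetical helper, the tag list is a small subset of the type 6 tag names in the CommonMark spec, and the special cases mentioned above are ignored.

```php
<?php
// Simplified illustration of the CommonMark "type 6" HTML block rule.
// NOT Parsedown's code: splitHtmlBlock() is a hypothetical helper and
// the tag list below is only a small subset of the spec's list.
function splitHtmlBlock(string $markdown): array
{
    $lines = explode("\n", $markdown);

    // Start condition (simplified): up to three spaces of indentation,
    // "<" or "</", a known block tag name, then whitespace, ">", "/>",
    // or the end of the line.
    $startsBlock = '/^ {0,3}<\/?(?:p|div|h[1-6]|ul|ol|li|table)(?:\s|\/?>|$)/i';

    if (preg_match($startsBlock, $lines[0]) !== 1) {
        // The first line does not open an HTML block.
        return ['html' => [], 'rest' => $lines];
    }

    $html = [];
    foreach ($lines as $i => $line) {
        if (trim($line) === '') {
            // End condition: the first blank line closes the HTML block.
            return ['html' => $html, 'rest' => array_slice($lines, $i + 1)];
        }
        // Every non-blank line is swallowed as raw HTML, even "# Test".
        $html[] = $line;
    }

    // No blank line found: the whole input stays inside the HTML block.
    return ['html' => $html, 'rest' => []];
}

$without = splitHtmlBlock("<p>Simple HTML</p>\n# Test");
$with    = splitHtmlBlock("<p>Simple HTML</p>\n\n# Test");

echo count($without['rest']), "\n"; // 0: "# Test" was swallowed as raw HTML
echo count($with['rest']), "\n";    // 1: "# Test" is left over to be parsed as Markdown
```

With the input from this issue, the heading line ends up inside the HTML block unless a blank line comes before it.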

aidantwoods commented 5 years ago

Basically, the way around this treatment of HTML blocks (and generally with CommonMark things) is to separate distinct blocks with blank lines where possible, i.e. write:

<p>Simple HTML</p>

# Test

aidantwoods commented 5 years ago

Looking at the referenced issue, though, this is a bug when the HTML block in question is script, pre, or style. That is fixed in the 2.0.x branch but not yet in 1.8.x-beta.

lukasbestle commented 5 years ago

Thanks for the quick reply! Is there any timeframe for a release of the 2.0.x branch? Or if not, would you recommend using the current state of the 2.0.x branch over 1.8.0-beta7 in production?

schulzrinne commented 4 years ago

This is a problem even if the post-HTML text is separated by a blank line. For example, pasting the following after the demo text leaves the "Regards" part unparsed. Both <h4> and <p> are block-level elements and there's a blank line between the <p> and the "Regards".

<h4 >Overall score: How do you score this paper?</h4><p >weak accept (3)</p>

Regards,
The Conference Chairs & TPC

aidantwoods commented 4 years ago

@schulzrinne this issue was focussed on 1.8-beta (the demo page still uses 1.7 afaik). Does this happen in the latest release of the beta branch?

schulzrinne commented 4 years ago

Yes, 1.8.0-beta7 appears to fix this issue. Thanks for the quick follow-up.

On Sun, Aug 4, 2019 at 3:56 PM Aidan Woods notifications@github.com wrote:

@schulzrinne https://github.com/schulzrinne this issue was focussed on 1.8-beta (the demo page still uses 1.7 afaik) does this happen in the latest release of the beta branch?

lukasbestle commented 4 years ago

@aidantwoods The original issue I've reported is not resolved (I tested that with 1.8.0-beta7 as I wrote in the original post). Could you please reopen the issue?