getkirby / kirby

Kirby's core application folder
https://getkirby.com
Other
1.32k stars 168 forks source link

Indented HTML in Markdown text rendered as code block #1912

Closed hdodov closed 5 years ago

hdodov commented 5 years ago

Describe the bug If a custom kirbytag outputs HTML with whitespace (from a snippet, for example), using kirbytext() in a template will work incorrectly. The HTML will be wrapped in <pre> tags because of the whitespace since Markdown would parse it as a code block. There's a thread in the forum.

To Reproduce Create a similar plugin:

Kirby::plugin('me/myplugin', [
  'tags' => [
    'block' => [
      'html' => function ($tag) {
        return <<<HTML

        <div>
          <h1>testing</h1>
        </div>

HTML;
      }
    ]
  ]
]);

In a template, add this:

<?= kirbytext('(block: test)') ?>

And the resulting markup would be:

<pre><code>    &lt;div&gt;
      &lt;h1&gt;testing&lt;/h1&gt;
    &lt;/div&gt;</code></pre>

Expected behavior The resulting markup should be:

        <div>
          <h1>testing</h1>
        </div>

Kirby Version 3.2.0


In my project code, I tried to use $field->markdown()->kirbytags() instead of $field->kirbytext(). While the HTML output was alright, there were excess <p> tags. Another solution was to remove the whitespace form the markup with regex:

return preg_replace('/\s+/', ' ', $markup);

While this works, I believe it's a bad solution that can lead to many unexpected results.

In my opinion, content rendered by kirbytags should not be altered further by any means. The custom markdown parser should treat kirbytags as if they're not there, parse the rest, and then slap the kirbytag contents on the final string.

distantnative commented 5 years ago

The issue here is I guess the order of conversion. Currently it seems like

  1. convert KirbyTags
  2. convert Markdown

Your described case would better work with

  1. convert Markdown
  2. convert KirbyTags

However, that has the disadvantages that you couldn't use Markdown in KirbyTags anymore.

lukasbestle commented 5 years ago

However, that has the disadvantages that you couldn't use Markdown in KirbyTags anymore.

I think that's not a huge deal as it's possible to render Markdown inside the custom KirbyTag if desired. It would be a breaking-change in that regard, but probably fine.

But there's another issue with this: If you first render Markdown, the Markdown parser will print <p> tags around the KirbyTags. Those will still be in the output after the KirbyTag has rendered, which is pretty bad for block-level KirbyTags that output stuff like <figure> tags. This was the reason why we chose the current rendering order.

I know that this means that custom KirbyTags need to be built a bit more carefully, but it's the only way that will work with all use-cases. What you could do to fix your code example is the following:

Kirby::plugin('me/myplugin', [
  'tags' => [
    'block' => [
      'html' => function ($tag) {
        return <<<HTML

        <div>
          <h1>testing</h1>
        </div>

        HTML;
      }
    ]
  ]
]);

What I did is to indent the closing marker of the heredoc marker (new feature in PHP 7.3). This will automatically strip the leading whitespace from all lines before.

Of course, this only applies to inline HTML code. If you use a snippet, the workaround is not to indent the snippet code (which is the Kirby best-practise anyway).

hdodov commented 5 years ago

If you use a snippet, the workaround is not to indent the snippet code (which is the Kirby best-practise anyway).

The Kirby best-practice is not to have good readability in the codebase? I'm pretty sure @bastianallgeier would disagree on that. I have a snippet that I use in my templates. I want to be able to use that same snippet in the panel via kirbytags. It opens much possibilities. Not being able to indent the snippet code would be problematic the moment I add more than 3 tags, which is my case. It would also mean that I have to rewrite my snippet once I decide it would be a good idea to have it available in a kirbytag, and that would suck.

Processing the kirbytag markup makes no sense anyway. If the tag returns HTML, that means it has done every formatting it needs, as the final form of markup is HTML. It could also lead to some strange behavior in the case of text enclosed in _, *, etc, I suppose. Besides, if the kirbytag does need markdown formatting, it can use the Kirby classes to format whatever it needs, the way it needs.

Changing the way kirbytags work is not the solution, I understand the reason they work this way, and that's OK. However, I think that kirbytags should be able to identify themselves as "HTML generators" in the config and have no markdown (or any other type of) formatting applied to them. It will open the doors for more advanced kirbytags that contribute more than text.

lukasbestle commented 5 years ago

Maybe there's a misunderstanding. It's completely fine (and best-practice) to write snippets like this:

<div>
  <h1>testing</h1>
</div>

This will not trigger the Markdown parser because it sees fully-rendered HTML code that is not part of a code block.

It is however not best-practice to write your snippets like this:

    <div>
      <h1>testing</h1>
    </div>
texnixe commented 5 years ago

The Kirby best-practice is not to have good readability in the codebase?

I'm pretty sure that is a misunderstanding and @lukasbestle meant something else.

texnixe commented 5 years ago

The problem is, however, that it's not only indents at the beginning, but also standard indented snippets, at least for 4 spaces and when using empty lines.

lukasbestle commented 5 years ago

Ah, now I know what @hdodov probably meant: The bug where empty lines within a block of HTML generate a code block is a bug in the Parsedown library we use to parse Markdown. I'm pretty sure we have reported that issue to Parsedown quite a while ago already, but I can't seem to find the GitHub issue.

lukasbestle commented 5 years ago

What I'm saying is: This is not intended behavior because the Markdown parser shouldn't do anything within HTML tags (if the outermost tags are not indented like in my best-practice example above).

This is a separate bug that not only applies to KirbyTags but to all Markdown parsing in Kirby. If this is fixed, the issue with the KirbyTag should automatically be fixed as well.

hdodov commented 5 years ago

Bummer. Can you link to that Parsedown issue so we can have it here as well? If it's not likely to be fixed soon and a workaround as I've mentioned is possible without much effort, is there a chance to get that?

lukasbestle commented 5 years ago

It looks like this issue has actually been fixed, because I tested it on their demo page with the following code:

# HTML demo

This is a demo with some HTML code in it:

<div>
    <h1>This is a test</h1>

    <p>Lorem ipsum</p>

    <p>And another paragraph</p>
</div>

The result was fine even with the indentation. So @hdodov have you verified that the first and last lines of your snippets are not indented? It should actually already work.

hdodov commented 5 years ago

It's kind of strange indeed. I pasted my snippet in the demo page you linked and it works. In my template - it doesn't. I'll try to provide a minimal reproducible example.

hdodov commented 5 years ago

It seems that indentation doesn't cause the problem - empty lines do, and it seems to be a problem with Kirby.

Snippet test.php:

<div>

    <span>
        text
    </span>

</div>

Plugin index.php:

Kirby::plugin('me/plugin', [
    'tags' => [
        'block' => [
            'attr' => [
                'type'
            ],
            'html' => function ($tag) {
                return snippet('test', [], true);
            }
        ]
    ]
]);

Output:

<div>
<pre><code>&lt;span&gt;
    text
&lt;/span&gt;</code></pre>
</div>

Meanwhile, if you input the same markup in the Parsedown demo, it returns it without <pre> tags, as it came in.

However, if you change the snippet contents to:

<div>
    <span>
        text
    </span>
</div>

Kirby renders it correctly.


While you could avoid using empty lines in your source code, they can still appear as the result of conditionals or in case some of the markup comes from an external source. In general, having any sort of restrictions for your source code would be tricky to deal with.

lukasbestle commented 5 years ago

Thanks for the code example, I will check that.

lukasbestle commented 5 years ago

I have run your code example through the Parsedown version we use for Kirby (1.8.0-beta-7). We use that over the latest stable version (which they use on their demo page) because of some other fixes we need for Kirby.

$markdownWithHtml = <<<MARKDOWN
<div>

    <span>
        text
    </span>

</div>
MARKDOWN;

$parsedown = new Parsedown();
echo $parsedown->text($markdownWithHtml);

Unfortunately I get the same wrong result as you, so this seems to be a regression in Parsedown 1.8.0. This is the second Parsedown bug that was reported to us (after https://github.com/erusev/parsedown/issues/721), so we will look into replacing Parsedown with a different parser (see #1961). This is something we can only do in the next larger release (3.3.0) because of the larger effects this change will have. I hope you understand.

I will definitely test your code example with the new parser we choose. :)

lukasbestle commented 5 years ago

Closing this in favor of #1961.

mrflix commented 1 year ago

I'm running into this bug after upgrading a Kirby website from v2 to v3.8.2.2. The following html, copy pasted from cleverreach, which the editors frequently do, creates the mysterious code block:

<form class="layout_form cr_form cr_font" action="https://eu1.clevrreach.com/f/81175-224003/wcs/" method="post" target="_blank">
    <div class="cr_body cr_page cr_font formbox">
        <div class="non_sortable" style="text-align:left;">

        </div>

        <div class="editable_content" style="text-align:left;">
        <div id="4511050" rel="email" class="cr_ipe_item ui-sortable musthave" style="margin-bottom:px;">
<label for="text4511050" class="itemname">E-Mail*</label> <input id="text4551060" name="email" value="" type="text" style="width:100%;" />
</div><div id="4511052" rel="text" class="cr_ipe_item ui-sortable" style=" margin-bottom:0px;">
<label for="text4511052" class="itemname">Vorname</label> <input id="text4511022" name="1100141" type="text" value="" style="width:100%;" />
</div><div id="4511054" rel="text" class="cr_ipe_item ui-sortable" style=" margin-bottom:0px;">
<label for="text4511054" class="itemname">Nachname</label> <input id="text4511061" name="1000152" type="text" value="" style="width:100%;" />
</div><div id="4511055" rel="button" class="cr_ipe_item ui-sortable submit_container" style="text-align:center; margin-bottom:px;">
<button type="submit" class="cr_button">Anmelden</button>
</div>
    </div>

    <noscript><a href="http://www.cleverreach.de">www.CleverReach.de</a></noscript>
</div>

</form>

Screenshot 2022-12-01 at 13 36 53

When I format the html, it doesn't break:

<form class="layout_form cr_form cr_font" action="https://eu1.clevrreach.com/f/81175-224003/wcs/" method="post" target="_blank">
  <div class="cr_body cr_page cr_font formbox">
    <div class="non_sortable" style="text-align:left;">
    </div>
    <div class="editable_content" style="text-align:left;">
      <div id="4511050" rel="email" class="cr_ipe_item ui-sortable musthave" style="margin-bottom:px;">
        <label for="text4511050" class="itemname">E-Mail*</label> <input id="text4551060" name="email" value="" type="text" style="width:100%;" />
      </div>
      <div id="4511052" rel="text" class="cr_ipe_item ui-sortable" style=" margin-bottom:0px;">
        <label for="text4511052" class="itemname">Vorname</label> <input id="text4511022" name="1100141" type="text" value="" style="width:100%;" />
      </div>
      <div id="4511054" rel="text" class="cr_ipe_item ui-sortable" style=" margin-bottom:0px;">
        <label for="text4511054" class="itemname">Nachname</label> <input id="text4511061" name="1000152" type="text" value="" style="width:100%;" />
      </div>
      <div id="4511055" rel="button" class="cr_ipe_item ui-sortable submit_container" style="text-align:center; margin-bottom:px;">
        <button type="submit" class="cr_button">Anmelden</button>
      </div>
    </div>
    <noscript><a href="http://www.cleverreach.de">www.CleverReach.de</a></noscript>
  </div>
</form>

Screenshot 2022-12-01 at 13 38 01

Can you reopen this issue?

lukasbestle commented 1 year ago

@mrflix I'm sorry you ran into this issue. Unfortunately, opening the issue wouldn't help much as it's currently not actionable. As discussed, Parsedown's Markdown implementation is buggy in many places, so only a switch to a different Markdown library would solve this and other Markdown parsing issues once and for all. We do still have plans to offer alternative Markdown parsers, but that is also not a no-brainer as discussed in #1961.

As a workaround, you can already use your own Markdown parsing library by replacing the markdown component.