thephpleague / html-to-markdown

Convert HTML to Markdown with PHP
MIT License

Single-line language code blocks are wrongly turned into language code spans #161

Closed: MrPetovan closed this issue 6 years ago

MrPetovan commented 6 years ago

Per CommonMark spec (https://spec.commonmark.org/0.28/#fenced-code-blocks), this Markdown code

```php
return $return;
```

outputs the following HTML:

```html
<pre><code class="language-php">return $return;</code></pre>
```

But when fed into the HTML-to-Markdown converter, this HTML gives the following Markdown output:

`php return $return;`

I understand this is intentional per https://github.com/thephpleague/html-to-markdown/pull/102, but it isn't the expected behavior.

Please revert this change concerning the single-line code blocks to fix this inconsistency with CommonMark.

We're using the HTML-to-Markdown converter over at https://github.com/friendica/friendica to communicate with Diaspora and we expect a reversible conversion between CommonMark and HTML.

colinodell commented 6 years ago

Thanks for reporting this! I have modified the library to always use multi-line blocks if a language is present. I'll tag bugfix release 4.7.1 once the Travis tests pass.

MrPetovan commented 6 years ago

Thanks, I was about to submit a much larger patch that takes into account the parent element of the <code> tag, which is the deciding factor in whether the output should be a multi-line code block or not. <code> is an inline tag, so HTML doesn't preserve newlines inside it, unlike inside <pre>.
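
To illustrate the parent-element rule, here is a minimal sketch using PHP's built-in DOM extension. It is not the library's actual code: `codeElementToMarkdown()` and `firstCodeElement()` are hypothetical helper names made up for this example, and it assumes the language is carried in a `language-*` class as in the CommonMark output above.

```php
<?php
// Hypothetical sketch, NOT html-to-markdown's real implementation.
// Rule: a <code> whose parent is <pre> becomes a fenced block (even for a
// single line); any other <code> becomes an inline code span.

function codeElementToMarkdown(DOMElement $code): string
{
    $text = $code->textContent;

    // Pull "php" out of a class such as "language-php", if present.
    $language = '';
    if (preg_match('/\blanguage-([\w-]+)/', $code->getAttribute('class'), $m)) {
        $language = $m[1];
    }

    $parent = $code->parentNode;
    $insidePre = $parent instanceof DOMElement
        && strtolower($parent->tagName) === 'pre';

    if ($insidePre) {
        // Block context: always fence, regardless of the number of lines.
        $fence = str_repeat('`', 3);
        return $fence . $language . "\n" . rtrim($text, "\n") . "\n" . $fence;
    }

    // Inline context: emit a code span.
    return '`' . $text . '`';
}

// Hypothetical helper: parse an HTML fragment and return its first <code>.
function firstCodeElement(string $html): DOMElement
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    return $doc->getElementsByTagName('code')->item(0);
}

// Single-line <pre><code> stays a fenced block with its language.
echo codeElementToMarkdown(firstCodeElement(
    '<pre><code class="language-php">return $return;</code></pre>'
)), "\n";

// A bare <code> inside a paragraph stays an inline span.
echo codeElementToMarkdown(firstCodeElement(
    '<p>Use <code>return</code> here</p>'
)), "\n";
```

With this rule the language class alone never forces a code span, which is what keeps the round trip with CommonMark stable. The sketch ignores edge cases such as backticks inside the code content.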

colinodell commented 6 years ago

Ah okay, well if you'd like to submit that I'd gladly accept it over my quick fix :) I'll hold off on releasing then.

MrPetovan commented 6 years ago

By all means, release away!

colinodell commented 6 years ago

Done! https://github.com/thephpleague/html-to-markdown/releases/tag/4.8.0 Thanks again for your help!