thephpleague / html-to-markdown

Convert HTML to Markdown with PHP
MIT License

Line breaks inside tag #217

Open multiwebinc opened 2 years ago

multiwebinc commented 2 years ago

Version(s) affected

5.0.2

Description

Consecutive line breaks (`<br>`) inside an inline tag such as `<b>` produce incorrect Markdown.

How to reproduce

HTML:

<b>Hello<br><br>World</b>

Output:

**Hello  

World**

Expected output:

**Hello**

**World**

colinodell commented 2 years ago

This is an interesting case that could have three possible desired outputs based on one's philosophy of how this library should work.

You've already illustrated one case, where you expect the library to produce Markdown that, when converted back to HTML, looks similar to readers but uses different markup:

<p><strong>Hello</strong></p>
<p><strong>World</strong></p>

Another philosophy would be that this library should strive to produce Markdown like this:

**Hello<br><br>World**

Which converts back into virtually identical HTML like this:

<p><strong>Hello<br><br>World</strong></p>

(That is the approach that I personally prefer)

Lastly, there's a third philosophy, a hybrid of the two, which would give Markdown like this:

**Hello\
\
World**

This produces:

<p><strong>Hello<br />
<br />
World</strong></p>

But I don't think that's something anyone would really want or expect :)

Regardless, I agree that this is a bug.
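
For illustration only, the first option above could be approximated today by pre-processing the HTML before it is handed to the converter, so that each paragraph gets its own bold element. This is a minimal sketch and not a proposed fix: the helper name `splitBoldOnDoubleBreaks` is hypothetical and not part of this library, and the regex-based approach assumes simple, non-nested `<b>`/`<strong>` tags without attributes.

```php
<?php

/**
 * Hypothetical helper (not part of league/html-to-markdown).
 * Splits a <b>/<strong> element on runs of two or more <br> tags and
 * wraps each piece in its own paragraph with its own bold element.
 */
function splitBoldOnDoubleBreaks(string $html): string
{
    $result = preg_replace_callback(
        '~<(b|strong)>(.*?)</\1>~is',
        static function (array $m): string {
            // Split the inner HTML wherever two or more <br> tags appear in a row.
            $parts = preg_split('~(?:<br\s*/?>\s*){2,}~i', $m[2]);
            $parts = array_values(array_filter(
                array_map('trim', $parts),
                static fn (string $p): bool => $p !== ''
            ));

            // Leave the element untouched when there is nothing to split.
            if (count($parts) < 2) {
                return $m[0];
            }

            $wrapped = array_map(
                static fn (string $p): string => "<p><{$m[1]}>{$p}</{$m[1]}></p>",
                $parts
            );

            return implode("\n", $wrapped);
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}

echo splitBoldOnDoubleBreaks('<b>Hello<br><br>World</b>'), "\n";
// Prints:
// <p><b>Hello</b></p>
// <p><b>World</b></p>
```

The intent is that the converter then sees two separate bold paragraphs rather than one bold span containing line breaks. A real fix would more likely work on the parsed DOM inside the library than on raw strings.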

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

multiwebinc commented 2 years ago

This should probably be reopened.