Closed: petko closed this 3 months ago
Are you expecting the output to be
line 1
line 2
and then turned into HTML as
<body>
<p>line 1<br/>line 2</p>
</body>
or
line 1
line 2
to be converted to
<body>
<p>line 1</p>
<p>line 2</p>
</body>
perhaps with some CSS to increase spacing between paragraphs?
I think the first html would only show up if the markdown was
line 1<br/>
line 2
GitHub's Markdown parser tends to honor line breaks in the Markdown source, but given that the CommonMark parser has special syntax for specifying line breaks, I don't think that's actually part of Markdown itself. That is, the only portable ways to force Markdown to put a newline between the "1" and the following "line" are to put them in separate paragraphs (with a blank line between them) or to use a <br/> HTML tag between them.
Given there are no <p> tags in your input, I think the output is a decent guess at the intended meaning of the provided HTML.
Out of curiosity, does the program behave more like what you expect if you replace the <br> tags with <p> tags (or a </p><p> sequence, if you don't mind also inserting a <p> after <body> and a </p> before </body>)?
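A minimal sketch of that workaround, for anyone who wants to try it before converting. It assumes the body holds only plain text separated by <br> tags, and the helper name br_to_paragraphs is hypothetical, not part of html2md:

```python
import re

# Matches <br>, <br/>, <br /> (any case), plus whitespace around the tag.
BR_TAG = re.compile(r"\s*<br\s*/?>\s*", re.IGNORECASE)

def br_to_paragraphs(body_html: str) -> str:
    """Wrap each <br>-separated chunk of text in its own <p> element."""
    chunks = [chunk.strip() for chunk in BR_TAG.split(body_html)]
    # Skip empty chunks, e.g. the one left after a trailing <br>.
    return "\n".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)

print(br_to_paragraphs("line 1<br>\nline 2<br>"))
```

The result can then be placed between <body> and </body> and passed to the converter as usual.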
Are you expecting the output to be
line 1 line 2
Yes, that is what I expect with this HTML markup.
P.S.: My app does not generate such HTML. It is just something that I was testing.
FWIW: I think the <br/> should be a newline wherever it's encountered. The current implementation seems to ignore it in a paragraph:
import pyhtml2md

html2 = """
<p>Contact: <br/> Isabella Bobillo <br/> Fish Consulting <br/> 954-893-9150 <br/>ibobillo@fish-consulting.com</p>
"""
print(pyhtml2md.convert(html2))
and the output is:
Contact: Isabella Bobillo Fish Consulting 954-893-9150 ibobillo@fish-consulting.com
But I'd expect it to be:
Contact:
Isabella Bobillo
Fish Consulting
954-893-9150
ibobillo@fish-consulting.com
Note that there are two spaces (per Markdown spec) at the end of each line of that output except the last.
Regarding the OP's:
line 1<br>
line 2<br>
I agree with the last comment about what is expected.
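To make the expected behavior concrete, here is a rough sketch of the mapping described above: every <br> variant inside inline content becomes a Markdown hard line break (two trailing spaces plus a newline). This is only an illustration using Python's re module, not how html2md is implemented, and br_to_hard_breaks is a hypothetical name:

```python
import re

# Matches <br>, <br/>, <br /> (any case), plus whitespace around the tag.
BR_TAG = re.compile(r"\s*<br\s*/?>\s*", re.IGNORECASE)

def br_to_hard_breaks(inline_html: str) -> str:
    """Turn each <br> variant into a Markdown hard break (two spaces + newline)."""
    text = BR_TAG.sub("  \n", inline_html.strip())
    # A break at the very end of a paragraph has no effect, so drop it.
    return text.rstrip()

print(br_to_hard_breaks(
    "Contact: <br/> Isabella Bobillo <br/> Fish Consulting <br/> "
    "954-893-9150 <br/>ibobillo@fish-consulting.com"
))
```

The same function treats <br> and <br/> identically, which is the behavior both examples in this thread rely on.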
Yes, you are right, html2md seems to have problems with line breaks written with a closing slash inside the tag (<br/>). With only <br> it seems to work. Fixing it...
Should be fixed with the latest commit, will create a new release soon...
Describe the bug
I have a simple HTML document with <br> tags at the end of the lines, and they are not converted properly.
To Reproduce
Run html2md.exe breaks.html -p with the following HTML document:
You will get:
Expected behavior
Each <br> should be converted to a new line instead.