showdownjs / showdown

A bidirectional Markdown to HTML to Markdown converter written in Javascript
http://www.showdownjs.com/
MIT License

Render multiple newlines #801

Open Rafi993 opened 4 years ago

Rafi993 commented 4 years ago

If the text contains multiple consecutive newlines, for example

Hello\n\n\nThere

then only one newline is rendered. I tried the following approach to work around this:

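var converter = new showdown.Converter();
// Turn every newline in a run of two or more into a <br/> tag
var description = data.description.replace(/\n{2,}/g, m => m.replace(/\n/g, "<br/>"));
// Put a paragraph break after a <br/> that is followed by ordinary text
description = description.replace(/<br\/>([^<])/g, "<br/>\n\n$1");
var html = converter.makeHtml(description);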

The problem with the above solution is that if the Markdown contains a code block, the <br> tags get rendered inside the code block. Is there a better way to solve this?
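One possible direction, as a sketch only: do the replacement in a single pass whose pattern also matches whole fenced code blocks, and return those blocks unchanged. `preserveBlankLines` is a hypothetical helper, not part of showdown. It assumes code blocks are fenced with three backticks, so indented code blocks, tilde fences, and inline code are not protected. Unlike the snippet above, it always adds a paragraph break after each run of `<br/>` tags.

```javascript
// Hypothetical helper, not a showdown API.
// Assumes code blocks are fenced with three backticks.
function preserveBlankLines(markdown) {
  // Group 1 matches a whole fenced block; the alternative matches a run of 2+ newlines.
  const pattern = /(`{3}[\s\S]*?`{3})|\n{2,}/g;
  return markdown.replace(pattern, (match, fence) => {
    if (fence !== undefined) {
      return fence; // fenced code is returned untouched
    }
    // One <br/> per newline in the run, followed by a real paragraph break.
    return "<br/>".repeat(match.length) + "\n\n";
  });
}
```

The result would then be passed to `converter.makeHtml` as before. Whether the output matches what you want still depends on your showdown options and extensions.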

sbwp commented 1 year ago

This is definitely a major bug in showdown, since it really messes up Markdown input with its aggressive removal of blank lines.

Even though this issue is a few years old, I'll share my workaround, since I'm sure others will run into this issue even if it's too late for the OP. I haven't tested this extensively yet and adjustments may be needed depending on your particular use case and what flavor of markdown you're using, but I ended up using this regex replacement:

markdown.replace(/(?<=\n)\n/g, '\u00AD  \n').replace(/(?<=\u00AD {2})\n(?=---)/g, '\n\n');

The first replacement value is a soft hyphen (U+00AD) followed by two spaces and a newline. You might be able to use a nonbreaking space instead of the soft hyphen, but in my case it was getting converted to &nbsp; and then getting rendered as &amp;nbsp;, possibly due to the showdown-htmlescape extension or possibly from the application where the HTML is being rendered.

Basically, it replaces every blank line with a line containing a soft hyphen, so each one still appears visually as a blank line. One issue this causes is that if you copy from a code block and paste, the soft hyphens are copied too. To fix this, assuming there's no reason for a soft hyphen to appear in your input Markdown, you can remove them from the HTML after converting (and I suppose at that point you can substitute any Unicode string you don't think would appear in the input).

The second replace handles the edge case of a <hr> made up of hyphens (like ----). If you have text on a line before a horizontal rule, it gets rendered as a heading, so instead of a horizontal rule, I was getting headings containing a soft hyphen. This solves the problem by looking at all non-first newlines prior to a row of three or more hyphens (now identifiable by the soft hyphen we inserted) and replacing them with a double newline. Showdown will use the double newline to recognize it as a horizontal rule instead of a title, then reduce it to one line, leaving us with the original number of lines.
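The two replacements described above can be wrapped in a small function. This is only a sketch: `padBlankLines` and `SOFT_HYPHEN` are hypothetical names, not part of showdown, and the soft hyphen is written as the escape `\u00AD` so that it stays visible in the source. It relies on regex lookbehind, which older JavaScript engines may not support.

```javascript
// Hypothetical names; the soft hyphen (U+00AD) is the marker character.
const SOFT_HYPHEN = "\u00AD";

function padBlankLines(markdown) {
  return markdown
    // Every newline directly preceded by another newline becomes
    // "soft hyphen + two spaces + newline" (a visually empty line with a hard break).
    .replace(/(?<=\n)\n/g, SOFT_HYPHEN + "  \n")
    // A marker line directly followed by a row of hyphens gets an extra newline,
    // so the hyphens are separated from the marker line by a blank line.
    .replace(/(?<=\u00AD {2})\n(?=---)/g, "\n\n");
}

// "Hello\n\n\nThere" becomes "Hello\n\u00AD  \n\u00AD  \nThere"
// "Text\n\n---\n"    becomes "Text\n\u00AD  \n\n---\n"
```

The padded string is what would be handed to `converter.makeHtml`.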

However, if you then remove the soft hyphens from the output so that they don't appear in code snippets, some newlines at the end of paragraphs will no longer appear, such as the ones introduced by the second replace before horizontal rules. Luckily, we can solve this with one final modification to the HTML. Before we remove all the soft hyphens, replace any soft hyphen just before a closing </p> tag with a nonbreaking space. Since code snippets will have already had occurrences of </p> transformed into &lt;/p&gt;, this shouldn't affect them.

// Both regular expressions start with a soft hyphen (U+00AD)
html.replace(/\u00AD ? ?(?=<\/p>)/g, '&nbsp;')
    .replace(/\u00AD ? ?/g, '');

In my limited testing, I've found this to give the results I want.
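For reference, the HTML cleanup step can be sketched as a standalone function. `stripSoftHyphens` is a hypothetical name, and the sample HTML in the comments is made up for illustration rather than taken from real showdown output.

```javascript
// Hypothetical helper for the post-conversion cleanup described above.
function stripSoftHyphens(html) {
  return html
    // A soft hyphen (plus up to two spaces) right before </p> becomes &nbsp;,
    // so the trailing empty line keeps some visible content.
    .replace(/\u00AD ? ?(?=<\/p>)/g, "&nbsp;")
    // Every remaining soft hyphen (plus up to two spaces) is removed.
    .replace(/\u00AD ? ?/g, "");
}

// "<p>Hello<br />\n\u00AD  <br />\nThere</p>" becomes "<p>Hello<br />\n<br />\nThere</p>"
// "<p>Text<br />\n\u00AD</p>"                 becomes "<p>Text<br />\n&nbsp;</p>"
```

Note that `String.prototype.replace` returns a new string, so the result has to be assigned or returned for the cleanup to take effect.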