developit / snarkdown

:smirk_cat: A snarky 1kb Markdown parser written in JavaScript
http://jsfiddle.net/developit/828w6t1x/
MIT License

newlines transformed to <br /> #11

Open gribnoysup opened 7 years ago

gribnoysup commented 7 years ago

I'm just curious: shouldn't multiple newlines be transformed to paragraphs, not line breaks? Was this intended for some reason?

developit commented 7 years ago

That could be useful, and actually the block processor seems to be already set up to handle this. Worth looking into.

gribnoysup commented 7 years ago

Yay! I'll take a look ;)

patrickpietens commented 7 years ago

+1

woudsma commented 7 years ago

+1

ollicle commented 7 years ago

This is a surprising omission for a Markdown parser, but perhaps nearly a feature. I have long thought markdown for strictly inline markup would be useful. For example when the output is to be contained in an existing template paragraph. A fork to remove the support for headings and Snarkdown would be it…

developit commented 7 years ago

@ollicle what about code blocks and quotes?

nicobrinkkemper commented 7 years ago

Soo.. paragraphs are not supported?

developit commented 7 years ago

currently they are transformed to line breaks.

Jonarod commented 7 years ago

I know it will not correctly solve the issue, but as a fallback, in case other people run into the same problem: I ended up forcing <p></p> and <br> before the snarkdown conversion. As snarkdown accepts HTML and passes it through, it should render correct paragraphs. It goes like:

var myMarkdown = 'I am a line but I should be in my own paragraph\n\n*Here* starts another paragraph, and here is a line break\nI come**after**the line break but still in the same paragraph\n\n'

// replacing double lines \n\n with </p><p> then single lines \n with <br>
// (replace() returns a new string, so the result has to be assigned back)
myMarkdown = myMarkdown.replace(/\n\n/g, "</p><p>").replace(/\n/g, "<br>")

// Now converting to HTML and enclosing the output with <p></p>.
// snarkdown preserves previously forced HTML tags
var html = '<p>' + snarkdown(myMarkdown) + '</p>'

// Do whatever with the variable "html"

All in all, this creates empty trailing <p></p> tags at the end of the output... but still I have correct markdown for really cheap thanks to snarkdown... with an empty p tag at the end: I guess I can live with that until a better solution comes :)

EDIT: This works only in some contexts. The "hack" breaks quite quickly when we try to use basic block markdown like h1 etc... It will mess up the whole thing. Use it only for inline markdown needs.
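
For anyone trying this fallback, here is a minimal sketch of the same idea that avoids the trailing empty <p></p> by trimming the input first. The name `forceParagraphs` and the `render` parameter are hypothetical: `render` stands in for whatever Markdown-to-HTML function you use (for example snarkdown), and the sketch assumes it passes raw HTML through unchanged. The same caveat applies: it is only reasonable for inline markdown.

```javascript
// Sketch only: `render` is an assumed stand-in for a Markdown renderer
// (e.g. snarkdown) that leaves raw HTML tags in its input untouched.
function forceParagraphs(markdown, render) {
  const prepared = markdown
    .trim()                                // drop leading/trailing blank lines
    .replace(/(?:\r?\n){2,}/g, '</p><p>')  // blank line(s) become a paragraph boundary
    .replace(/\r?\n/g, '<br>');            // any remaining newline becomes a line break
  return '<p>' + render(prepared) + '</p>';
}
```

Usage would look like `forceParagraphs(myMarkdown, snarkdown)`.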

Jonarod commented 7 years ago

I have been through the code trying to find a way to sneak into it and make support for <p></p> tags.

I tried these options:

  1. Create a new regex that would basically match any paragraph: everything that starts with something other than the common markdown blocks (blockquotes, lists, fences...). But this inevitably leads to a loop, as hello is a paragraph returning hello, which in turn matches a paragraph as well... Actually, for future reference, this crazy regex would correctly match paragraphs:

    /(^(?![-*+] |[-_*:`=]{3,}|\>|\t|\ {4}|\s|\||#{1,6}\s|\d+(?=\. )|<)(?:[\w\W])+?(?=\n(?:\n|\>|[-+*]\ .|\*{3,}\n|#{1,6}\s|[`:]{3,})|(\n[=-])))/gm

    Here I put together a test to see it in action with several edge cases.

  2. After that, I tried to stick with the regex approach, but using some conditionals to prevent further parses... But I found myself in the situation where _hello_ would output <p>_hello_</p> without parsing nested content.

  3. I changed my view on regex, and tried to do something AFTER snarkdown's output. I saw a bunch of <br /> that might be used as an entry point for <p>. Here again, I can't seem to find a good path to adding correct <p></p> tags. After the HTML is produced, I could barely surround "lonely" text with <p></p>, and even then I was left with lone <em>, <strong> and inline friends hanging. It would have been easier with some AST to tell where blocks/inline/text start or end.

Now I am new to javascript and programming in general, so I may have missed some BIG OBVIOUS scheme: @developit what do you think would be the best approach here?

You say:

That could be useful, and actually the block processor seems to be already set up to handle this. Worth looking into.

What did you mean? Any help would do :)

RobbieTheWagner commented 6 years ago

I would love for this to support paragraphs. This breaks all the formatting of everything. I get a ton of newlines I do not want in my output.

pReya commented 5 years ago

Just another +1 here. Would really love to have support for proper paragraphs. Love the minimal approach of this library, but now I need to switch to larger parsers, just because I need support for <p> tags 😢

dangzo commented 5 years ago

That's my solution to the problem (taking inspiration from the comments on this issue):

// Wrap each blank-line-separated chunk in its own <p>.
// No outer <p> wrapper: paragraphs cannot be nested in HTML.
let html = '';
contentMd.split('\n\n').forEach(mdChunk => {
    html += `<p>${snarkdown(mdChunk)}</p>`;
});

Works well in most cases (as far as my testing went). Conversion of a single newline into <br /> is still not supported. So, something like:

Hello,
I'm in a new line

Would still output:

<p>Hello, I'm in a new line</p>
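
If the single-newline case matters, one possible extension is to render each line of a chunk separately and join the lines with <br />. This is only a sketch under assumptions: `paragraphsWithBreaks` is a hypothetical name, `render` stands in for a Markdown renderer such as snarkdown, and because every line is rendered on its own, inline markup that spans a line break and multi-line blocks (lists, code) will not survive it.

```javascript
// Sketch only: `render` is an assumed stand-in for a Markdown renderer
// such as snarkdown. Intended for plain paragraphs of inline markup.
function paragraphsWithBreaks(markdown, render) {
  return markdown
    .split(/(?:\r?\n){2,}/)               // blank line(s) separate paragraphs
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)  // skip empty chunks, so no empty <p></p>
    .map((chunk) => {
      // render line by line, then turn each single newline into <br />
      const lines = chunk.split(/\r?\n/).map((line) => render(line));
      return `<p>${lines.join('<br />')}</p>`;
    })
    .join('\n');
}
```

Called as `paragraphsWithBreaks(contentMd, snarkdown)`, the two-line example above would end up in one paragraph with a <br /> between the lines.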

manix84 commented 5 years ago

+1

kwiat1990 commented 3 years ago

I think that br is not what one expects. I mean, plain text should be enclosed in a proper HTML tag, in this case a pair of <p></p>. Otherwise we damage the markup's semantics. I would love to see this enhancement, as there is absolutely no alternative to snarkdown in terms of size.

EDIT: Sadly, your solution produces empty paragraphs in my markdown output.

aradalvand commented 3 years ago

I mean, plain text should be enclosed in a proper HTML tag

Totally agree, I was disappointed to see that Snarkdown doesn't convert "markdown paragraphs into HTML paragraphs"! You'd certainly expect that by default.

kwiat1990 commented 3 years ago

I have come up with something like this:

function snarkdownEnhanced(markdown) {
  return markdown
    .split(/(?:\r?\n){2,}/)
    .map((l) =>
      [" ", "\t", "#", "-", "*", ">"].some((char) => l.startsWith(char))
        ? snarkdown(l)
        : `<p>${snarkdown(l)}</p>`
    )
    .join("\n");
}

With the addition of ">" (markdown for a blockquote) all the redundant paragraphs are gone. So basically lists and blockquotes won't produce empty paragraphs anymore.

valmormn commented 3 years ago

Soo.. paragraphs are not supported?

Fuck that shit

swyxio commented 2 years ago

Just to help out anyone finding this issue, kwiat1990's solution above worked, where Jonarod's solution breaks with H tags or code blocks

mesqueeb commented 2 years ago

The following works, but it will break when a sentence starts with a bold word. I was able to improve it by adding some spaces inside the [].some check:

BEFORE

function snarkdownEnhanced(markdown) {
  return markdown
    .split(/(?:\r?\n){2,}/)
    .map((l) =>
      [" ", "\t", "#", "-", "*", ">"].some((char) => l.startsWith(char))
        ? snarkdown(l)
        : `<p>${snarkdown(l)}</p>`
    )
    .join("\n");
}

Changed "-", "*", ">" to "- ", "* ", "> "

AFTER

function snarkdownEnhanced(markdown) {
  return markdown
    .split(/(?:\r?\n){2,}/)
    .map((l) =>
      [" ", "\t", "#", "- ", "* ", "> "].some((char) => l.startsWith(char))
        ? snarkdown(l)
        : `<p>${snarkdown(l)}</p>`
    )
    .join("\n");
}
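
The prefix check can also be written as a single regular expression, which makes it easier to cover ordered lists, "+" bullets and code fences as well. The following is a sketch under assumptions, not something taken from snarkdown itself: `BLOCK_START` and `wrapParagraphs` are hypothetical names, the list of block prefixes is a guess at what should not be wrapped, and `render` stands in for snarkdown. One known limitation: a fenced code block that contains a blank line is still split into separate chunks.

```javascript
// Assumed list of prefixes that start a block-level construct: indentation,
// headings, bullet items, ordered items, blockquotes, rules, code fences.
const BLOCK_START = /^(?:[ \t]|#{1,6}\s|[-*+]\s|\d+\.\s|>\s?|-{3,}|```)/;

// Sketch only: `render` is a stand-in for a Markdown renderer such as snarkdown.
function wrapParagraphs(markdown, render) {
  return markdown
    .split(/(?:\r?\n){2,}/)                  // blank line(s) separate chunks
    .filter((chunk) => chunk.trim() !== '')  // skip empty chunks
    .map((chunk) =>
      BLOCK_START.test(chunk)
        ? render(chunk)                      // already a block: leave unwrapped
        : `<p>${render(chunk)}</p>`          // plain text: wrap in a paragraph
    )
    .join('\n');
}
```

Because bullets must be followed by whitespace to count as a list item, a chunk that starts with a bold word such as `**Note**` is still wrapped in a paragraph.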