syntax-tree / mdast-util-to-markdown

mdast utility to serialize markdown
http://unifiedjs.com
MIT License

Round tripping double-space line break removes line break #30

Closed NickSto closed 3 years ago

NickSto commented 3 years ago

Affected packages and versions: mdast-util-to-markdown 0.6.5

Steps to reproduce

Input:

test  
break

Code used:

const process = require('process');
const unified = require('unified');
const unifiedStream = require('unified-stream');
const remarkParse = require('remark-parse');
const remarkStringify = require('remark-stringify');

const processor = unified()
  .use(remarkParse)
  .use(remarkStringify);

process.stdin.pipe(unifiedStream(processor)).pipe(process.stdout);

Expected behavior

Expected output:

test  
break

Actual behavior

Actual output:

test\
break

I'm not sure why it's joining these lines with a backslash + line break. That's like the opposite of a break.

I don't think it's an issue with the parser; the tree seems correct as far as I can tell:

{
  type: 'paragraph',
  children: [
    { type: 'text', value: 'test', position: [Object] },
    { type: 'break', position: [Object] },
    { type: 'text', value: 'break', position: [Object] }
  ],
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 2, column: 6, offset: 12 }
  }
}

Is there something I'm missing?

The break handler seems to have '\\\n' hardcoded, and in no case will return what I expect ('  \n'). Am I misunderstanding what a break is supposed to be? If so, why is remark-parse returning it for a double-spaced line ending?

I could apparently fix this by giving it the option {handlers: {'break': _ => '  \n'}} but I assume this isn't the default for a reason?
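To make the override concrete, here is a minimal sketch of how a per-node-type handler table can be replaced for just the break node. It is a stand-in written for illustration, not mdast-util-to-markdown's actual implementation: toMarkdownSketch, serialize, and defaultHandlers are hypothetical names, and only the three node types from the tree above are supported.

```javascript
// Hypothetical stand-in serializer, NOT the real mdast-util-to-markdown code.
// It only knows the three node types that appear in the tree above.
const defaultHandlers = {
  text: (node) => node.value,
  // Default mirrors the reported behaviour: backslash + line ending.
  break: () => '\\\n',
  paragraph: (node, handlers) =>
    node.children.map((child) => serialize(child, handlers)).join('')
};

function serialize(node, handlers) {
  const handle = handlers[node.type];
  if (!handle) throw new Error('Cannot handle node type: ' + node.type);
  return handle(node, handlers);
}

// User-supplied handlers are merged over the defaults, so overriding
// `break` leaves `text` and `paragraph` untouched.
function toMarkdownSketch(tree, options = {}) {
  const handlers = { ...defaultHandlers, ...(options.handlers || {}) };
  return serialize(tree, handlers);
}

const tree = {
  type: 'paragraph',
  children: [
    { type: 'text', value: 'test' },
    { type: 'break' },
    { type: 'text', value: 'break' }
  ]
};

// Default: backslash style.
const withBackslash = toMarkdownSketch(tree);
// Override: two trailing spaces, as in the option quoted above.
const withSpaces = toMarkdownSketch(tree, {
  handlers: { break: () => '  \n' }
});

console.log(JSON.stringify(withBackslash));
console.log(JSON.stringify(withSpaces));
```

Both outputs describe the same tree; only the spelling of the break differs, which is why swapping the handler is enough.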

Environment

mdast-util-to-markdown: 0.6.5
remark-stringify: 9.0.1
remark-parse: 9.0.0
yarn: 1.22.10
node: v14.15.1
OS: Ubuntu 20.04.2

NickSto commented 3 years ago

Okay apparently what I'm missing is that '\\\n' is another, valid syntax for a break?

For some reason my system isn't rendering a <br/> there.* In any case, I feel like the double-space syntax is far more common. An option to prefer that would be nice.

* I'm using @gridsome/transformer-remark, which apparently just uses remark-parse and remark-html. Those do emit the <br/> when I test them in isolation, so I'm not sure why it goes missing in Gridsome.

wooorm commented 3 years ago

Yep, that’s valid markdown:

a\
b

^-- a break.

Before commonmark, you had to use two spaces. But that’s quite a bad idea imo.
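As a rough illustration that both spellings describe the same thing, the sketch below splits one paragraph's source into mdast-like text and break nodes. It is a simplified assumption-laden model, not remark-parse's algorithm: splitHardBreaks is a hypothetical helper, it ignores code spans and other inline constructs, and it treats a line ending as a hard break only when the line ends in two or more spaces or in an unescaped backslash and is not the last line.

```javascript
// Hypothetical helper: a simplified model of hard line breaks.
// Not remark-parse's implementation; inline constructs are ignored.
function splitHardBreaks(paragraph) {
  const lines = paragraph.split('\n');
  const children = [];
  let buffer = '';

  // Emit the pending text (if any) followed by a break node.
  const pushBreak = () => {
    if (buffer) children.push({ type: 'text', value: buffer });
    children.push({ type: 'break' });
    buffer = '';
  };

  lines.forEach((line, index) => {
    const isLast = index === lines.length - 1;
    const spaces = / {2,}$/.exec(line);
    const slashes = /\\+$/.exec(line);

    if (!isLast && spaces) {
      // Two or more trailing spaces: hard break, spaces are dropped.
      buffer += line.slice(0, spaces.index);
      pushBreak();
    } else if (!isLast && slashes && slashes[0].length % 2 === 1) {
      // An odd run of trailing backslashes ends in an unescaped one.
      buffer += line.slice(0, -1);
      pushBreak();
    } else {
      // Soft line ending: stays inside the text node.
      buffer += isLast ? line : line + '\n';
    }
  });

  if (buffer) children.push({ type: 'text', value: buffer });
  return children;
}

const fromSpaces = splitHardBreaks('test  \nbreak');
const fromBackslash = splitHardBreaks('test\\\nbreak');
const softOnly = splitHardBreaks('test\nbreak');

console.log(JSON.stringify(fromSpaces));
console.log(JSON.stringify(fromBackslash));
console.log(JSON.stringify(softOnly));
```

The first two calls yield the same three nodes, while a plain line ending stays inside a single text node.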


Indeed, it’s Gridsome using old remark. You might pass commonmark: true to that version of remark to support it though (being spec compliant used to be an option, before the spec got widely supported). But it might be better to get gridsome to update. Or use remark directly / through something else?