Open tats-u opened 1 year ago
The phrasing is a bit weird in my opinion, “rendered in HTML”, more like: “when compiled to HTML, a soft line break may be shown as a line ending or as a space”.
To recap this issue:
Right?
Compared to your suggestion, I don’t think it’s good to mention deep specs. How about:
- (A soft line break may be rendered in HTML either as a [line ending](https://spec.commonmark.org/0.30/#line-ending) or as a space. The result will be the same in browsers. In the examples here, a [line ending](https://spec.commonmark.org/0.30/#line-ending) will be used.)
+ (A soft line break may be shown by browsers as a [line ending](https://spec.commonmark.org/0.30/#line-ending), a space, or nothing at all. In the examples here, a [line ending](https://spec.commonmark.org/0.30/#line-ending) will be used.)
I’d also personally prefer to be a bit stronger in our markdown spec, and say that we actually specify `\n` -> `\n` (trimmed)?
“when compiled to HTML, a soft line break may be shown as a line ending or as a space”
It is much clearer than the expression in the spec.
CSS was changed to allow browsers to be smarter in some cases
Correct. The first change was introduced in the Working Draft 15 of the Text Module Level 3 in 2011.
Otherwise, if the script context on one side of the line feed is Hangul, then the line feed is converted to a space (U+0020). Otherwise, if the East Asian Width property [UAX11] of both the character before and after the line feed is F, W, or H (not A), then the line feed is removed.
The behavior changed to browser-defined in 2021 because of https://github.com/w3c/csswg-drafts/issues/5086. A strict rule existed in the version just before it. (WebKit-based browsers and IE didn't follow it at all, though.)
As you know, no browser except Firefox has followed it to this day, even though more than 10 years have passed. Firefox changed its behavior in 2008.
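For illustration, the 2011 draft rule quoted above can be sketched in Python. This is only a rough approximation, not what any browser actually runs: the helper names `is_hangul`, `transform_line_feed` and `join_lines` are made up here, Hangul is detected via Unicode character names rather than a real script-context analysis, and the final fallback to a space is an assumption about the part of the rule that is not quoted above.

```python
import unicodedata


def is_hangul(ch: str) -> bool:
    # Approximation (assumption): treat a character as Hangul if its
    # Unicode name mentions HANGUL. A real engine looks at script context.
    return "HANGUL" in unicodedata.name(ch, "")


def transform_line_feed(before: str, after: str) -> str:
    """Return the string a line feed between two characters turns into."""
    # Hangul on either side: the line feed becomes a space.
    if is_hangul(before) or is_hangul(after):
        return " "
    # Both neighbours Fullwidth, Wide or Halfwidth (not Ambiguous):
    # the line feed is removed.
    wide = {"F", "W", "H"}
    if (unicodedata.east_asian_width(before) in wide
            and unicodedata.east_asian_width(after) in wide):
        return ""
    # Assumed fallback for every other case: a space.
    return " "


def join_lines(text: str) -> str:
    """Apply transform_line_feed to every line feed in text."""
    lines = text.split("\n")
    result = lines[0]
    for line in lines[1:]:
        if result and line:
            result += transform_line_feed(result[-1], line[0])
        result += line
    return result
```

Under these assumptions, `join_lines("これは日本\n語の文章で\nす。")` gives `"これは日本語の文章です。"`, while `join_lines("Hello\nworld")` gives `"Hello world"`.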
Some browsers now have different defaults, so this text in the CM spec is no longer correct
We might have to say "The CM spec has ignored the behavior of some browsers." instead. It depends on when the first CM spec before v0.5 (in 2014) was published. Firefox's current behavior has existed since 2008. I don't believe Firefox's change is earlier because Markdown seems to have been born in 2004. At least we can't say "now" because Firefox's change is as many as 15 years old.
Compared to your suggestion, I don’t think it’s good to mention deep specs.
FYI, at first I thought HTML itself had decided the rule and tried to find one in the HTML spec, but I couldn't. Finally I found it in the CSS spec instead. I do not want readers of the CM spec to repeat the same mistake. I want those who want to find the most basic specification to go to the CSS spec first instead of the HTML spec.
(A soft line break may be shown by browsers as a line ending, a space, or nothing at all. In the examples here, a line ending will be used.)
It will be clearer if we split the description in the former sentence into 2 phases:
A soft line break must be converted to (rendered as) a line ending or a space in HTML. In the examples here, a line ending will be used. A line ending in HTML is rendered as a space or simply removed by browsers.
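The first phase (Markdown to HTML) could be illustrated with a minimal sketch like the one below. The function `render_paragraph` and its `softbreak` parameter are hypothetical names used only for this example, not the API of any existing CommonMark implementation; the second phase (how the browser displays the resulting line ending) is not modeled here.

```python
import html


def render_paragraph(lines: list[str], softbreak: str = "\n") -> str:
    """Render the lines of one Markdown paragraph as an HTML <p> element.

    `softbreak` (hypothetical option) is the string emitted for each
    soft line break: a line ending by default, or a space.
    """
    # Spaces at the end of a line and at the start of the next are dropped.
    body = softbreak.join(html.escape(line.strip()) for line in lines)
    return f"<p>{body}</p>"
```

With the default, `render_paragraph(["foo", "baz"])` yields `<p>foo` and `baz</p>` on two lines; with `softbreak=" "` it yields `<p>foo baz</p>`.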
at first I thought HTML itself had decided the rule
For HTML, it’s all “inter-element whitespace”. CM cares about HTML, not really about CSS.
I do not want readers of the CM spec to repeat the same mistake.
Can you put this “mistake” into concrete words? What are you worried about that other people might do?
I want those who want to find the most basic specification to go to the CSS spec first instead of the HTML spec.
What?
It will be clearer if we split the description in the former sentence into 2 phases:
I don’t want to talk about CSS, just the markdown -> html part? I feel like it’s better to not touch on CSS if we don’t need it, and keep it simple?
Can you put this “mistake” into concrete words? What are you worried about that other people might do?
I thought HTML also had a rule for how to render “inter-element whitespace" on the screen and tried to find one in the HTML spec first. I worry that other people looking for such a rule will, like me, comb through the HTML spec first instead of the CSS spec.
What?
Could you tell me what follows after that "What"?
I don’t want to talk about CSS, just the markdown -> html part?
We wouldn't have to mention CSS if the CM spec banned converting the soft line break to anything other than a newline. If it allows converting it to " " or "", we need to encourage developers of formatters (, renderers,) and converters that convert the soft line break to a space or remove it to align those conversion rules with the rendering rules in browsers.
I want the CM spec to mean one of the following two (1. or 2.):
Once we describe the details of "the way browsers display line breaks" in 2-iii, we won't be able to help mentioning the CSS spec.
https://spec.commonmark.org/0.30/#softbreak
The description of a soft line break looked ambiguous or questionable to me.
Does it mean Markdown-to-HTML converters are allowed to convert a soft line break in Markdown to either "`\n`" (or possibly "`\r`" or "`\r\n`") or "` `" in HTML? Generally "may" in specifications means "is allowed to and does not have to" (RFC 2119), and this confused me. I have no idea when "a soft line break is rendered in HTML as a line ending". Is it when `whitespace: preserve` or some other value is set in CSS? Also, is there a case when a soft line break is rendered in HTML as something other than a line ending or a space?
This is wrong. How "`\n`" in HTML is rendered differs among browsers when Chinese or Japanese is contained. https://drafts.csswg.org/css-text-4/#line-break-transform
This means how a soft line break is rendered depends on browsers' implementations.
↓
Only Firefox follows this recommendation as of now. (However, spaces are inserted, as in the other browsers, when the text is copied and pasted somewhere else!)
https://codepen.io/tats-u/pen/YzdKKyN
↑Firefox (intended)
↑Edge (WebKit / Blink / IE; not intended; space after "," is selected)
https://codepen.io/tats-u/pen/poQQVyR (what kind of CJK letters remove a newline between them? → Korean is treated like alphanumeric characters unlike Japanese)
↑Firefox
↑Edge (and other WebKit & Blink based browsers / IE)
https://spec.commonmark.org/dingus/?text=%23%23%20%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%81%A8%E4%B8%AD%E5%9B%BD%E8%AA%9E%E3%81%AE%E4%BE%8B%0A%0A%E3%81%93%E3%82%8C%E3%81%AF%E6%97%A5%E6%9C%AC%0A%E8%AA%9E%E3%81%AE%E6%96%87%E7%AB%A0%E3%81%A7%0A%E3%81%99%E3%80%82%E8%BF%99%E6%98%AF%E4%B8%80%0A%E4%B8%AA%E4%B8%AD%E6%96%87%E5%8F%A5%E5%AD%90%E3%80%82%0A%0A
↑Firefox (looks natural)
↑Edge (WebKit / Blink / IE; doesn't look natural)
From these results, we can conclude only that "`\n`" between Chinese or Japanese letters (han/kana) or punctuation marks is removed instead of being replaced with "` `" in Firefox.
Also,
This sentence must be replaced with something like: