erusev / parsedown

Better Markdown Parser in PHP
https://parsedown.org
MIT License
14.8k stars 1.12k forks source link

Broken Headlines in the latest version #265

Closed bastianallgeier closed 9 years ago

bastianallgeier commented 9 years ago

Unfortunately the latest version comes with this pretty massive bug:

##heading

is being converted to

<p>##heading</p>
erusev commented 9 years ago

@bastianallgeier

This is expected. Headers need a space between the # and the text. This is dictated by the CommonMark spec. You are the second person to complain about it. Perhaps I should change it back. Requiring that space doesn't seem to add much value.

rhukster commented 9 years ago

Even in the not-so-specific Gruber spec, they have spaces:

http://daringfireball.net/projects/markdown/syntax#header

IMHO, I don' think you should support 'spaceless' header tags.

erusev commented 9 years ago

@rhukster

True, but perhaps a huge number of Markdown texts don't have these spaces.

bastianallgeier commented 9 years ago

I didn't even find it myself. I added it to Kirby's develop branch a couple hours ago and already got a ticket for it. I get that it's officially not documented that way, but I'm sure that lots of people write it like that.

erusev commented 9 years ago

@bastianallgeier

I believe that to be true. I'll release a patch within the next 10 min.

rhukster commented 9 years ago

True, I guess it depends on how close to the CommonMark spec you are trying to get. Is it possible to have a toggle/option at some point to enforce markdown or legacy?

bastianallgeier commented 9 years ago

@erusev awesome! I think it's a lot about usability in this case, so I'm glad you see it this way as well.

bastianallgeier commented 9 years ago

@rhukster @erusev I wouldn't add too many "toggles". They add complexity and probably negatively affect the performance. Where's the harm if it does follow the specs but also is slightly more forgiving?

erusev commented 9 years ago

The release is out. You can update now.

erusev commented 9 years ago

@rhukster

It is possible. It could be just one toggle. I'm considering it.

cebe commented 9 years ago

The only conflict I see with this, is that if you write markdown like this:


- #123 abc
- #456 def

you will get headlines instead of issue numbers...

rhukster commented 9 years ago

Well in this case the CommonMark explicitly says:

The opening sequence of # characters cannot be followed directly by a nonspace character.

http://spec.commonmark.org/0.14/#atx-headers

If the spec didn't mention it, then it would be fine. But the whole point of CommonMark is to tidy up issues related to lack of detail in the original spec.

I know changes like this could potentially break lots of people's content, hence the idea for the toggle. This forgiving change actually makes Parsedown less CommonMark compatible.

rhukster commented 9 years ago

Here's an even better CommonMark spec quote:

A space is required between the # characters and the header’s contents. Note that many implementations currently do not require the space. However, the space was required by the original ATX implementation, and it helps prevent things like the following from being parsed as headers:

naNuke commented 9 years ago

Why is this an issue that needs to be solved by parsedown, if the goal is to follow commonmark, then it should, users who cant spend time to fix their markdown files and add spaces after # (which isnt that hard) can just extend the parser and remove that rule?

erusev commented 9 years ago

@rhukster

I agree. I believe that Parsedown can be both tolerant to existing Markdown texts and strict regarding rules like the one we discuss here. Hopefully, the toggle in question would not require adding too much complexity.

bastianallgeier commented 9 years ago

@erusev if the toggle is easy to add and doesn't add much complexity, it's probably the best way to please us all :) Personally I think the power of Markdown has always been the focus on writers — even those without a technical background. The specs are a great tool to tidy stuff, but in the end it's all about the users/writers.

erusev commented 9 years ago

@naNuke

I believe that before Parsedown introduces changes that could break existing texts, it should at least provide that extension that would make the existing texts work. I'm in favour of compliance, but I believe that it should be done without breaking things. I hope that you'd agree.

naNuke commented 9 years ago

@erusev

I completely agree, as long as compliance is not given up in favor of supporting every single possibility of different markup style (which can lead to more problems than it is worth), thats what the spec is for anyway. Maybe in future, things like this could be unified under single "strict" setter, where strict parsing would mean fast and reliable while "non-strict" might be slower and possibly broken in some edge-cases.

erusev commented 9 years ago

@naNuke

I'm glad to hear that. A "strict" setter sounds good.

rhukster commented 9 years ago

I like it :+1: