RobertDober / earmark_parser

The Markdown to AST part of Earmark.
Apache License 2.0
68 stars 26 forks source link

Treat - as regular text unless it's in a title #24

Closed sionide21 closed 4 years ago

sionide21 commented 4 years ago

This fixes a bug where "--" parses and an empty paragraph with a warning.

The "--" case is pretty well agreed upon as regular text

The trickier one is the "-" case. It's just about tied between whether it should be regular text or an empty list. I made it regular text so I could get a PR up and discussion started, but am open to making it an empty list if that would be better.

sionide21 commented 4 years ago

I excluded "=" from being handled the same way because of this test, but I'm not sure they should be treated differently. Is that actually a case that should fail? There doesn't seem to be consensus what should happen.

RobertDober commented 4 years ago

First of all thank you for such a well documented and motivated PR.

Dashes

I agree with the -- case and would personally also agree on the - case, however, and notwithstanding that it is a major suffering in a lower part, backwards oriented part of the human body....

... I am trying hard to get GFM compliance (e.g. #9), because we (ex_doc and YHS) feel that this is what folks expect nowadays.

However I am all pragmatic about such edge cases and do not even know what an empty list is good for (getting a bullet w/o an HTML Entity, utf/8 characher)?

So please feel free to handle the single dash as you please (I might change it back later, depending on feedback)

Equal Sign

I have inherited this test from Dave, never given it a thought

Given this https://babelmark.github.io/?text=A%0AB%0A%3D maybe it makes sense to issue an error after all.

Is there any value in treating = as regular text?

sionide21 commented 4 years ago

I don't have a use case for the = at all so I'm totally fine leaving it. I really only noticed it because I had to specifically add level: 2 to my pattern match which felt a bit odd to make a special case for one and not the other.

If we're trying to follow GFM, they call a single - a list with one empty item. It also appears from their spec that giving precedence to the block operator is part of their philosophy which is consistent with making this a list.

RobertDober commented 4 years ago

Ok does this mean you want do change your PR before merging? If not I'll take a closer look ASAP.

Thanks again for your support

sionide21 commented 4 years ago

I'll try and get it rendering as a list this evening.

sionide21 commented 4 years ago

I'm working on making it a list and the code is starting to get pretty nasty. Since this is such an edge case, I'm tempted to leave it as is until someone complains.

RobertDober commented 4 years ago

Great @sionide21 I"ll merge it then

RobertDober commented 4 years ago

I really only noticed it because I had to specifically add level: 2 to my pattern match which felt a bit odd to make a special case for one and not the other.

Well I guess it is ok as the scanner surely does not have the context of knowing how a = or - will be used semantically what might have puzzled you, and I always, have to think of the regretted, late Jim Weirich, when talking about this issue:

The token is just miss-named, as we do not know if it will eventually be a Setext or not.

So you are completely right that the discrimination in the pattern match should be on hypothetical tokens Line.Dashes and Line.Equals

But please be aware that Earmark was created in an incredible short time by an incredible gifted man who was (and is) incredibly busy with many things,...