xijo / reverse_markdown

Ruby gem to convert html into markdown
Do What The F*ck You Want To Public License
627 stars 118 forks source link

Hyphen(-) issue #62

Open cainianhua opened 8 years ago

cainianhua commented 8 years ago

Hyphen(-) in markdown should be show as a list, i think we should escape hyphen(-) to "-", otherwise after converted, hyphen(-) will be show as a list, but i expected only a normal hyphen(-), not list.

xijo commented 8 years ago

Hello @cainianhua, thanks for your report.

I'm not sure I got your problem right. Can you describe it with some code examples please?

For me the following looks like the expected behavior:

> ReverseMarkdown.convert 'foo-bar' => "foo-bar\n\n"
> ReverseMarkdown.convert '<li>foo-bar</li>' => "- foo-bar\n"

Don't you agree?

cainianhua commented 8 years ago

Sorry for that, i lost a important condition, Hyphen(-) should be in the beginning, like that:

ReverseMarkdown.convert('-test') #=> -test\n

I expected the result should be \\-test\n in this case, otherwise, it will show in page like this:

<ul>
    <li>test</li>
</ul>

But I only want to show as original value, like that:

-test

Thanks for your helps.

xijo commented 8 years ago

@cainianhua That sounds more like a problem of the markdown implementation you're using. In Redcarpet (a very common ruby implementation) the following works quite like expected:

irb(main):004:0> Redcarpet::Markdown.new(Redcarpet::Render::HTML).render('-test')
=> "<p>-test</p>\n"
cainianhua commented 8 years ago

Maybe you misunderstood my meanings, the codes you showed is making markdown to html. Actually, I just have a html fragment, like this:

<div><p>-test</p></div>

It displays the words of -test in screen.

After called convert to markdown:

ReverseMarkdown.convert('<div><p>-test</p></div>') #=> -test\n\n

It will display as list, like that:

<ul>
    <li>test</li>
</ul>

Not the words of -test, they are not equivalent, I only want to show this case, not sure is a problem/issue.

xijo commented 8 years ago

Hello @cainianhua

I understand your example, but I think you're jumping to the wrong conclusion.

Which markdown converter do you use to transform -test\n\n back into HTML?

In Redcarpet (https://github.com/vmg/redcarpet) the result would be what you expected: <p>-test</p>\n

I think the markdown converter you're using has a bug and shouldn't create a list from the given string.

Do you agree with my analysis or am I missing something?

sdhull commented 5 years ago

Test string should be <div><p>- test</p></div> (with a space between - & test). This issue makes perfect sense to me.

xijo commented 5 years ago

Hi @sdhull,

please let me know what the expected result for the input you provided is.

You won't be able to get the exact HTML that you provided in a full round trip though. The <div> will always be lost, because markdown has no representation for it.

So from my point of view this conversion looks OK:

ReverseMarkdown.convert "<div><p>- test</p></div>"
=> "- test\n\n"

If the HTML contains markdown formatted content, like in this example the list item, this content will inevitably be transformed at some point. Even if we managed to spot those cases and escape them, markdown renderers would not care anyway:

md.render "\- foobar" # use redcarpet
=> "<ul>\n<li>foobar</li>\n</ul>\n"
sdhull commented 5 years ago

Ohhh I see. Well in that case, I'd suggest closing this issue.

It might be worth adding a "Limitations" section to the readme pointing out that if the HTML you're converting to markdown has markdown-formatted text (text that would be interpreted by markdown parsers as lists, headers, hr's, etc), then a "roundtrip" (eg converting from html to markdown back to html) will inevitably produce a different result than what you started with.

On a somewhat related tangential note, I'd love to be able to turn off escaping of node text altogether (considering it only escapes * & _, it cannot be considered complete and as you point out -- redcarpet ignores escaping in some circumstances anyway). But that's another issue. I could submit a PR for it even.

xijo commented 5 years ago

@sdhull Sounds good, I'd be happy to merge any PR to improve documentation.