Closed ncphillips closed 6 years ago
I have a fix that works, and that I will go with in my fork, but I'm not sure it's the right fix for the library.
It involves changing the signature of the open
and close
functions for marks to accept the node
.
*
characters have special meaning in Markdown, even when not at the start of a line, so I don't see how not escaping them is an option. Also, the CommonMark parser should parse \*
to become *
, so the escaping should be harmless.
The issue where it escapes things in inline code was a different problem, which I've fixed in eff7162. I do not understand what the problem in tables is.
Could you start by describing the problem concretely, not just with examples? Can you show it breaking in this demo?
Markdown serializer's don't allow you to write Markdown inside of table related tags, so the asterisks do not need to be serializer.
Steps to reproduce in that demo:
<table><tr><td>Example *</td></tr></table>
into the wysiwyg<table><tr><td>Example \*</td></tr></table>
The text output by the wysiwyg renders to:
Example \* |
Are you pasting that as HTML or as text? In any case, I'm seeing the markdown parsing unescape the asterisk for both, so the backslash doesn't show up in the wysiwyg.
I'm pasting the HTML string <table><tr><td>Example *</td></tr></table>
into the wysiwyg editor, and the asterisks is escaped is in the raw editor:
The asterisk should not be escaped. When it is, the \
is visible in the built site.
This is the case for the markdown parsers for Github, Hugo, Jekyll, and VuePress.
The asterisk should not be escaped.
I think that's debatable.
When it is, the \ is visible in the built site.
I think that counts as a bug (or at least, a CommonMark non-conformance) in your Markdown renderer.
I suppose that could be true. Unfortunately, I have not found a Markdown renderer that does conform to that practice, and our users won't accept that as an answer. I'm okay with closing this PR, as we have a fork of the serializer that handles the case.
markdown-it is a good CommonMark implementation in JS (also drives the parser in prosemirror-markdown). Not sure about other languages. Going to close this, then.
Unfortunately I don't have control over which markdown parser our client's use. We just output the Markdown, and they use Jekyll/Hugo/VuePress/etc. to generate the HTML.
That said, markdown-it also appears to not want asterisks to be escaped when inside of table cells:
Ahh, I see what's going on now. prosemirror-markdown doesn't support inline HTML, since there's no way to represent that in the basic schema, but standard Markdown/CommonMark will treat that asterisk as being inside of a tag, and thus treat it as HTML, not Markdown text.
The proper solution would, I suppose, be to add inline and block HTML tags to your schema, and extend your markdown serializer and parser to handle them. But that's not quite trivial, since you also have to find a solution for the way users will edit such nodes in the wysiwyg view (using a node view with an embedded editor, maybe).
Since introducing the prosemirror editor to Forestry.io, we have been super happy. There has been a few illusive bugs around the
*
character. This PR introduces the fix for those bugsBug Fixed Examples
When in Text Node contains an HTML Table
This was the troublesome case that brought this bug to our attention:
Output was:
Output now:
When marked up as
code
Output was:
*foo
\*
Output now:
*foo
*
Non-Regression Examples
The following examples are changes in output from prosemirror-markdown, but not changes in intended behaviour for the output site.
Inside text node but after a new line
Prosemirror Node:
Output was:
Test ***
Output Now:
Test ***
Bad Horizontal Rules
Output was:
--*
Output now:
--*
Problems with this solution
I believe this solution fixes more bugs then it creates. It does however create one bug for sure.
This is frustrating, but right now we think this is the more acceptable bug for Forestry.
* at end of strong text
Output was:
My Text *
Output now:
**My Text ***
Future Work
Along the lines of #16, I think the ideal solution to the problems introduced by this PR would be to handle those edge cases by switching the opening and closing characters depending on the circumstance.
For example: If the bold text starts or ends with
*
then use__
instead.I am currently trying to change our markdown tokens to work this way.