Closed by KamilaBorowska 6 years ago
Yes, this is already being worked on in chat.js. We'll port it to the client once it's good enough.
I mean, the issues. We're not going to take Markdown format for various reasons, mainly that Markdown is not designed for end users, and has a lot of snags when dealing with end users (the biggest one relevant to us being that it makes ascii art impossible).
> (the biggest one relevant to us being that it makes ascii art impossible)
Not sure how that differs from what we have now. At least with Markdown you can sorta escape characters if they end up being metacharacters.
Okay, that was unclear. Let me try again.
We will fix: issues involving code in URLs and code blocks not being escaped, issues involving nesting formatting being weird
We will not have: single-character formatting markers, like _text_ for italics
Markdown was designed for programmers. It was designed for everyone to read, but it was designed to be written by programmers, not the general public.
Reddit is the most infamous example of Markdown misuse: line breaks don't appear, comments starting with things like "52." are automatically converted to "1." (because Markdown renumbers lists), and hashtags get converted to titles.
GitHub has GitHub Flavored Markdown, which fixes some of these issues.
But ASCII art is still a problem. ¯\_(ツ)_/¯ still needs two escapes. And you should not expect users to know how to escape text.
Are multi-line code blocks being considered for this? They'd be helpful for techcode and dev, so Hastebin and its ilk wouldn't be needed for short chunks of code that aren't one-liners.
PS does not currently support multi-line messages, and changing that would be difficult, I think.
PS does accept multi-line commands, though, so we could make something like !code [code here that spans multiple lines]
Edit: it could be collapsed by default, using <summary> and <details> tags.
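As a rough illustration of the collapsed-by-default idea, here is a minimal sketch. It is not the implementation from any pull request; `escapeHTML` and `renderCollapsedCode` are hypothetical names, and the "Code (N lines)" summary text is an assumption.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHTML(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer: wrap multi-line code in a <details> element so it
// stays collapsed until the reader clicks the <summary> line.
function renderCollapsedCode(code) {
  const lineCount = code.split('\n').length;
  return `<details><summary>Code (${lineCount} lines)</summary>` +
    `<pre><code>${escapeHTML(code)}</code></pre></details>`;
}
```

The code is escaped before it is embedded, so user-supplied text can't inject markup of its own.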
I'm okay with !code for a multi-line code block
Pull request for !code: https://github.com/Zarel/Pokemon-Showdown/pull/3802
Really, this is why PS leaves single symbols alone - casual users are not going to know how to type the things they want to type. a*b*c = d et al. should really not be turned into italics or whatever.
CommonMark Example 333 is a good example of why I don't want Markdown: http://spec.commonmark.org/0.27/#example-333
5*6*78
<p>5<em>6</em>78</p>
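To make the contrast concrete, here is a minimal sketch of the "leave single symbols alone" rule. It is not PS's actual parser: `escapeHTML` and `formatDoubledOnly` are hypothetical names, and only bold is handled.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHTML(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Only a doubled asterisk counts as a formatting marker.
// A lone `*` never matches the pattern, so it is left as literal text.
function formatDoubledOnly(text) {
  return escapeHTML(text).replace(/\*\*(.+?)\*\*/g, '<b>$1</b>');
}
```

With this rule, 5*6*78 and a*b*c = d come out unchanged, while **text** still becomes bold.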
The current chat message parser doesn't deal well with edge cases, so I'm interested in changing it to be more compatible with other Markdown implementations, specifically the CommonMark specification.
The reasons why I want to do so are:
Making it easier to explain in chat how to use chat formatting. For example, let's say somebody wants to explain how to write code (``code``). Currently this involves saying something like that:
When it would be great to just be able to say: ``code``
Avoiding accidental formatting. For instance, currently typing ``__proto__`` will cause unintended proto to appear (with the underscores consumed as formatting). Yes, this will break formatting within code, but I don't think it's a big issue myself; formatting there is more annoying than helpful.
Compatibility with Discord chat formatting. People may be used to how Discord formats text. Discord uses Markdown itself, so it would be great if the same formatting system could be used on Showdown, so there would be no need to switch context.
Improving the situation with edge cases. Currently edge cases are unpredictable with how they will be parsed.
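The ``__proto__`` case above can be sketched as follows. This is a minimal sketch, not the current or proposed parser: `escapeHTML` and `formatWithLiteralCode` are hypothetical names, and mapping __text__ to italics is an assumption made for illustration.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHTML(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Split the message on ``code`` spans. Because the pattern has a capture
// group, split() keeps the span contents at the odd indexes of the result.
function formatWithLiteralCode(text) {
  const parts = text.split(/``(.+?)``/);
  return parts.map((part, index) => {
    if (index % 2 === 1) {
      // Inside a code span: escape only, apply no other formatting.
      return `<code>${escapeHTML(part)}</code>`;
    }
    // Outside code: apply the (assumed) __italics__ rule.
    return escapeHTML(part).replace(/__(.+?)__/g, '<i>$1</i>');
  }).join('');
}
```

Here the underscores inside the code span survive, while __text__ outside of it is still formatted.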
The Markdown implementation of the chat message parser would support the following Markdown features:
Bold (**text**)
Italics (_text_, *text*)
Code (`Code`, ``Code`` and so on)
Backslash escapes (\`, \* and so on)
Entities (&)
Autolinks (<http://example.com>, <example@example.com>)
Blockquotes (>text)
There is no support for links, images and raw HTML. This is intentional, as those features would likely be misused. For instance, link text may be misleading compared to its actual location.
Additionally, the following non-standard Markdown features would be implemented:
__text__
Strikethrough (~~text~~)
Superscript (^^text^^)
Subscript (\\text\\)
[[text]]
Links without <> brackets (http://example.com/)
For the specification of how these features would work, check out http://spec.commonmark.org/0.27/#inlines. Strikethrough and superscript are parsed similarly to the * character in that specification. [[ is parsed similarly to [ in that specification (the shortcut reference links section, to be exact). Links without <> brackets will be parsed just like the current implementation does; I don't see any issue with it.
Subscript is rather tricky, but it will likely involve DWIM code whose purpose is to determine whether you wanted a backslash escape or not, depending on whether a backslash escape would be needed or not. I still need to figure out this part precisely (as far as the specification goes).
This is a proposal. My intent is to keep the number of incompatible changes as small as possible, but it's unavoidable that some edge cases will be parsed differently - after all, one of the reasons to do this is to make the parser more consistent.
A new implementation should be quite fast, as it would be based on a state machine (similar to how programming languages are parsed), parsing every character just once without backtracking. I don't see anything in CommonMark that would specifically prevent doing this in O(N) time.
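As a rough sketch of what a single-pass approach could look like (not the proposed implementation, and far short of CommonMark's rules): `TAGS`, `escapeChar` and `formatSinglePass` are hypothetical names, and the marker-to-tag mapping is an assumption. Openers are written out as literal text and patched into tags only if a closer shows up, so the scanner never re-reads input.

```javascript
// Assumed marker table, for illustration only.
const TAGS = new Map([['**', 'b'], ['__', 'i'], ['~~', 's']]);

function escapeChar(ch) {
  switch (ch) {
    case '&': return '&amp;';
    case '<': return '&lt;';
    case '>': return '&gt;';
    case '"': return '&quot;';
    default: return ch;
  }
}

function formatSinglePass(text) {
  const out = [];   // output fragments; opener slots are patched on close
  const stack = []; // open markers as {marker, slot}; each marker at most once
  let inCode = false;
  let codeSlot = -1;
  let i = 0;
  while (i < text.length) {
    const pair = text.slice(i, i + 2);
    if (pair === '``') {
      if (inCode) {
        out[codeSlot] = '<code>';
        out.push('</code>');
        inCode = false;
      } else {
        codeSlot = out.length;
        out.push('``'); // stays literal unless a closer is found
        inCode = true;
      }
      i += 2;
    } else if (!inCode && TAGS.has(pair)) {
      let depth = stack.length - 1;
      while (depth >= 0 && stack[depth].marker !== pair) depth--;
      if (depth >= 0) {
        // Closer: markers opened after this one stay literal, keeping nesting valid.
        const tag = TAGS.get(pair);
        out[stack[depth].slot] = `<${tag}>`;
        out.push(`</${tag}>`);
        stack.length = depth;
      } else {
        stack.push({marker: pair, slot: out.length});
        out.push(pair);
      }
      i += 2;
    } else {
      out.push(escapeChar(text[i]));
      i++;
    }
  }
  return out.join('');
}
```

Since the stack holds each marker at most once, its depth is bounded by the size of the marker table and the work per character is constant. One known gap in this sketch: text after an unclosed `` is treated as code and left unformatted.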