Open SpookySquidward opened 3 years ago
So for hyphens (-), Clone Hero will substitute any equals sign characters (=) with proper hyphens. This most likely is due to hyphens having actual gameplay effects in other games like Rock Band etc. Technically it has nothing to do with the actual .chart format on that one.
The current solution I'm running with is to add an export option that swaps out these special chars for the correct CH equivalent. It would also save the chart in a different format in the case that a lyric or an event isn't compatible with the .chart format, and warn users to finalise it via the exporter to make it playable.
Moonscraper does technically support Unicode already, it's just an issue with the font/font atlas being used not having an actual glyph registered. Supporting every possible glyph is gonna be too much work to really be worth it, but it still technically writes everything into the save files correctly.
old issue, but I still wish to add info:
Hyphens aren't the only character that gets stripped out in CH, but it doesn't strip them out for no reason. Either they're used in Rock Band charts and need to be stripped out in order to display those charts correctly, or they have some other function.
Hyphens (-) indicate that a syllable should be combined with the next.
Plus signs (+) will connect the previous note and the current note into a slide.
Equals signs (=) indicate that a syllable should be joined with the next using a literal hyphen.
Pound signs (#) are the basic non-pitched symbol.
Carets (^) have a more generous scoring, typically used on short syllables or syllables without sharp attacks.
Asterisks (*) are also used for them in some cases; their exact purpose is unknown.
Percent signs (%) are a range divider marker for vocal parts with large octave ranges.
Section signs (§) are used in Spanish lyric authoring to indicate that two syllables are sung as one. These are replaced by a space in CH, and with a tie character (‿) in RB.
Dollar signs ($) are used in harmonies to tell the syllables they are part of to be hidden. They're stripped out because they were used on standard vocals in one chart for whatever reason.
Slashes (/) have an unknown purpose, but they appear in some charts and must be stripped out, mainly in The Beatles: Rock Band.
Underscores (_) are replaced with a space by CH for the purpose of being able to have a character that does that.
Other special characters work just fine.
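The rules in that list can be sketched as a small lookup. This is a simplified illustration in Python, not Clone Hero's actual code: the names `STRIPPED`, `REPLACED`, and `display_lyric` are hypothetical, and it treats every listed symbol as a plain strip or swap wherever it appears in the text.

```python
# Simplified sketch of the display-side cleanup described in the list above.
# All names are hypothetical; this is not Clone Hero's implementation.

# Symbols that are removed from the displayed lyric.
STRIPPED = "-+#^*%$/"

# Symbols that are swapped for another character.
REPLACED = {
    "=": "-",  # joined with the next syllable using a literal hyphen
    "§": " ",  # Spanish tie marker, shown as a space in CH
    "_": " ",  # explicit space
}


def display_lyric(raw: str) -> str:
    """Return the text a player would see for one lyric event."""
    out = []
    for ch in raw:
        if ch in STRIPPED:
            continue  # drop gameplay-only markers
        out.append(REPLACED.get(ch, ch))
    return "".join(out)
```

Because each character is handled exactly once, a hyphen produced from `=` is not stripped again by the `-` rule.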
Also, the CH Public Test Build doesn't strip out quotation marks in .chart like v.23 does. You can use them there and they'll show up properly. The PTB can also properly parse TextMeshPro formatting tags, though it strips out any that don't match a whitelist. This means that anything between <angle brackets> that does not match this list will be stripped out, including the brackets.
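A rough sketch of that whitelist behaviour, in Python, might look like the following. The whitelist contents are an assumption made for illustration (the real list is not reproduced in this thread), and `ALLOWED_TAGS` and `strip_unlisted_tags` are hypothetical names.

```python
import re

# Hypothetical whitelist, for illustration only; the actual list used by the
# PTB is not reproduced here.
ALLOWED_TAGS = {"b", "i", "u", "color", "size"}

# Anything between a "<" and the next ">" counts as a candidate tag.
TAG_PATTERN = re.compile(r"<([^<>]*)>")


def strip_unlisted_tags(text: str) -> str:
    """Remove every <...> span whose tag name is not in ALLOWED_TAGS."""

    def keep_or_drop(match: re.Match) -> str:
        inner = match.group(1).strip()
        # The tag name is whatever precedes the first "=" or whitespace,
        # ignoring the leading "/" of a closing tag.
        name = re.split(r"[=\s]", inner.lstrip("/"), maxsplit=1)[0].lower()
        return match.group(0) if name in ALLOWED_TAGS else ""

    return TAG_PATTERN.sub(keep_or_drop, text)
```

With this sketch, `<b>Hey</b>` survives untouched, while a span such as `<angle brackets>` disappears along with its brackets.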
(This issue is in response to pull request #43)
The Value of Special Characters
In Moonscraper and the .chart file type, certain characters should not currently be used in global or track events. These most notably include the hyphen (-) and plain quotation marks ("), though I will also mention Unicode characters later. These characters are illegal because of how a lyric event is stored in the .chart file format:
<lyric tick> = E "lyric hello!"
Or more commonly, with syllables separated:
<lyric tick> = E "lyric Hel-"
<lyric tick> = E "lyric lo!"
The quotation marks are used to encapsulate the content of the events, which in this case are lyric Hel- and lyric lo!; they are always removed by Moonscraper when importing a .chart file. The hyphens are used to separate syllables in a word, as in lyric Hel-; they are removed by the program playing a .chart (or .mid) file and are always removed by Clone Hero. If you want to use them in your lyrics to add style, currently you need to find a workaround, like using two apostrophes ('') instead of proper quotation marks; or, you can simply avoid using these characters altogether.
Potential Solutions
1. Parse More Intelligently
Take, for example, the following lyric event:
960 = E "lyric "What?""
We want our lyric event to display as "What?", but instead we simply get What? on screen. Where did our quotes go? Well, we need to remember that quotation marks are automatically removed when our chart file is read. So our reading program (such as Clone Hero) only sees this:
960 = E lyric What?
It is not surprising, then, that our quotes have disappeared. Notice, however, that we have enough information to figure out exactly where we should and should not have quotation marks. Let's simply remove the first and last quotation mark in our event:
960 = E "lyric "What?""
becomes
960 = E lyric "What?"
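As a minimal sketch of this idea in Python, the parser below removes only the outermost pair of quotation marks and leaves any inner ones alone. The function name `parse_event_line` and the returned tuple shape are assumptions made for illustration; this is not how Moonscraper or Clone Hero currently parse events.

```python
def parse_event_line(line: str):
    """Split '<tick> = E "<name> <text>"' into (tick, name, text).

    Only the first and last quotation mark are removed, so quotation marks
    inside the event text survive.
    """
    tick_part, _, rest = line.partition("=")
    rest = rest.strip()
    if not rest.startswith("E "):
        raise ValueError(f"not an event line: {line!r}")
    payload = rest[2:].strip()
    # Strip the wrapping quotes only; inner quotes are part of the text.
    if len(payload) >= 2 and payload[0] == '"' and payload[-1] == '"':
        payload = payload[1:-1]
    # The first word is the event name (e.g. "lyric"); the rest is its text.
    name, _, text = payload.partition(" ")
    return int(tick_part.strip()), name, text
```

Splitting on the first `=` only means an equals sign inside the lyric text would not confuse the tick parsing.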
We can break our line down into parts (960, =, E, lyric, and "What?"), and we will have our event back just as we typed it.
A similar process can be used for hyphens, where we only remove the last hyphen in a lyric event. This can give us the syllables out-, of-, this-, and world without losing the information that these events are syllables.
Technical Note
Quotation marks are actually legal characters in Clone Hero in the .mid file format, and they are correctly saved to both the .chart and .mid formats from Moonscraper. However, if you reopen a .chart file that has extra quotation marks, they will be removed by Moonscraper and will need to be retyped. Hyphens are also saved correctly to the .chart and .mid formats, but they are currently removed in Clone Hero.
2. Add Escape Characters
Escape characters give us another solution to this problem by letting us explicitly say that we want to use a character as-is. In C#, for example, the backslash (\) is an escape character: \" gets interpreted as a quotation mark, \n is a newline character, and \\ codes for the backslash itself. We could use a similar system in the .chart specification, where \- would code for a hyphen, \" for a quotation mark, and \\ for a backslash. This is a more robust solution than character substitution, such as using = to mean -; with substitution we can now use hyphens in our lyrics, but we can't use equal signs, so we have just shifted the problem.
3. Use Other Characters Instead
This is not one of my preferred solutions, but it is worth mentioning because it's what many charters are currently doing. Instead of using plain quotation marks, we can use opening (“, U+201C) and closing (”, U+201D) quotation marks to accomplish a similar style. For hyphens, we can substitute the similar en dash (–, U+2013). These characters are distinct from the hyphen (-, U+002D) and plain quotation marks (", U+0022), so they aren't parsed out when reading the file and should display as expected.
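For illustration, this substitution could be automated along the following lines in Python. The function name `to_typographic` is hypothetical, and the sketch makes two assumptions: straight quotes simply alternate between opening and closing, and a hyphen at the very end of the text is a syllable marker that should be left alone.

```python
def to_typographic(text: str) -> str:
    """Swap straight quotes for curly ones and inner hyphens for en dashes."""
    out = []
    opening = True  # assumption: quotes alternate opening / closing
    last = len(text) - 1
    for i, ch in enumerate(text):
        if ch == '"':
            out.append("\u201c" if opening else "\u201d")
            opening = not opening
        elif ch == "-" and i != last:
            out.append("\u2013")  # en dash in place of a visible hyphen
        else:
            # Includes a trailing hyphen, assumed to be a syllable marker.
            out.append(ch)
    return "".join(out)
```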
Not only does this approach still prevent you from using the original characters you wanted to, it also assumes that every program that will read a chart file understands Unicode characters; if a program doesn't accept Unicode characters, the effects can range from unexpected to totally game-breaking. Moonscraper doesn't support Unicode fully, as shown below:
While Unicode could reasonably be implemented into newer applications, it is not likely to make its way into old code. A Unicode implementation would also be incomplete without giving its characters more real use by supporting multiple languages for a song, a feature that is beyond the scope of this issue.
Verdict
I am of the opinion that Moonscraper will eventually need to support special characters better than it currently does. While an implementation of any of the above solutions comes with its challenges, it will be better for the community in the long run if special characters become officially supported in Moonscraper and the .chart file format.
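To make the escape-character proposal (solution 2) a little more concrete, here is a minimal Python sketch of how a reader might decode one lyric's text. Nothing here is part of the .chart specification: `decode_lyric` is a hypothetical name, and the sketch assumes that a backslash makes the next character literal, that an unescaped hyphen at the very end is the syllable marker, and that unescaped hyphens elsewhere are kept as typed.

```python
def decode_lyric(raw: str):
    """Decode backslash escapes in a lyric; return (text, joins_next)."""
    out = []
    joins_next = False
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            # Escaped character: keep the next character literally.
            out.append(raw[i + 1])
            i += 2
            continue
        if ch == "-" and i == len(raw) - 1:
            # Unescaped trailing hyphen: syllable marker, not displayed.
            joins_next = True
        else:
            out.append(ch)
        i += 1
    return "".join(out), joins_next
```

Under these assumptions, `Hel-` decodes to the text `Hel` with the syllable flag set, while `well\-` decodes to a visible `well-` that does not join the next syllable, and the equals sign stays free for ordinary use.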