SuperFromND / iguana

Golang tool for automatically creating Ikemen GO movelists.
MIT License

Rewrite tokenization to tokenize motions rather than individual inputs #6

Closed: SuperFromND closed 1 year ago

SuperFromND commented 1 year ago

This is more of a memo than a proper bug report, but I wanted to leave a note to myself on how to go about rewriting this.

Iguana's tokenization is a process where it takes a command (such as D, DF, F, a) and converts it into an internal intermediate form (to be exact, a string of characters called tokens, such as 2,3,6,a). This works, but it runs into problems when combined with merge() when multiple commands are assigned to the same move.
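As a rough illustration of per-input tokenization, here is a minimal sketch. It assumes the tokens are plain numpad digits for directions, with buttons passed through unchanged; the `numpad` table and the `tokenizeInputs` name are hypothetical and are not taken from Iguana's source.

```go
package main

import (
	"fmt"
	"strings"
)

// numpad maps MUGEN-style direction names to numpad digits.
// This table is an assumption for illustration; Iguana's real token set may differ.
var numpad = map[string]string{
	"DB": "1", "D": "2", "DF": "3",
	"B": "4", "F": "6",
	"UB": "7", "U": "8", "UF": "9",
}

// tokenizeInputs converts a command into comma-separated tokens, one per input.
func tokenizeInputs(command string) string {
	var tokens []string
	for _, part := range strings.Split(command, ",") {
		// Drop surrounding whitespace and any leading ~ / $ > modifiers.
		input := strings.TrimLeft(strings.TrimSpace(part), "~/$>")
		if input == "" {
			continue
		}
		if digit, ok := numpad[input]; ok {
			tokens = append(tokens, digit)
		} else {
			tokens = append(tokens, input) // buttons pass through unchanged
		}
	}
	return strings.Join(tokens, ",")
}

func main() {
	fmt.Println(tokenizeInputs("~F, D, B, U, x"))
	fmt.Println(tokenizeInputs("~F, D, B, U, ~x"))
}
```

Note that this sketch throws away the release modifier, so both of the commands in `main` produce the same token string; a real implementation would probably want to keep that information.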

Because merge() works on a per-character basis, the result is that individual inputs get appended to each other, and the results can look really messy.

Take, for example, this snippet from Warusaki3's CvS Juni:

[Command]
name = "アースダイレクト1"
command = ~F, D, B, U, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~F, D, B, U, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~D, B, U, F, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~D, B, U, F, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~B, U, F, D, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~B, U, F, D, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~U, F, D, B, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~U, F, D, B, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~F, U, B, D, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~F, U, B, D, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~D, F, U, B, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~D, F, U, B, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~B, D, F, U, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~B, D, F, U, ~x
time = 40

[Command]
name = "アースダイレクト1"
command = ~U, B, D, F, x
time = 40

[Command]
name = "アースダイレクト1"
command = ~U, B, D, F, ~x
time = 40

; ...

[State -1, reppuken]
type = ChangeState
value = 1500
triggerall = !var(59)
triggerall = roundstate = 2
triggerall = Command = "アースダイレクト1" || Command = "アースダイレクト2" || Command = "アースダイレクト3"
; ...

All of those above アースダイレクト1 commands are variations of the 360 motion (2684 in numpad notation), and logically one would just compress all of them down to "do a circle motion then press a button". merge(), however, goes on a per-character basis, and being met with all of those variations (along with the other two sets I've removed from the above snippet) decides instead to concatenate them all together into this hilariously unwieldy mess:

reppuken            _F_+_D_+_D_+_B_+_B_+_U_+_U_+_F_+_F_+_D_+_D_+_B_+_B_+_U_+_U_D_+_B_+_B_+_U_+_U_+_F_+_F_+_U_+_U_+_F_+_F_+_D_+_D_+_B_+_XBB_+_U_+_U_+_F_+_F_+_D_+_D_+_B_+_B_+_U_+_U_+_F_+_F_+_D_+_D_U_+_F_+_F_+_D_+_D_+_B_+_B_+_D_+_D_+_B_+_B_+_U_+_U_+_F_+_F^X

Clearly, this isn't ideal! The solution that IGOCLG used (and that I want to reimplement in Iguana) is to tokenize motions instead. That way, rather than combining every input, Iguana would combine every motion, which should give WAY less absurd results.

I think the best way to approach this from a coding standpoint would be to use Go's strings.Split function to split at every comma, analyze each piece, build up a new string of tokens, and then check for motions before returning.
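The plan above could be sketched roughly like this. It is only an illustration under assumptions: the `directions` and `motions` tables, the motion token names (QCF, QCB, DP, 360), and the `tokenizeMotions` function are all hypothetical, and the real tokenize() may be structured quite differently.

```go
package main

import (
	"fmt"
	"strings"
)

// directions maps direction names to numpad digits (assumed token set).
var directions = map[string]string{
	"DB": "1", "D": "2", "DF": "3",
	"B": "4", "F": "6",
	"UB": "7", "U": "8", "UF": "9",
}

// motions maps a run of direction digits to one motion token.
// The token names here are made up for this sketch.
var motions = map[string]string{
	"236":  "QCF",
	"214":  "QCB",
	"623":  "DP",
	"2684": "360",
}

// tokenizeMotions splits a command at every comma, converts each piece,
// and collapses runs of directions into a motion token when one is known.
func tokenizeMotions(command string) string {
	var tokens []string
	var run []string // direction digits seen since the last non-direction input

	// flush emits the pending run as a single motion token if it is known,
	// otherwise as the individual direction digits.
	flush := func() {
		if len(run) == 0 {
			return
		}
		if name, ok := motions[strings.Join(run, "")]; ok {
			tokens = append(tokens, name)
		} else {
			tokens = append(tokens, run...)
		}
		run = nil
	}

	for _, part := range strings.Split(command, ",") {
		input := strings.TrimLeft(strings.TrimSpace(part), "~/$>")
		if input == "" {
			continue
		}
		if digit, ok := directions[input]; ok {
			run = append(run, digit)
			continue
		}
		flush() // a button ends the current run of directions
		tokens = append(tokens, input)
	}
	flush()
	return strings.Join(tokens, ",")
}

func main() {
	fmt.Println(tokenizeMotions("D, DF, F, a"))
	fmt.Println(tokenizeMotions("~D, F, U, B, x"))
	fmt.Println(tokenizeMotions("~F, D, B, U, x"))
}
```

With a table like this, only the exact 2684 ordering collapses into a single token; runs that are not in the table fall back to individual inputs, so any merging step would still see them input by input.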

SuperFromND commented 1 year ago

Okay, so it appears that Juni's case is a bit more complicated as it uses variants of 360 motions that don't have 100%-equivalent glyphs in IKEMEN (namely 6248 and 4862). Despite this, I rewrote tokenize() anyways since it should be an improvement nonetheless!
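For reference, here is one possible way to recognize all of Juni's orderings as "some circle motion", regardless of starting direction or turning direction. This is a hypothetical sketch, not what the rewritten tokenize() does; `isCircle` is an invented name, and whether such variants should share a glyph is a separate question.

```go
package main

import (
	"fmt"
	"strings"
)

// isCircle reports whether four direction digits form a full circle through
// the cardinal directions, starting anywhere and turning either way.
// Every rotation of "2684" is a substring of "2684"+"2684"; likewise for "2486".
func isCircle(digits string) bool {
	if len(digits) != 4 {
		return false
	}
	return strings.Contains("26842684", digits) ||
		strings.Contains("24862486", digits)
}

func main() {
	// The eight direction orderings from the Juni snippet, in numpad digits.
	variants := []string{
		"6248", "2486", "4862", "8624",
		"6842", "2684", "4268", "8426",
	}
	for _, v := range variants {
		fmt.Println(v, isCircle(v))
	}
}
```

The doubled-string trick avoids listing every rotation by hand, at the cost of only handling circles written as exactly four cardinal inputs.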