mmikeww / AHK-v2-script-converter

AHK v1 -> v2 script converter
https://autohotkey.com/boards/viewtopic.php?f=6&t=25100
The Unlicense
539 stars 42 forks source link

RegExMatch output variable causes incorrect transformations #118

Open Lexikos opened 1 year ago

Lexikos commented 1 year ago

Input:

        if RegExMatch(uri, "^\[url=")
            RegExMatch(uri, "\G[^\]]*", uri, 6)
        else
        {
            MsgBox 1,, URI appears invalid:`n%uri%
            IfMsgBox Cancel
                return
        }

Incorrect output:

        if RegExMatch(uri, "^\[url=")
            RegExMatch(uri[0], "\G[^\]]*", &uri, 6)
        else
        {
            msgResult := MsgBox("uri[0] appears invalid:`n" uri[0], "", 1)
            if (msgResult = "Cancel")
                return
        }

There are multiple issues:

I think a simpler, less error-prone approach would be for uri := uri && uri[0] to be inserted after the RegExMatch call. Appending && uri := uri[0] to the call would also work if the return value is not being stored.

Lexikos commented 1 year ago

Properties with the same name are also erroneously transformed.

Input:

RegExMatch(haystack, regex, output)
x := task.output

Incorrect output:

RegExMatch(haystack, regex, &output)
x := task.output[0]
Lexikos commented 1 year ago

Unrelated variables and parameter names elsewhere in the script (perhaps hundreds of lines away?) are also transformed.

Input:

a(s, r) {
    RegExMatch(s, r, m)
    return m
}
b(m) {
    return m
}

Incorrect output:

a(s, r) {
    RegExMatch(s, r, &m)
    return m[0]
}
b(m[0]) {
    return m[0]
}
safetycar commented 1 year ago

For the maintainers, I will attempt to cleanup and clarify parts about _RegExMatch and ConvertPseudoArray. I'll write back about things that might require more changes.

mmikeww commented 1 year ago

For the maintainers, I will attempt to cleanup and clarify parts about _RegExMatch and ConvertPseudoArray. I'll write back about things that might require more changes.

Thanks for your note, its helpful so people don't do duplicate work. If you work in a branch on your own repo, you can just tag this issue and then we should see a notification here.

Remember to add the associated test cases. You could just use the examples provided by Lexikos

safetycar commented 1 year ago

The current way of updating the names is a simple replacement on a full line. I see it difficult to improve much without some bigger changes.

The changes made are:

But sadly:

To fix those it seems better to try to find a better approach than the current full line replacement, instead of creating more workarounds.

mmikeww commented 1 year ago

partially completed in 69269b3edfdd482e1fe96547f694c25893d9344f

andymbody commented 3 months ago

Unrelated variables and parameter names elsewhere in the script (perhaps hundreds of lines away?) are also transformed.

Input:

a(s, r) {
    RegExMatch(s, r, m)
    return m
}
b(m) {
    return m
}

Incorrect output:

a(s, r) {
    RegExMatch(s, r, &m)
    return m[0]
}
b(m[0]) {
    return m[0]
}

This will be corrected within Masking pull request coming soon (already tested and corrected).

andymbody commented 3 months ago

I think PR #208 will have some affect on this issue. It is still not the full fix, but progress. More work will be needed to address the global scope of the issue outlined in this thread. Possibly a rewrite of how RegexMatch (and other structures are) is handled in general, rather than multiple band-aids (as @safetycar mentioned). The upcoming PR that will introduce class/function/string masking (and separate handling) may come in handy for this as well. It can be extended to include other structures such as IF, Regex, etc, which may provide alternate fronts of attack.

See commit #213 for partial solution to this thread