Sei-Lisa / LSL-PyOptimizer

Optimizes a LSL2 script, folding constants, removing unused code and more. Adds new syntax features too.
GNU General Public License v3.0
35 stars 11 forks source link

Subtraction replaced by addition of a negative value #24

Closed KrsityKu closed 1 year ago

KrsityKu commented 1 year ago

Simplified repro using with Constant folding, and DCR:

integer gPrevious;

default
{
    touch_end(integer num)
    {
        integer now = llGetUnixTime();
        if (now - gPrevious > 15)
        {
            gPrevious = now;
            llOwnerSay("Touch");
        }
    }
}

Results in the following optimized code:

integer gPrevious;

default
{
    touch_end(integer num)
    {
        integer now = llGetUnixTime();
        if (15 < now + -gPrevious)
        {
            gPrevious = now;
            llOwnerSay("Touch");
        }
    }
}

I'm not sure if the conversion of "if (now - gPrevious > 15)" to "if (15 < now + -gPrevious)" is correct, because now there is + and - which each take up a byte each, don't they?

Also, I would excpect the + - to end up being two operations at the lowest level: negation and addition, compared to - just being a single subtraction, which should be faster.

Or is there some weird LSL quirk I am not aware of?

Sei-Lisa commented 1 year ago

Or is there some weird LSL quirk I am not aware of?

I'm afraid so. Subtraction needs a function call and that's five bytes, while addition and negation only take one byte each, so the net gain of using addition + negation vs subtraction is three bytes.

The reason LL used a function call for subtraction is probably because the parameters are pushed in reverse order with respect to the one the subtraction operator expects. That's a quirk of LSL: in binary operations, the right operand is evaluated and pushed to the stack before the left operand.

KrsityKu commented 1 year ago

Does that also mean that the memory is actually counted on some sort of hidden "compiled" version on the LSL code, rather than the plain text?

Sei-Lisa commented 1 year ago

Yes, exactly. In Mono, the code size is measured as the size of the compiled executable, which is a CIL executable. The optimizer only deals with Mono; generally it won't shorten LSO programs. The data size is accounted for separately, using an estimation based on the size of the variables; there's info about data memory usage in the wiki. The total size reported by llGetUsedMemory() is the code size + the data size.

There was a time where the Mono compilation was client-side. It didn't last long, but during some time the compiler was part of the viewer, and since the viewer code is public, we have the compiler. I was able to isolate the Mono compilation part in a standalone executable. I've made that standalone compiler public. You can check it out here:

https://github.com/Sei-Lisa/LSL-compiler

That has allowed me to study how the source is converted to executable code, and to devise strategies to produce shorter code, for example the conversion of subtraction to negation + addition. You can compile it yourself if you're brave and curious enough :)

It's not the whole story, though. After compilation, a code injector inserts code to deal with the saving and restoring of global and local variables and stack status, as appropriate, for the purpose of saving the state and resuming afterwards (e.g. when the sim restarts, or when teleporting). We don't have the source for that code injector, but there's an online article about it, and I've had access to Mono compiled binaries which I have disassembled to look at the executable code generated, which has allowed me to understand how it differs from the viewer version, and to apply more optimizations.

For example, saving/restoration code is injected at every library function call. When the optimizer sees an expression of the form llGetLinkNumber() + 5, it swaps the operands if possible so that the function call is a right operand, because as I mentioned, right hand operands are pushed first. The intention is that the stack is as empty as possible at the time of the function call, because the fuller the stack, the more code it has to inject in order to save/restore it. In the original order, the 5 gets pushed before the function is executed, and when the injection happens due to the library function call, code to restore the 5 in the stack needs to be injected as well. But when the code is written as 5 + llGetLinkNumber(), as the optimizer does, the function will be executed first, therefore the 5 is not yet on the stack, so it doesn't need to be saved and the injected save/restore code is shorter. For the same reason, long sums of strings or lists are parenthesized right-to-left, so that there's at most one element in the stack at a time.

Hope this gives some insight on the optimizations that are performed.

KrsityKu commented 1 year ago

Oh god... what did I get myself into🥲

I'm guessing none of this is documented anywhere, as I've been just using SL Wiki for info, which only covers small parts of stuff like this.

Thanks for the info and all your hard work! Dead code removal and string concatenation combined with a preprocessor are really helping my sanity while scripting😄. I basically use them to double check if the code is being generated the way I expect it, and then do a second full optimization pass after testing for a 'release' version of scripts.

Closing as Won't Fix/Working as Intended

Sei-Lisa commented 1 year ago

There is some info in the wiki, the same that prompted me to investigate more about how these results were obtained and how to get the most out of them:

https://wiki.secondlife.com/wiki/User:Becky_Pippen/LSL_Performance

https://wiki.secondlife.com/wiki/User:Omei_Qunhua#Mono_Code_Size_Measurements

https://wiki.secondlife.com/wiki/User:Pedro_Oval/Mono_code_memory_usage

https://wiki.secondlife.com/wiki/User:Pedro_Oval/Mono_code_memory_usage/CIL

Both Becky and Pedro published CIL code, but neither of them made the compiler available, so I made my own and published it.

Then, a security hole in an LSL function (now patched) allowed me to access random parts of the RAM of the sim, which helped me obtain executable binaries for a few random scripts. I was lucky enough that one of those was one I had written, which let me understand more on the code injector.

As for data usage, there's this:

https://wiki.secondlife.com/wiki/LSL_Script_Memory

HTH

Sei-Lisa commented 1 year ago

Found the link to the discussion about the saving/loading injector, by Scouse Linden (Dave Hiller).

https://davehillier.net/2015/03/11/second-life-mono-internals/

In case the link goes dead, it's archived at:

http://web.archive.org/web/20221001120527/https://davehillier.net/2015/03/11/second-life-mono-internals/

(I leave it here for future reference)