EgonOlsen71 / basicv2

A Commodore (CBM) BASIC V2 interpreter/compiler written in Java
https://egonolsen71.github.io/basicv2/
The Unlicense
84 stars 15 forks source link

Code with integers is slower than with real variables #66

Closed viteisele closed 4 months ago

viteisele commented 5 months ago

Another variant of the basic program for finding the Superzahl of Prof. Weitz (see issue #65):

superzahl_permi93_6r.txt

In issue #65 the performance was improved by replacing real variables by integer. This I applied also to this program:

superzahl_permi93_6i.txt

But then the compiled program is slower: instead of 0,15s it needs now 0,167s. With bbcompiler it is as expected with same files: instead 0,15s it needs 0,1s.

EgonOlsen71 commented 5 months ago

That's most likely because of this calculation:

i%-int(i%/3)*3

BASIC BOSS (bbcompiler is the old BASIC BOSS compiler disassembled and compiled to C with some added runtime) treats calculations with only integers completely as integer, which is actually wrong. MOSpeed only does so, if and when it can be sure that it is safe to do. Take this code for example:

10 a=3000:a%=3000
20 print a*20/30
30 print a%*20/30

The intepreter (and MOSpeed) will print out 2000 both times. BASIC BOSS (and in turn bbcompiler) will print out 2000 and -184.

That's why MOSpeed will convert the i% to float and then does the calculation with floats only. The conversion adds some overhead, which might explain why it is slower than with floats alone. One could argue that it IS safe to do it in integer in this particular case, but MOSpeed can't possibly know, so it reverts to the safe way of doing it.

Not sure if that's all that's going on here. Maybe there's some room for improvement and I'll look into it, but I'm pretty sure that this is the main reason for the observed behaviour.

EgonOlsen71 commented 5 months ago

So in short: BASIC BOSS calculates wrongly, but in most cases, it doesn't matter. MOSpeed doesn't do this, but might be slower in these cases.

EgonOlsen71 commented 5 months ago

I did some minor tweaks to the optimizer, but nothing to write home about. However, I made a version which replaces the d%-array with memory access (aka PEEK/POKE), because arrays are usually slower. I also splitted some calculations to help the optimizer a little bit, because, as mentioned above, it only applies some optimizations when it can be sure that no intermediate results excceds the integer range.

superzahlii.txt

With this, all the former array access code gets down to simple read/write assembly calls like

LDY 836
STY 838

...it doesn't get any faster than this. So we gain enough at these parts of the code to compensate for the more expensive calculations somewhat and so, this version is down to 0.1s with MOSpeed as well. Please also note, that we are so fast now, that we are encountering inaccuracies in the time measurement. TI simply doesn't update fast enough, so depending on when some interrupt kicks in (or not), the time fluctuates between 0.1 and 0.1167.

viteisele commented 5 months ago

Thank you for the explanations about integer optimization. Till now I had this problem only with the compiler Quickbasic 1.0 on PC, but not with bbcompiler on C64. With bbcompiler there is a different behavior between assignments and print command, see below.

Your trick with peek and poke is very interesting and creative, but there are two disadvantages

  1. the code becomes unreadable
  2. there exists are simpler and faster solution with 0,0667s:

superzahlii2.txt

Instead of an array I used normal scalar variables and this seems to be faster than the peeks and pokes. But where arrays are really needed, the trick is good :-)

For bbcompiler I have to modify line 1200. First it was 1200 print 10(10(10(10(10*d1%+d2%)+d3%)+d4%)+d5%)+d6% This code was faster, but the result was negative and wrong. With a additional assignment to a real variable the result was correct.

Perhaps you can write some hints how to take advantage with integer, e. g. it seems that you have optimized the addition/assignment with two operators and probably logical operations (and, or) are always faster (there can be no overflow).

EgonOlsen71 commented 5 months ago

Nice work...!

An addition (or subtraction) of two integers can be optimized, if you can assure that the result is within the 16 bit integer range as well. With something like A%+B%, this is the case if either A% (or B%) is negative or none of the operators is > 2^14. So MOSpeed redirects these calculations to a special add/sub-method that checks, if these conditions are met and if they are, the calculation is being performance in integer arithmetic. If not, it reverts to float. BASIC BOSS doesn't do this, always uses integers and so, depending on the input values, might produce wrong results. If the target variable is integer as well, you can improve this optimization by not converting the result back into a float at all. That's not the case when you do something like Z%=A%+B%-C%, because the intermediate results are allowed to exceed the integer range as long as the end result still fits (another thing that BASIC BOSS simply ignores). That's why splitting the calculations is usually faster.

EgonOlsen71 commented 4 months ago

I'm closing this one, because it's not an actual issue. The integer optimization is constantly improving (especially by challanging tasks as do them, so keep the questions/tests coming), so I'm not sure how much sense it would make to document the way it works. To be honest, there are side-effect that even I don't oversee...:-)

viteisele commented 4 months ago

Thank you for the hints for optimizing integer operations.

Now my fastest version for finding SUPERZAHL with n=9 is 0,767s with MOSpeed :-)