EgonOlsen71 / basicv2

A Commodore (CBM) BASIC V2 interpreter/compiler written in Java
https://egonolsen71.github.io/basicv2/
The Unlicense
84 stars 15 forks source link

C64 crashes with blue screen #65

Closed viteisele closed 5 months ago

viteisele commented 5 months ago

I adapted the Python program from Prof. Weitz for finding the Superzahl (see https://www.youtube.com/watch?v=kO91PkkkD0o at 13:39) to C64 Basic.

superzahl_weitz.txt

Running with C64 it shows the correct result for n=6 (123654 and 321654), but it is very slow (it tries all permutations and uses the power function). Therefore I compiled it with MOSpeed and started it with C64 Emulation (C64 JavaScript Emulator and CCS64). But some seconds after finding the first number the emulated C64 crashes with a blank blue screen.

With a little modification the compiled program runs without crash (btw: a additional int() is necessary since there is a problem with the precision of the power function in Commodore Basic): 1310 gosub 1400: if z-int(z/i)i then return 1410 z=int(d(i-i2+1)10^(i2-1)+z)

I was surprised about the crash because I compiled many optimized versions of the program and the code was always stable. The bbcompiler causes more trouble.

EgonOlsen71 commented 5 months ago

I'll have a look...

EgonOlsen71 commented 5 months ago

What an interesting issue. It was caused by me not correctly setting a rounding flag in some cases. This must have been in the code for almost forever, but it never caused an issue before. Here, it somehow lead to a comparison of 6=6 being false which in turn let to an array out of bounds error (which the runtime doesn't check for) which in turn filled up the whole RAM with some values finally overwriting the VIC II registers, which lead to the blue screen. Should be fixed now. BTW: While compiling the code somehow helps, it greatly suffers from the "power of..." calculation in line 1410, which is slow no matter what you do.

EgonOlsen71 commented 5 months ago

As a side note...I made a faster version. The interpreted version doesn't benefit that much, because this version introduces a lot of float/int conversions to it. But the compiled one benefits greatly.

sz.zip

viteisele commented 5 months ago

Thank you for the quick correction. I expected that you need longer. Where there really only three hours between taking a look to it and releasing the fix ? Perhaps you can make a video for the youtube channel EORetro ? A title could be "Blue Screen mit C64".

I checked your optimized basic program. But the most performance improvement comes from storing the results of 10^x in an array. But simpler and a little faster is: 1310 gosub 1400: if z-int(z/i)i then return 1400 z=d%(1): i2%=2 1410 z=10z+d%(i2%) 1420 i2%=i2%+1: if i2%<=i goto 1410 1430 return

Much faster is when the three subroutines permut, issuper, initial are combined and the number of permutations is reduced. Then it is possible to calculate n=9 without waiting a long time:

superzahl_weitz9.txt

This version needs in the compiled version only 4s. My fastest version with the help of MOSpeed needs 0.65s.

Can you beat it ? ;-)

EgonOlsen71 commented 5 months ago

I don't think so...this looks pretty much as simple as it gets to me. But I have to admit, that I didn't look at the algorithm itself. I just tried to optimize how it was doing things without having a clue about what that actually is, that it's doing. I still have a feeling that it should somehow be possible to optimize this:

if z-int(z/k%)*k% goto 1250

...but I'm not sure...

But yes, it took me only these 3h. To be precise, there was an hour of cycling involved, so it actually took me 2h. I had it tracked down rather quickly and when I looked at the code, it was obvious to me that I tried to do the right thing but simply didn't. It doesn't always go this quickly...:-)

EgonOlsen71 commented 5 months ago

I've updated the compiler again. I've noticed that an optimization that should have been triggered by 10*z for some bizarre reason didn't. It's fixed now, so the result is slightly below 4s with that properly working.

EgonOlsen71 commented 5 months ago

3.4s by avoiding one integer->float conversion and one array access.

supiizahl.txt

viteisele commented 5 months ago

MOSpeed confuses me: this should be faster (one assignment fewer in line 1230)

supiizahl2.txt

With Commodore Basic this is true, but not with MOSpeed (3.55 s)

EgonOlsen71 commented 5 months ago

Yes, I tried that variant as well and noticed that it's slightly slower. I guess the optimizer optimizes this version somehow less than the other one. I don't consider this to be a problem, but I'll have a look to see what's different anyway.

EgonOlsen71 commented 5 months ago

The generated code for d%(k%)=i% is using integers only while the code for d%(k%)=d%(t%) was using an intermediate float value which got converted back and forth. It did so explicitly, because there's a comment in the code that this is "...still semi-optimal...but anyway". I remember dimly, that I couldn't do any better at that stage in the optimizing process but I obviously didn't notice (or didn't bother...) that I could do it later. So I now did... Now, both versions perform the same, around 3.2s. That's because the time has been spend on the conversion not on the actual assignment.

viteisele commented 5 months ago

The optimisation of MOSpeed seems to be very sophisticated.

I combined 3 lines to one long line => 3.05s:

supiizahl2o.txt

With the first program in this issue with n=9 Commodore Basic needs 1,4 days (!!) and MOSpeed version nearly 19h for the 362880 permutations with 594095 calls of the subroutine initial. With 3.05s the last version is more than 39000 times (resp. 22000 times) faster. So it is nearly so fast than the Python program on PC of Prof. Weitz with 64-bit prozessor, FPU and some GHz. Not bad for a 40 year old C64 (8-bit processor and 1 MHz) and our optimization :-)

Two additional remarks:

  1. In this case MOSpeed is appr. 1s faster than bbcompiler with options (4.167s). Congratulations !
  2. When looking whether I can optimize my fastest version I saw a bug in my code. After fixing it my fastest version is now slower with 1.03s.