Off-by-one error in array indexing

schlatterbeck commented 2 years ago

As suggested in private email I'm trying to debug a basic program using the python debugger. When I use an array produced with DIM, I'm getting an off-by-one error when displaying the array using self._memory.arrays.to_list Consider the following test program:

1 DIM J2(5,2)
2 J2(1,1)=-1
3 J2(1,2)=-1
4 PRINT J2(1,1);J2(1,2)

When I run this, it correctly prints "-1 -1". But in the debugger I get:

python3 -m pdb =pcbasic t.bas --interface=none -d
(Pdb) break /usr/lib/python3/dist-packages/pcbasic/basic/interpreter.py:105, struct.unpack_from('<H', token, 2)[0] == 4
Breakpoint 1 at /usr/lib/python3/dist-packages/pcbasic/basic/interpreter.py:105
(Pdb) c
> /usr/lib/python3/dist-packages/pcbasic/basic/interpreter.py(105)parse()
-> if self.tron:
(Pdb) p self._memory.arrays.to_list (b'J2!')
[[0.0, 0.0], [0.0, -1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
(Pdb) c
-1 -1 
Ok

This sets a conditional breakpoint in the interpreter loop and makes it stop in the Basic program above in line 4. When I display the array using arrays.to_list the python list is off-by one: The values are in the position where the Basic 1-based index would expect them. But I cannot display the value at J2(1,2) in python. I don't know where the to_list method is used so I don't know if this bug has further problems down the road.

Note that the program I'm debugging uses fairly overDIMensioned arrays, so it's not an immediate problem (as long as I remember to use 1-based indexing when debugging in python)

Marrin commented 2 years ago

I'm not sure what you see as an off by one error here? PC-BASIC arrays are 0 based by default unless you use OPTION BASE 1.

schlatterbeck commented 2 years ago

On Tue, Jun 21, 2022 at 03:12:37AM -0700, Marc Rintsch wrote:

I'm not sure what you see as an off by one error here? PC-BASIC arrays are 0 based by default unless you use OPTION BASE 1.

Interesting, when I was doing something in Basic everything was 1-based. Why does my small example code work then? It does access the array J2 produced with DIM at a coordinate that should not exist?

DIM J2(5,2) ... J2(1,2)=-1

The index 2 in the second position should not be allowed if zero-based, then?

Note that I'm running an existing antenna-simulation program called mininec https://github.com/Kees-PA3KJ/MiniNec

which uses 1-based indexing without any option and works fine. I just cannot examine the last dimension of any array with self._memory.arrays.to_list

If the DIM above uses 0-based indexing which would make the array of size 6 X 3 then the self._memory.arrays.to_list at least does not list the last index of any dimension as can be seen from the example I've given.

Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: @.***

Marrin commented 2 years ago

DIM A(2) results in an array with 3 elements, with indices 0, 1, and 2.

From the documentation of DIM:

limit_0, limit_1, ... are numeric expressions that specify the greatest index allowed at that position.

This doesn't change whether OPTION BASE 0 or OPTION BASE 1 was executed.

schlatterbeck commented 2 years ago

I see. self._memory.arrays.to_list does not list the last dimension (the first index 6 and/or the second index 2) then. This means when using the python debugger to debug a Basic program I cannot inspect the last dimension of any array. I would expect

(Pdb) p self._memory.arrays.to_list (b'J2!')
[[0.0, 0.0, 0.0], [0.0, -1.0, -1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

Compare this to the output that is produced, the third dimension in the inner array and the 6th dimension in the outer array are missing.

Marrin commented 2 years ago

I'm using the last version published on the Python Package Index and get a completely empty list in the debugger. It looks a bit cumbersome anyway to set that breakpoint.

I would just extend the BASIC program with a subroutine writing the array contents to screen or into a file. Or define a custom BASIC command.

Marrin commented 2 years ago

It looks like the two range() calls in the following method should go one step further:

https://github.com/robhagemans/pcbasic/blob/0e0a758d7dca91daf6bfe8c6111218f46967a9ae/pcbasic/basic/memory/arrays.py#L305

Calling get() gives the correct values:

(Pdb) p self._memory.arrays.get(b'J2!', [1,1])
b'!'[b'00008081' -1.0]
(Pdb) p self._memory.arrays.get(b'J2!', [1,2])
b'!'[b'00008081' -1.0]

robhagemans commented 2 years ago

Yes, looks like a bug. Thanks!

On Tue, 21 Jun 2022, 12:23 Marc Rintsch, @.***> wrote:

It looks like the two range() calls in the following code should go one dimension further:

https://github.com/robhagemans/pcbasic/blob/0e0a758d7dca91daf6bfe8c6111218f46967a9ae/pcbasic/basic/memory/arrays.py#L305

Calling get() gives the correct values:

(Pdb) p self._memory.arrays.get(b'J2!', [1,1]) b'!'[b'00008081' -1.0] (Pdb) p self._memory.arrays.get(b'J2!', [1,2]) b'!'[b'00008081' -1.0]

— Reply to this email directly, view it on GitHub https://github.com/robhagemans/pcbasic/issues/182#issuecomment-1161613187, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB25RETWNIMLMGETQQJWH6LVQGQ2VANCNFSM5ZLRJMQQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

schlatterbeck commented 2 years ago

@Marrin: Thanks for the workaround how to display the values, I wasn't aware of that usage of the get method yet.

Concerning your comments on debugging: The program I'm debugging is numbered tightly. It's hard to extend and I would have to extend it differently for each aspect I'm debugging (I'm actually reverse-engineering the thing trying to rewrite it in python). So I beg to differ on the use of the debugger :-)

I'll write this up how I'm debugging a Basic program in the python debugger and submit it as a patch, you decide if you want to include it or not, but it might be useful for others. Maybe. :-)

Marrin commented 2 years ago

@schlatterbeck GW-BASIC/PC-BASIC has a RENUMber command.

I guess I would port it to QBasic first to get rid of the unnecessary line numbers and replace the remaining ones with names and then starting to replace GOTOs with IF/ENDIF, and the different loop constructs and then the GOSUBs by SUBs and FUNCTIONs before porting it to some modern language.

schlatterbeck commented 2 years ago

@Marrin: I know about the renumber. But I want to document my nascent python program with the original line number. And most important: I want to check my computation against the original Basic implementation. So porting to a different Basic dialect is out, too. See my patch in #183 for a description on how I'm debugging the program. Maybe you can appreciate some of the beauty of it :-) At least for me it is by far easier to use a familiar tool like pdb than trying to get back into Basic which I left when moving from my Apple ][ to the IBM PC :-)

Marrin commented 2 years ago

Sorry, I can't see any beauty in it and I think it is much more error prone to not convert it into some structured code first without any GOTOs. You'll need to do that in the end anyway.

QBasic is Microsofts next step after GW-BASIC and it is almost 100% backwards compatible, i.e. can run GW-BASIC programs unchanged. After removing unnecessary line numbers that are not a target of a GOTO, GOSUB, etc. one can eliminate GOTOs with multi line IF/ELIF/ELSE/ENDIF and loops. With proper indentation. The remaining line numbers than are subroutines/functions and can be put into SUB/FUNCTIONs. Those conversions should be straight forward, there's very little room for errors that change the meaning of the code.

What about extending BASIC with a _DUMP command?

#!/usr/bin/env python3
from pprint import pprint

import pcbasic

SOURCE = """\
1 DIM J2(5,2)
2 J2(1,1)=-1
3 J2(1,2)=-1:_DUMP("J2!()")
4 PRINT J2(1,1);J2(1,2)
"""

class Session(pcbasic.Session):
    def __init__(self, *args, **kwargs):
        pcbasic.Session.__init__(self, *args, extension=self, **kwargs)

    def dump(self, varname):
        print(varname, ":")
        pprint(self.get_variable(varname))

def main():
    with Session() as session:
        session.execute(SOURCE)
        session.execute("RUN")

if __name__ == "__main__":
    main()

Output (still with the bug in _to_list() of course):

b'J2!()' :
[[0.0, 0.0], [0.0, -1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
-1 -1

One may even start to replace parts of the BASIC source code with functions to replace it, and this way moving the program piece by piece into Python.

Marrin commented 2 years ago

To illustrate the value of a port to QBasic here's the sub routine in line 27 without line numbers and with labels derived from that line numbers:

EvaluateKernel:
  REM ********** KERNEL EVALUATION OF INTEGRALS I2 & I3 **********
  IF K<0 THEN L33
  X3=X2+T*(V1-X2)
  Y3=Y2+T*(V2-Y2)
  Z3=Z2+T*(V3-Z2)
  GOTO L36
L33:
  X3=V1+T*(X2-V1)
  Y3=V2+T*(Y2-V2)
  Z3=V3+T*(Z2-V3)
L36:
  D3=X3*X3+Y3*Y3+Z3*Z3
  REM ----- MOD FOR SMALL RADIUS TO WAVELENGTH RATIO
  IF A(P4)<=SRM THEN D=SQR(D3):GOTO L49
  D=D3+A2
  IF D>0 THEN D=SQR(D)
  REM ----- CRITERIA FOR USING REDUCED KERNEL
  IF I6!=0 THEN L49
  REM ----- EXACT KERNEL CALCULATION WITH ELLIPTIC INTEGRAL
  B=D3/(D3+4*A2)
  W0=C0+B*(C1+B*(C2+B*(C3+B*C4)))
  W1=C5+B*(C6+B*(C7+B*(C8+B*C9)))
  V0=(W0-W1*LOG(B))*SQR(1-B)
  T3=T3+(V0+LOG(D3/(64*A2))/2)/P/A(P4)-1/D
L49:
  B1=D*W
  REM ----- EXP(-J*K*R)/R
  T3=T3+COS(B1)/D
  T4=T4-SIN(B1)/D
  RETURN

And now all the labels within the subroutine replaced by ”structured” code:

EvaluateKernel:
  REM ********** KERNEL EVALUATION OF INTEGRALS I2 & I3 **********
  IF K>=0 THEN
    X3=X2+T*(V1-X2)
    Y3=Y2+T*(V2-Y2)
    Z3=Z2+T*(V3-Z2)
  ELSE
    X3=V1+T*(X2-V1)
    Y3=V2+T*(Y2-V2)
    Z3=V3+T*(Z2-V3)
  END IF
  D3=X3*X3+Y3*Y3+Z3*Z3
  REM ----- MOD FOR SMALL RADIUS TO WAVELENGTH RATIO
  IF A(P4)<=SRM THEN
    D=SQR(D3)
  ELSE
    D=D3+A2
    IF D>0 THEN D=SQR(D)
    REM ----- CRITERIA FOR USING REDUCED KERNEL
    IF I6=0 THEN
      REM ----- EXACT KERNEL CALCULATION WITH ELLIPTIC INTEGRAL
      B=D3/(D3+4*A2)
      W0=C0+B*(C1+B*(C2+B*(C3+B*C4)))
      W1=C5+B*(C6+B*(C7+B*(C8+B*C9)))
      V0=(W0-W1*LOG(B))*SQR(1-B)
      T3=T3+(V0+LOG(D3/(64*A2))/2)/P/A(P4)-1/D
    END IF
  END IF
  B1=D*W
  REM ----- EXP(-J*K*R)/R
  T3=T3+COS(B1)/D
  T4=T4-SIN(B1)/D
  RETURN

schlatterbeck commented 2 years ago

On Tue, Jun 21, 2022 at 09:37:42AM -0700, Marc Rintsch wrote:

Sorry, I can't see any beauty in it and I think it is much more error prone to not convert it into some structured code first without any GOTOs. You'll need to do that in the end anyway.

Hmm, I'm doing just that. But I'm not going to another Basic dialect but to Python. And I don't want to do an intermediate step to yet another language. I need the debugger stuff just to search for bugs in my code (comparing the control flow of the original code to my code). I'm also trying to move all vector computations with multiple variables to a single numpy array. And all variable duos that represent complex numbers to complex number arithmetics.

Consider that (a complex 3d point): 707 X1=0 708 Y1=0 709 Z1=0 710 X2=0 711 Y2=0 712 Z2=0 ... 729 S1=COS(S2) 730 S2=SIN(S2) 731 B1=F3(S1CR(I)-S2CI(I)) 732 B2=F3(S1CI(I)+S2CR(I)) ... 734 X1=X1+KB1CA(L) 735 X2=X2+KB2CA(L) 736 Y1=Y1+KB1CB(L) 737 Y2=Y2+KB2CB(L) 738 Z1=Z1+B1CG(L) 739 Z2=Z2+B2CG(L)

and that: vec = np.zeros (3, dtype = complex) ... kvec2 = np.array ([k, k, 1]) s = np.cos (s2) + 1j np.sin (s2) b = f3 self.current s vec += kvec2 b * wire.dirs

And you immediately see that it's sometimes necessary to verify that code?

And we might want to move the discussion to direct mail: @.***

Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: @.***

robhagemans commented 2 years ago

Fixed with commit 525f9115

robhagemans / pcbasic

Off-by-one error in array indexing #182