lcompilers / lpython

Python compiler
https://lpython.org/
Other
1.37k stars 157 forks source link

Support constant negative indices in the array #2611

Closed hankluo6 closed 3 months ago

hankluo6 commented 3 months ago

Fixed #2503

Test

from lpython import (i16, i32, Const)
from numpy import empty, int16
dim: Const[i32] = 10

def foo():
    """Negative indices produce random results each run."""
    A: i16[dim] = empty((dim,), dtype=int16)
    ww: i32
    for ww in range(dim):
        A[ww] = i16(ww + 1)
    print(A[0], A[1], A[2], "...", A[-3], A[-2], A[-1])

def bar(dim_: i32):
    """Negative indices always produce zero when 'dim' is a parameter."""
    A: i16[dim_] = empty((dim_,), dtype=int16)
    ww: i32
    for ww in range(dim_):
        A[ww] = i16(ww + 1)
    print(A[0], A[1], A[2], "...", A[-3], A[-2], A[-1])

foo()
bar(10)
(lp) hank@MacBook-Pro lpython % ./src/bin/lpython test.py           
1 2 3 ... 8 9 10
1 2 3 ... 8 9 10

I've introduced a new argument to reference the current subscript index as I need it to obtain the corresponding shape of the dimension. Please let me know if there is a better solution for this.

kmr-srbh commented 3 months ago

@hankluo6 Thanks for working on this! I really appreciate it!

I had previously done some research on this issue to understand the depth of the problem, and found that there is more to it than just the negative indices returning incorrect values. Consider this line instead of the one in foo().

print(A[0], A[1], A[20], "...", A[-300], A[-0], A[-1])

The values passed here are invalid. Note the interesting thing - -0. But it compiles.

(base) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
1 2 -1896 ... 0 1 10

Things change a bit for large positive indices:

print(A[1000], A[1], A[20], "...", A[-300], A[-0], A[-1])
(base) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
0 2 -22504 ... 0 1 10
print(A[10000], A[1], A[20], "...", A[-300], A[-0], A[-1])
(base) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
Bus error (core dumped)
(base) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
Segmentation fault (core dumped)
kmr-srbh commented 3 months ago

The problem does not end here, as it turns out, this is an issue with how we handle subscript indices. The output is correct when we have only list elements, but when something else like a string literal enters, things start to break. For a normal list:

from lpython import i32, list

A: list[i32] = [1, 2, 4, 5]
print(A[0], A[1], A[2], "...", A[-3], A[-2], A[-1])

The output is the same buggy one:

(lp) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
...
kmr-srbh commented 3 months ago

The above testing is done applying the changes made in this PR.

Shaikh-Ubaid commented 3 months ago

The problem does not end here, as it turns out, this is an issue with how we handle subscript indices. The output is correct when we have only list elements, but when something else like a string literal enters, things start to break. For a normal list:

from lpython import i32, list

A: list[i32] = [1, 2, 4, 5]
print(A[0], A[1], A[2], "...", A[-3], A[-2], A[-1])

The output is the same buggy one:

(lp) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
...

It seems this PR tackles negative indices in array. I opened a separate issue for the above https://github.com/lcompilers/lpython/issues/2613.

Shaikh-Ubaid commented 3 months ago

Consider this line instead of the one in foo(). print(A[0], A[1], A[20], "...", A[-300], A[-0], A[-1])

These examples do not run with CPython and throw IndexError, so they are not expected to work with LPython. In LPython we can throw a similar error when index passed is not valid, but that is or can be a separate issue.

Shaikh-Ubaid commented 3 months ago

Please mark as "Ready for review" when ready.

hankluo6 commented 3 months ago

@kmr-srbh Thanks for pointing that out. I think this problem can be resolved once we implement the bound check for the array. https://github.com/lcompilers/lpython/blob/0ba09ab5a0e38432be4d839761d0e92beabca94d/src/libasr/codegen/llvm_array_utils.cpp#L569-L571

hankluo6 commented 3 months ago

I've tried using ArrayBound, but it throws an error when combined with IntegerBinOp, so I used ArraySize, which works as expected.

Shaikh-Ubaid commented 3 months ago

We can tackle the above two things in a separate PR.

certik commented 3 months ago

Thanks. I think this PR is fine. No overhead for positive indices and for compile time negative indices it does what the user would have to do anyway if we didn't support this feature, so I think this is fine.