jpy-consortium / jpy

Large arrays create incorrect buffers #143

Closed chipkent closed 2 months ago

chipkent commented 2 months ago

This issue was originally noted in https://github.com/deephaven/deephaven-core/issues/5403. There are many details there.

Reproducer:


import jpy
import numpy as np

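# Largest signed 32-bit integer (Java's Integer.MAX_VALUE)
i_max = 2_147_483_647

# Requires the Deephaven classes on the JVM classpath; not used by the tests below
repeat = jpy.get_type("io.deephaven.function.Basic").repeat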

def test_it(name, size, jtype, np_type):
    print(f"TEST: {name} || {jtype} {np_type} {size}")

    narray = np.ones(size, dtype=np_type)
    jarray = jpy.array(jtype, narray)

    try:
        buf = narray.tobytes()
        np.frombuffer(buf, dtype=np_type)
        print("\tPASS - np buffer")
    except Exception as e:
        print("\tFAIL - np buffer")
        print(f"\t\tERROR: {e}")

    try:
        np.frombuffer(jarray, dtype=np_type)
        print("\tPASS - jpy buffer")
    except Exception as e:
        print("\tFAIL - jpy buffer")
        print(f"\t\tERROR: {e}")

test_it("Float64 small", i_max//8, "double", np.float64)
test_it("Float64 large", i_max//8 + 1, "double", np.float64)

test_it("Int64 small", i_max//8, "long", np.int64)
test_it("Int64 large", i_max//8 + 1, "long", np.int64)

Output:

TEST: Float64 small || double <class 'numpy.float64'> 268435455
    PASS - np buffer
    PASS - jpy buffer
TEST: Float64 large || double <class 'numpy.float64'> 268435456
    PASS - np buffer
    FAIL - jpy buffer
        ERROR: offset must be non-negative and no greater than buffer length (-2147483648)
TEST: Int64 small || long <class 'numpy.int64'> 268435455
    PASS - np buffer
    PASS - jpy buffer
TEST: Int64 large || long <class 'numpy.int64'> 268435456
    PASS - np buffer
    FAIL - jpy buffer
        ERROR: offset must be non-negative and no greater than buffer length (-2147483648)

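Buffer creation yields incorrect lengths when an array holds more than MAX_INT/TYPE_SIZE_IN_BYTES elements. In the failing cases above, 268435456 elements of 8 bytes each come to 2147483648 bytes, one more than MAX_INT, and the reported length of -2147483648 is exactly what that value wraps to in a signed 32-bit integer. Most likely a type that is too small is being used to compute the byte length.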

Note that numpy-created arrays of the same size appear to work properly.
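
The arithmetic behind the suspected overflow can be checked without a JVM. This is a minimal sketch, assuming the byte length is computed as element count times element size and stored in a signed 32-bit integer. That is a guess at the cause, not something confirmed from the jpy source. The helper `byte_length_as_int32` is hypothetical and exists only for illustration.

```python
import ctypes

I_MAX = 2_147_483_647   # largest signed 32-bit integer
ITEM_SIZE = 8           # bytes per double / long element


def byte_length_as_int32(n_items, item_size):
    """Hypothetical helper: compute n_items * item_size, then truncate the
    result to a signed 32-bit integer the way a C `int` would store it."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(n_items * item_size).value


small = I_MAX // ITEM_SIZE   # 268435455 elements, the largest passing size
large = small + 1            # 268435456 elements, the smallest failing size

print(byte_length_as_int32(small, ITEM_SIZE))   # 2147483640
print(byte_length_as_int32(large, ITEM_SIZE))   # -2147483648
```

The second value matches the buffer length in the error message, which is consistent with a 32-bit length computation, though it does not show where in jpy the truncation happens.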