scikit-hep / awkward

Manipulate JSON-like data with NumPy-like idioms.
https://awkward-array.org
BSD 3-Clause "New" or "Revised" License

Long time to error on incompatible shapes in numpy-broadcasting #3129

Open nsmith- opened 1 month ago

nsmith- commented 1 month ago

Version of Awkward Array

2.6.4

Description and code to reproduce

When I attempt an invalid broadcast of regular arrays (which follow right-justified, NumPy-style shape broadcasting, as opposed to the left-justified broadcasting used for ragged arrays), I get an error, as I should:

import awkward as ak
import numpy as np

n = 10
a = ak.zip({"a": np.ones((n, 3))}, depth_limit=1)

# raises ValueError: cannot broadcast RegularArray of size 3 with RegularArray of size 10 in multiply
a.a * np.ones(n)
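
To make the right-justified versus left-justified distinction concrete, here is a minimal sketch of the two alignment rules applied to plain shape tuples. It is a simplified model for illustration only, not Awkward Array's implementation, and the helper names (`dims_compatible`, `broadcasts_right_justified`, `broadcasts_left_justified`) are hypothetical.

```python
from itertools import zip_longest


def dims_compatible(x, y):
    # Two dimension lengths can broadcast if one is missing (None),
    # they are equal, or either one has length 1.
    return x is None or y is None or x == y or x == 1 or y == 1


def broadcasts_right_justified(shape_a, shape_b):
    # NumPy-style rule: align the shapes at their last (innermost) dimension.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b))
    return all(dims_compatible(x, y) for x, y in pairs)


def broadcasts_left_justified(shape_a, shape_b):
    # Ragged-style rule: align the shapes at their first (outermost) dimension.
    pairs = zip_longest(shape_a, shape_b)
    return all(dims_compatible(x, y) for x, y in pairs)


n = 10
print(broadcasts_right_justified((n, 3), (n,)))    # 3 is compared with 10
print(broadcasts_right_justified((n, 3), (n, 1)))  # 3 is compared with 1
print(broadcasts_left_justified((n, 3), (n,)))     # 10 is compared with 10
```

Under this model, shapes `(n, 3)` and `(n,)` only line up when aligned from the left, which is why the regular-array multiplication above is rejected.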

The error itself is correct, but the time it takes to be raised scales very non-linearly with n:

import time
import awkward as ak
import numpy as np

for n in np.geomspace(1000, 100_000, 10).astype(int):
    a = ak.zip({"a": np.ones((n, 3))}, depth_limit=1)
    tic = time.monotonic()
    try:
        a.a * np.ones(n)
    except ValueError:
        pass
    toc = time.monotonic()
    print(f"Took {toc-tic:.4f}s for {n=}")

This produces:

Took 0.0022s for n=1000
Took 0.0102s for n=1668
Took 0.0142s for n=2782
Took 0.0401s for n=4641
Took 0.1293s for n=7742
Took 0.3004s for n=12915
Took 0.7906s for n=21544
Took 2.2169s for n=35938
Took 9.6671s for n=59948
Took 58.6853s for n=100000
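
To put a rough number on "non-linear", one option is to estimate the exponent k in time ∝ n^k from the timings above, as the slope of log(time) against log(n). The sketch below only re-uses the printed measurements. The power-law model is an assumption, and single-run timings are noisy, so the exponents are indicative at best.

```python
import math

# (n, seconds) pairs copied from the output above.
timings = [
    (1000, 0.0022),
    (1668, 0.0102),
    (2782, 0.0142),
    (4641, 0.0401),
    (7742, 0.1293),
    (12915, 0.3004),
    (21544, 0.7906),
    (35938, 2.2169),
    (59948, 9.6671),
    (100000, 58.6853),
]


def local_exponents(points):
    # Slope of log(time) versus log(n) between consecutive measurements.
    return [
        math.log(t1 / t0) / math.log(n1 / n0)
        for (n0, t0), (n1, t1) in zip(points, points[1:])
    ]


exponents = local_exponents(timings)

# End-to-end slope between the first and last measurement.
(n_first, t_first), (n_last, t_last) = timings[0], timings[-1]
overall = math.log(t_last / t_first) / math.log(n_last / n_first)

for (n, _), k in zip(timings[1:], exponents):
    print(f"up to n={n}: local exponent {k:.2f}")
print(f"end-to-end exponent {overall:.2f}")
```

On these numbers the end-to-end exponent is roughly 2.2, while the last step (n=59948 to n=100000) gives about 3.5, so the cost grows faster than quadratically at the largest sizes measured.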