v923z / micropython-ulab

a numpy-like fast vector module for micropython, circuitpython, and their derivatives
https://micropython-ulab.readthedocs.io/en/latest
MIT License

[BUG] Crash in array creation with non-matching ranges #584

Open jepler opened 1 year ago

jepler commented 1 year ago

Describe the bug

Constructing an array from ranges of different lengths crashes MicroPython. The ulab version is:

MicroPython v1.19.1-837-g67fac4ebc on 2023-01-24; linux [GCC 4.2.1] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import ulab
>>> ulab.__version__
'6.0.5-2D-c'

To Reproduce

from ulab import numpy as np
a = np.array([range(0, 3),range(2**16, 2**26)])

MicroPython crashes (with a segmentation fault in this case).

Expected behavior

An exception is raised. Here's the error that numpy raises, for reference:

>>> a = np.array([range(0, 3),range(2**16, 2**26)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

Additional context

Found by fuzzing, not from real code.
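
The expected behavior described above amounts to checking that every row has the same length before any element is stored. A minimal Python sketch of such a check follows. The name `check_rectangular` is hypothetical and is not part of ulab or numpy; ulab's real constructor is written in C, so this only illustrates the idea.

```python
def check_rectangular(rows):
    """Return (n_rows, n_cols), or raise ValueError if the row lengths differ.

    Hypothetical helper for illustration only; not a ulab or numpy function.
    """
    rows = list(rows)
    if not rows:
        return (0, 0)
    expected = len(rows[0])
    for index, row in enumerate(rows):
        # len() of a range is computed arithmetically, so the huge range is never iterated.
        if len(row) != expected:
            raise ValueError(
                "row %d has length %d, expected %d" % (index, len(row), expected)
            )
    return (len(rows), expected)


print(check_rectangular([range(0, 3), range(3, 6)]))  # (2, 3)

try:
    check_rectangular([range(0, 3), range(2**16, 2**26)])
except ValueError as exc:
    print("rejected:", exc)
```

Because the check only compares lengths, it rejects the input from this report without allocating or writing anything.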

v923z commented 1 year ago

I'm not quite sure I see what the problem is. On the one hand,

>>> np.array([range(0, 3), range(0, 5)])
<stdin>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences 
(which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. 
If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
array([range(0, 3), range(0, 5)], dtype=object)

while on the other,

a = np.array([range(0, 3),range(2**16, 2**26)])

is simply killed in my case:

>>> np.array([range(0, 3), range(2**16, 2**26)])
Killed

jepler commented 1 year ago

np.array([range(0, 3), range(0, 5)]) is probably also misbehaving under ulab; it just doesn't lead to an immediate crash. For instance, in this session something goes badly wrong after the bad array constructor call, though it's not entirely clear what. (I think the second, larger range() leads to stores at inappropriate locations in the heap.)

MicroPython v1.19.1-837-g67fac4ebc on 2023-01-24; linux [GCC 4.2.1] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import gc
>>> from ulab import numpy as np
>>> gc.collect()
21
>>> l1 = [None] * 1024
>>> l2 = [None] * 1024
>>> del l1
>>> gc.collect()
38
>>> np.array([range(3), range(4096)])
array([[0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0]], dtype=float64)
>>> l2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'l2' isn't defined
>>> dir()
[Segmentation fault]

jepler commented 1 year ago

The second range is big enough that it exceeds the micropython heap and accessing beyond it has the happy side effect of crashing promptly.
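
The guess above (stores landing outside the array's memory) can be pictured with a toy model in plain Python. It assumes, without confirmation from the ulab source, that the constructor sizes its buffer from the first row and then writes every element of every row with no bounds check. `simulate_unchecked_fill` is a hypothetical name; it counts the writes that would fall outside the buffer instead of performing them.

```python
def simulate_unchecked_fill(rows):
    """Toy model: buffer sized from the first row, rows copied without a length check.

    An assumption about the failure mode, not ulab's actual implementation.
    """
    n_cols = len(rows[0])
    capacity = len(rows) * n_cols      # slots that were actually allocated
    buffer = [0.0] * capacity
    out_of_bounds = 0                  # writes that would land past the buffer
    position = 0
    for row in rows:
        for value in row:              # note: never compared against n_cols
            if position < capacity:
                buffer[position] = float(value)
            else:
                out_of_bounds += 1     # in C this would overwrite neighbouring heap memory
            position += 1
    return buffer, out_of_bounds


buffer, overflow = simulate_unchecked_fill([range(3), range(4096)])
print(buffer)    # [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
print(overflow)  # 4093
```

Under this assumption the visible array matches the session output (two rows of 0.0, 1.0, 2.0), while 4093 further values would have been written past its end, which would be consistent with `l2` disappearing afterwards.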

v923z commented 1 year ago

I tried to run your example in numpy; that's where it was killed.

v923z commented 1 year ago

MicroPython v1.19.1-837-g67fac4ebc on 2023-01-24; linux [GCC 4.2.1] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import gc
>>> from ulab import numpy as np
>>> gc.collect()
21
>>> l1 = [None] * 1024
>>> l2 = [None] * 1024
>>> del l1
>>> gc.collect()
38
>>> np.array([range(3), range(4096)])
array([[0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0]], dtype=float64)

This is clearly wrong. Interestingly, it doesn't really matter what you specify as the second range: the result is always a 2x3 matrix (two rows of three elements).
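
A small regression check for this behavior could look like the sketch below. `rejects_ragged` and `strict_constructor` are hypothetical names introduced here; the constructor is passed in as an argument so the snippet runs without ulab. Which exception type a fixed ulab would raise is an assumption, so both `ValueError` and `TypeError` are accepted.

```python
def rejects_ragged(make_array):
    """Return True if make_array raises ValueError or TypeError for rows of unequal length."""
    try:
        # Small ranges on purpose: on an affected build a large second row may corrupt the heap.
        make_array([range(3), range(5)])
    except (ValueError, TypeError):
        return True
    return False


def strict_constructor(rows):
    """Stand-in constructor for demonstration only; pass np.array on a real build."""
    rows = [list(row) for row in rows]
    if any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("rows have different lengths")
    return rows


print(rejects_ragged(strict_constructor))                       # True
print(rejects_ragged(lambda rows: [list(r) for r in rows]))     # False
```

On a build with the behavior shown above, `rejects_ragged(np.array)` would be expected to return False, since the call returns a 2x3 array instead of raising.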