v923z / micropython-ulab

a numpy-like fast vector module for micropython, circuitpython, and their derivatives
https://micropython-ulab.readthedocs.io/en/latest
MIT License

[BUG] Crash when allocating extremely large array #576

Closed jepler closed 1 year ago

jepler commented 1 year ago

Describe the bug This was a case of "huh, I wonder" rather than something that actually affects a program I want to run. These behaviors are specific to the 64-bit unix port; different numbers would provoke similar behaviors on 32-bit embedded arm builds.

MicroPython v1.18-169-g665f0e2a6 on 2022-06-15; linux [GCC 10.2.1] version
>>> import ulab
>>> ulab.__version__
'5.0.7-2D-c'
>>> ulab.numpy.ones((1<<31, 1<<31))
Segmentation fault

Expected behavior MemoryError, as it's infeasible to allocate an array with 2**62 elements

Additional context Probably the arithmetic on the array size overflows the value type it's performed on, a common problem in C programs. Another case that gives an odd error (the actual allocation of 0x20000001800000028 [66 bits] probably gets truncated to a requested allocation of 0x1800000028 [the low 64 bits], which fails, but the error message only prints 0x28 [the low 32 bits]):

>>> a = 2147483653
>>> b = 2147483649
>>> np.ones((a,b))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError: memory allocation failed, allocating 40 bytes

The difference may be that not all of the low 64 bits of the size are zero in this case. Different behaviors would be seen on 32-bit platforms, etc.

Another case where allocation of 0x20000000000000028 bytes probably gets turned into an erroneously-successful allocation of 40 bytes:

>>> np.ones((194899806, 189294637612))
array([[1.0, 1.0, 1.0, ..., Segmentation fault
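
For illustration only (this is not ulab code): the wrap-around can be reproduced directly in 64-bit size_t arithmetic with the shapes from the two examples above, assuming each element occupies 8 bytes on the 64-bit unix port:

#include <stdio.h>
#include <stddef.h>

int main(void) {
    /* Shapes from the two failing np.ones() calls above; the true byte
     * counts need 66 bits, so they wrap when computed in 64-bit size_t. */
    size_t bytes1 = (size_t)2147483653u * 2147483649u * 8u;
    size_t bytes2 = (size_t)194899806u * 189294637612u * 8u;
    printf("%#zx\n", bytes1); /* 0x1800000028: a huge request that fails,
                                 and only the low 32 bits (0x28 = 40 bytes)
                                 show up in the error message */
    printf("%#zx\n", bytes2); /* 0x28 = 40: a tiny request that "succeeds" */
    return 0;
}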
v923z commented 1 year ago

I think the culprit is the ndarray_new_ndarray function here: https://github.com/v923z/micropython-ulab/blob/6fcfeda58da8632bb7774858a9bf974afe65d5dd/code/ndarray.c#L612-L616 which assigns to the len member of the ndarray structure: https://github.com/v923z/micropython-ulab/blob/6fcfeda58da8632bb7774858a9bf974afe65d5dd/code/ndarray.h#L141-L152 That member is declared as size_t. Unless we change that, we're going to run into the issue that you raised here.

However, I wonder whether this is a problem on MCUs. I see that more and more people use micropython and ulab on PCs, but do you expect someone to try to allocate GBs of RAM for an application on an MCU?

What we could still do is check the length in the ndarray_new_ndarray function and gracefully bail out if it is too large.
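
A minimal sketch of what such a check could look like; the helper name multiply_size_checked is made up for illustration, and it assumes MicroPython's mp_raise_msg / MP_ERROR_TEXT helpers are available in this translation unit:

#include <stddef.h>
#include <stdint.h>
#include "py/runtime.h"  /* mp_raise_msg, MP_ERROR_TEXT (assumption) */

/* Hypothetical helper: multiply two size_t values and raise MemoryError
 * instead of silently wrapping around on overflow. */
static size_t multiply_size_checked(size_t a, size_t b) {
    if ((b != 0) && (a > SIZE_MAX / b)) {
        mp_raise_msg(&mp_type_MemoryError, MP_ERROR_TEXT("array is too big"));
    }
    return a * b;
}

ndarray_new_ndarray could then accumulate the total length as len = multiply_size_checked(len, shape[i]) over the dimensions, and apply the same check once more when multiplying by the item size before allocating.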