[Feature Request] [stdlib] [proposal] Add `Array` type

martinvuyk commented 1 month ago

Review Mojo's priorities

[X] I have read the roadmap and priorities and I believe this request falls within the priorities.

What is your request?

Closes #2804

Since the proposal to rename InlineList to Array fell through (#2773), I thought of adding a dynamic Array that follows most of Python's behavior but lets the developer decide if and how much small buffer optimization to use for DType subtypes.

The current implementation has basically become a user friendly wrapper around SIMD.

The current implementation lets the programmer decide what amount of "capacity" to allocate for the Array. Under the hood it rounds up to the next upper power of two for the underlying SIMD. Though every method has then to take that delta into account (everything is parametrized).

Every operation on the Array is done in a vectorized manner except extending, appending, concatenating, etc. So using array.__contains__ can be a lot faster depending on the hardware it's running on.

You then have calculations like the dot product, cosine between arrays, applying a function to the array, and many other future ease of use features that can be added that vectorize the ops wherever possible.

Arrays of different lengths can interact with each other in many methods where it makes sense (concatenation, appending values from other to self, etc.).

Another important aspect for the future is the ease of use for going List[T] <-> Array[T] taking SBO into account.

It would also be awesome to add some benchmarks.

What is your motivation for this change?

Having stack allocated Arrays for numeric types.

Any other details?

No response

JoeLoser commented 1 month ago

Can you help me understand more about what you mean? Reg-passable types aren't going to be a thing controlled as they are now. When we adopt small buffer optimization to list, it's reasonable for it to just be a parameter (just like how the SSO size could be controlled at the String type parameter level). In which case, there won't be anything concrete to do here for this once all of Gabriel's SSO/SBO work lands.

martinvuyk commented 1 month ago

Sorry I'll try to explain a good use case:

Say you have a char field and you know it's going to be mostly 48 byte strings, so classic 32 byte SBO for string doesn't cut it. You want an array that can handle custom and flexible SBO.

Say you then look at List's comming SBO that lets you add a buffer and when the item extends that buffer beyond capacity you have to deal with two different pointers, one for the SB and the other for the heap. If you have 3 typical lenghts (e.g. 48, 24, 16 bytes) you are dealing with, you will have to choose the biggest size for SBO and eat up the memory cost.

Following the example of having 48, 32, 24, 16 possiblities, say 25% for each. If you had a structure (the proposed Array) that has 5 transitions i.e. Array[UInt8, 16, 8, 48] for which each item is sized with the smallest fitting capacity, you wouldn't waste quite so much memory.

Another thing which is why the original Idea came up, is that there is no clear way to go from a List[Scalar] (other than iterating and loading, and knowing about its existence) to a SIMD vector. If Array could shift to internally using SIMD instead of the current inlineArray and List as heap, and be a user friendly interface for SIMD and vectorized ops.. that would be even better. Is there something like it already in the works? If so I'll just close the PR np

modularml / mojo