hameerabbasi closed 5 years ago
Isn't this the goal of uarray?
The issue was that indexing a fixed dimension array usually produces a view, but that of a variable dimension array recomputes the offsets under some conditions. This was a proposed solution.
To me that looks exactly like the existing implementation:
There's a slice stack that is applied on access; I guess that's what you mean by lazy. I'm currently adding indices to that because @dhirschfeld has a use case.
Is applying a slice stack better for cases when you don’t need complete slices?
You need the stack because multiple slices can be applied on top of each other.
y = x[2:10] # Push first slice onto the stack.
z = y[20:30] # Push second slice onto the stack.
z[3] # Apply each slice of the stack and return the element.
You always have to calculate start, stop, and step, even if just to see that an index is actually valid.
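For readers following along, here is a minimal pure-Python sketch of the slice-stack idea for one dimension. It is not xnd's implementation: the class name SliceStack is made up for illustration, and a plain list stands in for the array data. It relies on the fact that slicing a Python range normalises start, stop and step against the current length.

```python
class SliceStack:
    """Hypothetical 1-D container that defers slicing until element access."""

    def __init__(self, data, slices=()):
        self.data = data
        self.slices = tuple(slices)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Push the slice onto the stack; no offsets are computed yet.
            return SliceStack(self.data, self.slices + (key,))
        # Element access: compose the slices in push order. Slicing a range
        # normalises start, stop and step against the current length.
        positions = range(len(self.data))
        for s in self.slices:
            positions = positions[s]
        # Indexing the range raises IndexError for an invalid index.
        return self.data[positions[key]]


x = SliceStack(list(range(100)))
y = x[10:50]   # push first slice
z = y[20:30]   # push second slice
print(z[3])    # same element as list(range(100))[10:50][20:30][3]
```

Even a single element lookup walks the whole stack, which is why the bounds are always computed.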
Can you do the following though:
y = x[1:30, 4:50]
z = y[2:10, 4:20]
z[5, 6]
Yes, of course:
>>> x = xnd([[1,2,3], [4,5,6,7], [8,9,10,11,12]])
>>> y = x[1:2, 0:3]
>>> z = y[:, 0:2]
>>> z[0, 1]
xnd(5, type='int64')
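The same lookup can be mimicked with nested Python lists, as a rough sketch of why ragged arrays need the extra work. RaggedView and resolve are hypothetical names, not part of xnd, and the sketch keeps one slice stack per dimension and handles only the all-slices and all-integers cases.

```python
def resolve(length, stack, index):
    """Map an index through a stack of slices onto a dimension of the given length."""
    positions = range(length)
    for s in stack:
        positions = positions[s]
    return positions[index]  # IndexError if the index is out of bounds


class RaggedView:
    """Hypothetical two-level ragged container with one slice stack per dimension."""

    def __init__(self, rows, outer_stack=(), inner_stack=()):
        self.rows = rows
        self.outer_stack = tuple(outer_stack)
        self.inner_stack = tuple(inner_stack)

    def __getitem__(self, key):
        outer, inner = key
        if isinstance(outer, slice) and isinstance(inner, slice):
            # Push one slice per dimension; nothing is evaluated yet.
            return RaggedView(self.rows,
                              self.outer_stack + (outer,),
                              self.inner_stack + (inner,))
        if isinstance(outer, slice) or isinstance(inner, slice):
            raise TypeError("mixed indexing and slicing is not covered by this sketch")
        row = self.rows[resolve(len(self.rows), self.outer_stack, outer)]
        # The inner stack is resolved against this row's own length,
        # which is the part a fixed-dimension array never has to redo.
        return row[resolve(len(row), self.inner_stack, inner)]


x = RaggedView([[1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11, 12]])
y = x[1:2, 0:3]
z = y[:, 0:2]
print(z[0, 1])  # 5, matching the xnd session above
```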
To others who are reading this: We are talking about ragged arrays; fixed arrays (like NumPy's ndarray or XND's fixed dimension) do not require this additional code.
The fundamental difference between the slice stack and the XNDLazy object seems to me to be that multiple slices on top of each other would need multiple XNDLazy types:
var * lazy * lazy * var * int64
This messes up type->ndim and complicates the code. The slice stack is attached to the variable dimensions with the correct type->ndim.
For indices the story is a bit different: Yesterday I added an Index type (privately) to implement @dhirschfeld's request and also address Eric Wieser's comments about mixed indexing and slicing.
But there can be at most one Index type in a row, because that dimension is eliminated. Even then, it's a bit of a hassle to adjust type->ndim, because now we have a shadow dimension.
Closing, this works now:
>>> x = xnd([[0], [1,2], [3,4,5]])
>>> x[::-1, 0]
xnd([3, 1, 0], type='var * int64')
The internal representation is the existing slice stack together with a new VarDimElem type that stores the index for eliminated dimensions that are preceded by a slice.
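As a loose illustration only, the combination of a slice on the outer dimension and a stored index on the eliminated inner dimension can be imitated with nested lists. PendingIndex and materialize are hypothetical names and say nothing about how VarDimElem is actually laid out in xnd.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingIndex:
    """Hypothetical stand-in for a stored index on an eliminated dimension."""
    value: int


def materialize(rows, outer_slice, pending):
    """Apply an outer slice and a stored inner index to a list of ragged rows."""
    selected = [rows[i] for i in range(len(rows))[outer_slice]]
    # Each row has its own length, so the index is validated per row.
    return [row[range(len(row))[pending.value]] for row in selected]


rows = [[0], [1, 2], [3, 4, 5]]
print(materialize(rows, slice(None, None, -1), PendingIndex(0)))  # [3, 1, 0]
```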
It might be nice to have a lazy XND container (particularly for indexing) instead of an eager one. My idea is the following, given an xnd array:
When we do the following:
y = x[2:4, 1:2]
we should index with the first part immediately but store the second slice as belonging to axis 1 in something like an XNDLazy object. Then, when we do something like the following:
y[1, 0]
it should "dynamically" realise that we're operating on a lazy object and return
xnd(11, type='int64')
applying the slice when the element is needed or accessed.
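A rough sketch of that proposal with nested Python lists might look like the following. LazyRows and lazy_index are hypothetical names, and the sample data is chosen for illustration; it is not the array from the original post.

```python
class LazyRows:
    """Hypothetical lazy wrapper: axis 0 is already sliced, axis 1 is pending."""

    def __init__(self, rows, pending):
        self.rows = rows
        self.pending = pending

    def __getitem__(self, key):
        i, j = key
        row = self.rows[i]
        # Apply the stored slice only now, against this row's actual length.
        return row[range(len(row))[self.pending][j]]


def lazy_index(rows, outer, inner):
    """Apply the outer slice eagerly and keep the inner slice for later."""
    return LazyRows(rows[outer], inner)


# Illustrative ragged data (an assumption, not taken from the thread).
x = [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9], [10, 11, 12, 13]]
y = lazy_index(x, slice(2, 4), slice(1, 2))
print(y[1, 0])  # 11
```

Stacking a second slice on y would need another wrapper layer in this design, which is the nesting concern raised in the replies.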