Open ben-albrecht opened 5 years ago
This came up in the development of #12027.
This sounds more like a(n aliasing) reshape than a reindex to me...? Or, if it was always going to return a 1D array (which it looks like ravel does?) then I might just call it "flatten" or something like that?
Also, I'm noting that ravel does something about the order in which things are linearized, but if you're assuming the same array memory is being used, it seems like we wouldn't need or want to do that (i.e., if you cared about the linearization, you'd specify that in the multidimensional array's layout not in the flattening since anything other than what it was would require moving the data around?). Does that seem right?
This sounds more like a(n aliasing) reshape than a reindex to me...?
It's not clear to me if a reshape-by-reference is a better fit for this than a reshaped-reindex. I don't have a strong preference between the two - I just considered the latter first.
Or, if it was always going to return a 1D array (which it looks like ravel does?) then I might just call it "flatten" or something like that?
We don't necessarily have to limit ourselves to the 1D case like ravel does. I bet there's some use-case where mapping the indices of a 1D array to an ND array is needed.
I personally think it'd be fine for this functionality to be contained with reindex
or reshape
rather than introducing a new helper routine.
While we're using numpy as a design model, I'll mention that there is also a numpy.flatten
which is like ravel
, but always returns a copy of the array - something we can already do with reshape()
today.
if you cared about the linearization, you'd specify that in the multidimensional array's layout not in the flattening since anything other than what it was would require moving the data around?
Right. That's what I meant by this statement:
We could alternatively encode the interpretation of order into the domain map.
.. which seems reasonable to me.
As an addition to this, we might consider providing an argument (like python does) to override the default order (which would be determined from the array layout), e.g. a users wants to linearize as column-major order, but the array is row-major order in memory.
we might consider providing an argument (like python does) to override the default order (which would be determined from the array layout), e.g. a users wants to linearize as column-major order, but the array is row-major order in memory.
Just to make sure I'm understanding: The impact of this would be to shuffle the elements in the original multidimensional array so that we could continue using the same memory (i.e., A[i,j] would become A[j,i] after this call was made for the 2D array in which we changed between CMO and RMO) as opposed to creating an array copy in this case?
Just to make sure I'm understanding: The impact of this would be to shuffle the elements in the original multidimensional array so that we could continue using the same memory (i.e., A[i,j] would become A[j,i] after this call was made for the 2D array in which we changed between CMO and RMO) as opposed to creating an array copy in this case?
I was thinking that the original array's memory would remain untouched, and that the reindex gives a "linearized as if CMO" or "linearized as if RMO" view of the array, depending on the argument. Here's an example:
// Pretend that Chapel supports 2d literals for a moment
var A = [[1,2,3],
[4,5,6]];
// A is RMO
writeln(A[1,2]);
// 2
ref rowA = A.reindex(1..); // defaults to order=Order.Row, due to A's layout being RMO
ref colA = A.reindex(1.., order=Order.Column);
// A is still RMO
writeln(A[1,2]);
// 2
writeln(rowA);
// [1 2 3 4 5 6]
writeln(colA);
// [1 4 2 5 3 6]
Oh, I see... So the expectation is that when indexing into the 1D view of the array, the descriptor resulting from this operation would do math on the index to transform it into a multidimensional index for the original array. I'd been imagining that we would simply create a default rectangular array that aliased the multidimensional array's element buffer and use normal/simple 1D indexing to access them.
Given that clarification, I agree with your original intuition that this is more like reindexing than it is reshaping (since reindexing also does transformations on indices to turn new ones into original ones).
[Unfortunately for you, this type of reindexing also makes me want to run away screaming due to PTSD... :) ]
I bet there's some use-case where mapping the indices of a 1D array to an ND array is needed.
FWIW, in my use cases, I often need to express a chunk of memory (allocated by an array in a user-defined type) as 1D or 2D arrays in different routines, such that:
-- In 1D case, I want to access that array as x( 1 .. Ndof ) (where Ndof = number of degrees of freedom) in a general way, so that this is not a problem/system-specific (i.e., a general N-dimensional problem);
-- At the same time, I want access the same chunk of memory as coord( 1 .. 3, 1.. Natoms ), where Natoms is the number of atoms (or particles) to compute the potential energy conveniently in some routine (e.g., using a pairwise-additive force field).
This is straightforward in Fortran because there is a rank-remapping array pointer (which can create an array view with a different rank from the array pointed). I usually wish to distribute a higher level of objects (e.g., an entire "molecular system" that contains many molecules) rather than the above array itself (e.g., which represents a single molecule inside the entire system), so the above array is intended to be local.
Just as one example case...
I'm looking for functionality in Chapel similar to
numpy.ravel
, where a multi-dimensional array is flattened into a 1D array, which still references the elements of the original multi-dimensional array.It seems like this functionality should fall out of
reindex
, butreindex
currently restricts the new index set to maintain the same rank as the original domain. If this restriction were lifted, it might look something like this:This would be nicer to work with if
reindex
supported unbounded ranges (https://github.com/chapel-lang/chapel/issues/7715) :There is an open question about how to choose the order of the transformation, i.e. row-major vs. column-major.
numpy
handles this with an optional arg, which could work for us as an enum argument:We could alternatively encode the interpretation of order into the domain map. However, supporting an argument in the transformation seems more versatile.
Somewhat related: #9693