Closed jbms closed 1 year ago
To me this sounds like a new codec called permutedims
which is reminiscent of a julia function of the same name. There's also a MATLAB function permute
and a R
function aperm
.
From your example, it does seem like that use of multiple transpose
codecs in a row should not be permitted and perhaps that it should only be applied first.
To me this sounds like a new codec called
permutedims
which is reminiscent of a julia function of the same name. There's also a MATLAB functionpermute
and aR
functionaperm
.From your example, it does seem like that use of multiple
transpose
codecs in a row should not be permitted and perhaps that it should only be applied first.
To be clear, the existing spec already allows an explicit permutation, I am just proposing to not allow "C" and "F" as aliases.
I would agree that permutedims would be a fine name, but happened to choose transpose to match the numpy function.
While applying the transpose codec multiple times consecutively could always be equivalently represented as a single application of transpose with the composed permutation, and implementations might indeed wish to perform that optimization, I don't think it would be helpful to actually disallow specifying transpose multiple times; that might reasonably occur if an automated tool modifies the metadata and doesn't handle composing the transposes itself. In any case even if multiple consecutive transposes are disallowed the issue can still occur if there are other codecs in between, such as delta encoding, sharding, etc.
After giving it some thought, I agree to this change. In fact, I also had some headaches to make F
and C
aliases work in zarrita. Allowing just the array notation would make it clearer. I'm fine with sticking with transpose
as a name for the codec.
The "C" and "F" constants were added to emphasize the relationship with the zarr v2
order
metadata field. However, these contants are somewhat confusing:"C" means no-op and therefore never needs to be used
"F" means reverse dimension order. If there are no other changes to the dimesnion order, this does indeed result in Fortran-order storage, but if there are other changes to the dimension order, that is not the case. For example, in the following:
or
the storage order is actually "F", while in the following:
the storage order is actually "C".