jakobnissen closed 4 years ago
There is a fortran_order flag built into the NumPy data format for column-major order, and NPZ.jl respects it. When saving the data, you just need to make sure the array is in Fortran order (if that's what you want), and NPZ.jl will load it in column-major order. Here is how you save a Fortran-order array:
In [1]: x = np.asfortranarray(np.random.randint(10, size=(3,4)))
In [2]: x.flags['F_CONTIGUOUS'], x.flags['C_CONTIGUOUS']
Out[2]: (True, False)
In [3]: np.save("/tmp/x.npy", x)
In [4]: !head -1 /tmp/x.npy
�NUMPYv{'descr': '<i8', 'fortran_order': True, 'shape': (3, 4), }
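The header printed above can also be read programmatically instead of with head. The following is a minimal sketch, assuming a reasonably recent NumPy where numpy.lib.format exposes read_magic and the read_array_header_* helpers; the function name read_npy_layout and the temporary file names are made up for illustration.

```python
import os
import tempfile

import numpy as np
from numpy.lib import format as npy_format


def read_npy_layout(path):
    """Return (shape, fortran_order) from the header of a .npy file.

    Hypothetical helper for illustration; it only parses the header
    and does not load the array data.
    """
    with open(path, "rb") as fp:
        version = npy_format.read_magic(fp)
        if version == (1, 0):
            shape, fortran_order, _dtype = npy_format.read_array_header_1_0(fp)
        else:
            shape, fortran_order, _dtype = npy_format.read_array_header_2_0(fp)
    return shape, fortran_order


tmpdir = tempfile.mkdtemp()
c_path = os.path.join(tmpdir, "c.npy")
f_path = os.path.join(tmpdir, "f.npy")

x = np.arange(12).reshape(3, 4)        # C-contiguous (row-major) by default
np.save(c_path, x)                     # header records fortran_order = False
np.save(f_path, np.asfortranarray(x))  # header records fortran_order = True

c_layout = read_npy_layout(c_path)
f_layout = read_npy_layout(f_path)
print(c_layout, f_layout)
```

Both files record the same shape; only the fortran_order flag differs, and that flag is what a reader has to go on when deciding how to interpret the raw bytes.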
Note that numpy.load has no special keyword argument to override the fortran_order flag in the data, so I'm not inclined to add one to NPZ.jl.
Often, I write an NxF NumPy array with N observations and F features in row-major order so that each observation is contiguous in memory. When loading it into Julia, I prefer to get an FxN Matrix, so that observations are still contiguous. Currently, this is achievable only by loading the data (which internally transposes it) and then transposing it back, which is inefficient. I propose a keyword such as "keep_contiguity" (or something less terribly named) that loads the array as if it were Fortran-contiguous, even if it actually isn't. I.e., given an NxF NumPy array in C-contiguous order, return an FxN Matrix. I can pitch a PR if you like.
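For context, a similar effect may be achievable on the Python side without any new keyword. This is a sketch of a possible workaround, not the proposed NPZ.jl feature: the transpose of a C-contiguous NxF array is an FxN Fortran-contiguous view of the same memory, and np.save records it with fortran_order set to True. The variable names and sizes are illustrative, and the NPZ.jl side is not exercised here; it relies on the statement in this thread that NPZ.jl respects fortran_order.

```python
import os
import tempfile

import numpy as np
from numpy.lib import format as npy_format

n_obs, n_feat = 5, 3  # illustrative sizes: N observations, F features

# N x F, C-contiguous: each observation (row) is contiguous in memory.
x = np.arange(n_obs * n_feat, dtype=np.int64).reshape(n_obs, n_feat)

# The transpose is an F x N view of the same buffer; no data is copied.
xt = x.T
is_view = np.shares_memory(x, xt)
is_fortran = xt.flags["F_CONTIGUOUS"]

path = os.path.join(tempfile.mkdtemp(), "xt.npy")
np.save(path, xt)

# Inspect what was written: header fields plus the raw data bytes.
with open(path, "rb") as fp:
    version = npy_format.read_magic(fp)
    if version == (1, 0):
        shape, fortran_order, dtype = npy_format.read_array_header_1_0(fp)
    else:
        shape, fortran_order, dtype = npy_format.read_array_header_2_0(fp)
    payload = fp.read()

# The bytes on disk match the original row-major buffer of x.
same_bytes = payload == x.tobytes()
print(shape, fortran_order, same_bytes)
```

If this holds for a given NumPy version, a loader that honors fortran_order should be able to produce an FxN column-major matrix with each observation still contiguous. It does not help when the .npy files already exist in C order, which is the case the proposed keyword would cover.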