nttcslab-sp / kaldiio

A pure python module for reading and writing kaldi ark files

kaldiio.load_mat to contiguous numpy array #49

Closed songtaoshi closed 4 years ago

songtaoshi commented 4 years ago

Hi, I am running into a problem with the numpy array returned by the kaldiio.load_mat function.

The loaded numpy array is not C-contiguous. Is there an optional parameter to load it as a contiguous array?

kamo-naoyuki commented 4 years ago

A Kaldi matrix file (a .mat file in Kaldi) is always C-contiguous. How are you using kaldiio?

songtaoshi commented 4 years ago


Hi, the numpy array loaded by kaldiio.load_mat is a Fortran-contiguous array rather than a C-contiguous one, while many libraries used in deep learning are compiled from C/C++ and expect C-contiguous input.

Is there a solution? From reading the source, the cause seems to be in the open_like_kaldi function in kaldiio.utils, but I am not that familiar with that area.

Thanks for your reply!

kamo-naoyuki commented 4 years ago

Oh, I forgot that Kaldi has four matrix formats: the uncompressed matrix, CM, CM2, and CM3. CM2 and CM3 use an F-contiguous layout.

>>> import numpy as np
>>> import kaldiio
>>> kaldiio.save_mat("a.mat", np.ones((3,3)), compression_method=2)
>>> kaldiio.load_mat("a.mat").flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

Data stored in Fortran (column-major) binary format can't be loaded directly as C-contiguous. You can reorder it the usual numpy way. Of course, the reordering has a cost, since the data must be copied.

np.asarray(array, order='C')
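
As a minimal sketch of that reordering step: the example below uses a Fortran-ordered array built with plain numpy as a stand-in for what kaldiio.load_mat returns on a CM2/CM3 file, so it runs without any Kaldi file. The helper name to_c_contiguous is hypothetical and not part of kaldiio.

```python
import numpy as np

# Stand-in (assumption) for an array returned by kaldiio.load_mat on a
# CM2/CM3 file: a float32 matrix laid out in Fortran (column-major) order.
loaded = np.asfortranarray(np.arange(12, dtype=np.float32).reshape(3, 4))


def to_c_contiguous(array):
    """Hypothetical helper: return a C-contiguous array with the same values.

    np.ascontiguousarray copies only when the input is not already
    C-contiguous, so calling this on C-ordered data adds no copy.
    """
    return np.ascontiguousarray(array)


fixed = to_c_contiguous(loaded)

# Layout before and after the conversion.
print("before:", loaded.flags["C_CONTIGUOUS"], loaded.flags["F_CONTIGUOUS"])
print("after: ", fixed.flags["C_CONTIGUOUS"], fixed.flags["F_CONTIGUOUS"])
```

np.ascontiguousarray(array) behaves like np.asarray(array, order='C') here; either can be applied once, right after loading, so the copy is paid a single time per matrix.
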
songtaoshi commented 4 years ago

Thanks for your reply. So it depends on the format of the loaded mat, and for the F-contiguous formats we have to reorder the array ourselves.