dmitriy-serdyuk / kaldi-python

Python wrappers for Kaldi data
Apache License 2.0

Can this convert numpy ndarrays to kaldi format? #13

Closed: Miail closed this issue 6 years ago

Miail commented 7 years ago

Can this convert numpy.ndarrays into Kaldi-format matrices?

It seems the only example given shows how to convert Kaldi tables to Python.

dmitriy-serdyuk commented 7 years ago

Yes, *MatrixWriter classes can do it. Take a look at an example of usage here.

Miail commented 7 years ago

Ah, nice! This might be a slightly confusing question: should the dimensions be specified beforehand, or can it take a numpy.ndarray of any size and convert it into a Kaldi matrix by itself?

dmitriy-serdyuk commented 7 years ago

Yes, it handles arrays of any shape by itself, so nothing needs to be specified beforehand:

In [1]: import kaldi_io
In [2]: import numpy
In [3]: writer = kaldi_io.BaseFloatMatrixWriter("ark:foo.ark")
In [4]: writer.write('one', numpy.zeros((2, 2)))
In [5]: writer.write('two', numpy.zeros((3, 2)))
In [6]: writer.write('three', numpy.zeros((2, 3)))
In [7]: writer.close()
Out[7]: True
In [8]: reader = kaldi_io.SequentialBaseFloatMatrixReader("ark:foo.ark")
In [9]: reader.next()
Out[9]:
('one', array([[ 0.,  0.],
        [ 0.,  0.]], dtype=float32))
In [10]: reader.next()
Out[10]:
('two', array([[ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.]], dtype=float32))
In [11]: reader.next()
Out[11]:
('three', array([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]], dtype=float32))

Miail commented 7 years ago

I finally got time to look at it. It seems to have problems storing big numpy arrays, so I had to reshape the data to 1-D. What about the .scp file?

dmitriy-serdyuk commented 7 years ago

The problem might be that the big array is not contiguous in memory. Try this.
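
For readers unsure what "not contiguous" means here, the following is a minimal sketch using plain NumPy. It is an assumption that `numpy.ascontiguousarray` is a suitable fix, since the original link target is not preserved. The array names are made up for illustration, and the snippet does not call kaldi_io at all.

```python
import numpy as np

# Hypothetical data: a C-ordered float32 matrix.
big = np.arange(12, dtype=np.float32).reshape(3, 4)

# Transposing returns a view over the same buffer; the view has shape (4, 3)
# but its rows are no longer laid out one after another in memory.
view = big.T
print(view.flags['C_CONTIGUOUS'])

# ascontiguousarray copies into a fresh C-ordered buffer only when needed,
# leaving shape and values unchanged.
fixed = np.ascontiguousarray(view, dtype=np.float32)
print(fixed.flags['C_CONTIGUOUS'], fixed.shape)
```

If contiguity is indeed the cause, passing the converted array (`fixed` in this sketch) to the writer should avoid the need to flatten the data to 1-D.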

This is just a wrapper around Kaldi I/O, so scp files should work just like in Kaldi.
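
As background on what an scp file holds: in Kaldi, a write specifier of the form `ark,scp:foo.ark,foo.scp` writes an archive together with a script file whose lines map each key to a location such as `foo.ark:4`. Whether this wrapper accepts that specifier unchanged is an assumption based on the comment above. The helper below is a hypothetical, standard-library-only sketch for inspecting such lines. It is not part of kaldi_io and does not handle range suffixes.

```python
def parse_scp_line(line):
    """Split one scp line into (key, path, offset); offset is None if absent."""
    # The key is the first whitespace-delimited token; the rest is the location.
    key, rxfilename = line.strip().split(None, 1)
    # An archive entry looks like "path:byte_offset"; split on the last colon.
    path, sep, offset = rxfilename.rpartition(':')
    if sep and offset.isdigit():
        return key, path, int(offset)
    # No numeric offset: the location is a plain file name or a command.
    return key, rxfilename, None


print(parse_scp_line("one foo.ark:4"))
```

A quick look at the parsed keys and offsets can confirm that every utterance written to the archive also has an entry in the index.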

Miail commented 6 years ago

It seems the output file is being stored with 3 columns rather than the number of columns the ndarray has. Is there a way I can fix that?

dmitriy-serdyuk commented 6 years ago

Please provide a minimal example reproducing your issue.

Miail commented 6 years ago

It was an error in my code. Everything works as it should 👍 Thanks for the help!