nttcslab-sp / kaldiio

A pure python module for reading and writing kaldi ark files

Error reading from absolute Windows path #52

Closed horsti371 closed 3 years ago

horsti371 commented 3 years ago

Hi, thank you for your great tool! Unfortunately, I have issues reading Kaldi matrices / archives from absolute Windows paths, e.g.

my_mat = kaldiio.load_mat(r"C:\temp\my.mat")

The following error occurs:

  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\kaldiio\matio.py", line 232, in load_mat
    ark, offset, slices = _parse_arkpath(ark_name)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\kaldiio\matio.py", line 275, in _parse_arkpath
    offset = int(offset)
ValueError: invalid literal for int() with base 10: '\\temp\\my.mat'

That happens because the absolute path is split at ":" to separate the path from the offset used in Kaldi archives. In this example, the part after the drive letter, "\temp\my.mat", is of course not a valid integer offset. Changing the working directory with os.chdir and passing only the filename works, but that is a poor solution. Is there a better workaround for this (or am I using it wrong)?
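To make the failure concrete, here is a minimal sketch of the splitting problem and one possible drive-letter-aware alternative. Both `naive_split` and `split_ark_path` are hypothetical helpers written for illustration; they are not kaldiio's actual `_parse_arkpath`, and they ignore any slice suffix that the real function also returns.

```python
import re

# Hypothetical pattern: a single drive letter, a colon, then a path separator.
_DRIVE_RE = re.compile(r"^[A-Za-z]:[\\/]")


def naive_split(ark_name):
    """Illustrate the failure: everything after the first ':' is treated as the offset."""
    path, offset = ark_name.split(":", 1)
    return path, int(offset)  # raises ValueError for r"C:\temp\my.mat"


def split_ark_path(ark_name):
    """Split 'path[:offset]' while keeping a leading Windows drive letter with the path."""
    drive, rest = "", ark_name
    if _DRIVE_RE.match(ark_name):
        # Set the drive prefix ("C:") aside so its colon is not mistaken for the separator.
        drive, rest = ark_name[:2], ark_name[2:]
    if ":" in rest:
        path, offset = rest.rsplit(":", 1)
        return drive + path, int(offset)
    return drive + rest, None
```

With this sketch, `split_ark_path(r"C:\temp\my.mat")` keeps the whole path and reports no offset, while `split_ark_path(r"C:\temp\feats.ark:1234")` still separates the offset. A relative path whose first component is a single letter followed by a colon and a separator would be ambiguous under this heuristic.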

kamo-naoyuki commented 3 years ago

I see, thanks. I hadn't considered Windows, but supporting it would be better for usability. I'll think about how to do it.

As a workaround:

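from kaldiio.matio import read_kaldi

# Open in binary mode ("rb"); binary archives cannot be read through a text-mode handle.
with open(r"C:\temp\my.mat", "rb") as f:
    my_mat = read_kaldi(f)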
horsti371 commented 3 years ago

Thanks, that works, as long as binary archives are opened in binary read mode ("rb").