srvk / DiViMe

ACLEW Diarization Virtual Machine
Apache License 2.0

Noisemes SAD fails due to inability to read HTK file #129

Closed: MedericCar closed this issue 5 years ago

MedericCar commented 5 years ago

I tried running Noisemes SAD on a single file and got this error:

wavs and transcriptions found !
Tests finished
extracting features for speech activity detection
Extracting features for BBC_0004_000060_000180.wav ...
(MSG) [2] in SMILExtract : openSMILE starting!
(MSG) [2] in SMILExtract : config file is: MED_2s_100ms_htk.conf
(MSG) [2] in cComponentManager : successfully registered 96 component types.
(MSG) [2] in cComponentManager : successfully finished createInstances
                                 (19 component instances were finalised, 1 data memories were finalised)
(MSG) [2] in cComponentManager : starting single thread processing loop
(MSG) [2] in cComponentManager : Processing finished! System ran for 12023 ticks.
DONE!
detecting speech and non speech segments
Traceback (most recent call last):
  File "yunified.py", line 162, in <module>
    for feat in readHtk(os.path.join(INPUT_DIR,file), HTK_CHUNKSIZE, preSamples):
  File "G/coconut/fileutils/htk.py", line 49, in readHtk
    data = struct.unpack(">%df" % (chunk_size * sampSize / 4), f.read(chunk_size * sampSize))
struct.error: unpack requires a buffer of 33345000 bytes
finished detecting speech and non speech segments
ls: cannot access /vagrant/data/small_test/hyp_sum/*.lab: No such file or directory

I had not seen this error before, but I had not rebuilt the VM for 3 weeks. The file causing the problem appears to have been changed in srvk/Yunitator's latest commit, so this may be related.
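For context, the `struct.error` in the traceback is what Python raises when `f.read()` returns fewer bytes than the format string asks for, for example when the last chunk of the feature file is shorter than `chunk_size` frames. The following is only a minimal sketch of a chunked HTK reader that sizes the unpack format from the bytes actually read. It is not the code in `htk.py`; the function name `read_htk_chunks` is hypothetical, the `preSamples` argument from the traceback is left out, and it assumes uncompressed big-endian float32 features behind the standard 12-byte HTK header.

```python
import struct

# Standard HTK header: nSamples (int32), sampPeriod (int32),
# sampSize (int16), parmKind (int16), all big-endian.
HTK_HEADER = struct.Struct(">iihh")


def read_htk_chunks(path, chunk_size):
    """Yield lists of frames (tuples of floats), at most chunk_size per list.

    Hypothetical helper for illustration; assumes uncompressed float32 data.
    """
    with open(path, "rb") as f:
        header = f.read(HTK_HEADER.size)
        if len(header) < HTK_HEADER.size:
            raise ValueError("file too short to contain an HTK header")
        n_samples, samp_period, samp_size, parm_kind = HTK_HEADER.unpack(header)
        if samp_size <= 0 or samp_size % 4 != 0:
            raise ValueError("unexpected sampSize: %d" % samp_size)
        dims = samp_size // 4  # number of float32 values per frame

        while True:
            buf = f.read(chunk_size * samp_size)
            # Count only the complete frames that were actually read,
            # so a short final chunk does not break struct.unpack.
            n_frames = len(buf) // samp_size
            if n_frames == 0:
                break
            values = struct.unpack(
                ">%df" % (n_frames * dims), buf[: n_frames * samp_size]
            )
            yield [values[i * dims:(i + 1) * dims] for i in range(n_frames)]
```

Integer division (`//`) is used throughout so that the frame and value counts stay integers under Python 3. Whether a short read was the actual cause here is not confirmed in this thread.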

fmetze commented 5 years ago

Can you try with the most recent version? If you still see this issue, please report your exact command line and make the file available.

MedericCar commented 5 years ago

It works now, thanks!