hltcoe / vaporengine

VaporEngine

Document duration sometimes measured as 0 on Debian 7 system #6

Open charman opened 8 years ago

charman commented 8 years ago

When importing a Corpus using corpus.create_from_zr_output(), the duration of audio documents is sometimes set to 0 on a Debian 7 system, seemingly non-deterministically.

Document duration is set using the following code in corpus.create_from_zr_output():

si = pysox.CSoxStream(document.audio_path).get_signal().get_signalinfo()
length_in_seconds = si['length'] / float(audio_rate * audio_channels)
document.duration = int(length_in_seconds * 100)

When importing a Corpus with 624 documents, the number of documents with a zero-length duration was 351, 356, and 369 across three runs (roughly 56% to 59% of the corpus).
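One way to narrow this down would be to check whether the same documents come back with a zero duration on every run, or whether the affected set changes between runs. The sketch below is only an illustration: the function name `summarize_zero_duration_runs` and the idea of collecting the zero-duration document identifiers per run are assumptions, not part of VaporEngine.

```python
def summarize_zero_duration_runs(runs):
    """Compare zero-duration documents across several import runs.

    `runs` is a list with one collection of document identifiers per run,
    each holding the documents whose duration was measured as 0.
    (Hypothetical helper; not part of VaporEngine.)
    """
    run_sets = [set(run) for run in runs]
    if not run_sets:
        return {"always_zero": set(), "sometimes_zero": set()}
    # Documents that were zero in every run point at the files themselves.
    always_zero = set.intersection(*run_sets)
    # Documents that were zero in only some runs point at a flaky read.
    sometimes_zero = set.union(*run_sets) - always_zero
    return {"always_zero": always_zero, "sometimes_zero": sometimes_zero}


# Usage with made-up identifiers, purely for illustration:
summary = summarize_zero_duration_runs([
    {"doc_a", "doc_b"},
    {"doc_a", "doc_c"},
    {"doc_a"},
])
print(sorted(summary["always_zero"]))
print(sorted(summary["sometimes_zero"]))
```

If most of the affected documents land in `sometimes_zero`, the files themselves are probably fine and the measurement is what varies.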

The audio files are on the local filesystem (i.e. not NFS-mounted).

This problem has not (yet) been encountered on OS X.
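As a cross-check that does not depend on pysox, the duration could be recomputed with Python's standard-library `wave` module and compared against the stored value. This is a minimal sketch that assumes the audio files are uncompressed PCM WAV, which the report above does not state; the function name `wav_duration_centiseconds` is hypothetical. Note that `wave` reports frames (one sample per channel), so the channel count does not appear in the division here.

```python
import wave


def wav_duration_centiseconds(path):
    """Return the duration of a PCM WAV file in hundredths of a second.

    Mirrors the `int(length_in_seconds * 100)` convention used for
    document.duration. (Hypothetical helper; assumes WAV input.)
    """
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()  # frames, not per-channel samples
        frame_rate = wav_file.getframerate()  # frames per second
    length_in_seconds = frame_count / float(frame_rate)
    return int(length_in_seconds * 100)
```

If this helper returns a non-zero value for a file whose document.duration was stored as 0, the file on disk is intact and the problem lies in how the signal information is read at import time.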