prosodylab / Prosodylab-Aligner

Python interface for forced audio alignment using HTK and SoX
http://prosodylab.org/tools/aligner/
MIT License

Using existing phoneme model eng.zip to align .lab and .wav files #71

Closed: jamoonie94 closed this issue 4 years ago

jamoonie94 commented 6 years ago

Hey there,

I am trying to align two pairs of words with the existing phoneme model, using the command:

python3 -m aligner -r eng.zip -a /Users/Jamoonie/Desktop/IBBME/PRISM/audio-20180126/forced_alignment -d eng.dict

But I am being confronted with the following error:

Traceback (most recent call last):
  File "/Users/Jamoonie/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/Jamoonie/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/__main__.py", line 126, in <module>
    corpus = Corpus(args.align, opts)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/corpus.py", line 91, in __init__
    self._prepare_audio(audiofiles)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/corpus.py", line 205, in _prepare_audio
    Fs = WavFile.samplerate(audiofile)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/wavfile.py", line 48, in samplerate
    with wave.open(filename, "r") as source:
  File "/Users/Jamoonie/anaconda3/lib/python3.6/wave.py", line 499, in open
    return Wave_read(f)
  File "/Users/Jamoonie/anaconda3/lib/python3.6/wave.py", line 163, in __init__
    self.initfp(f)
  File "/Users/Jamoonie/anaconda3/lib/python3.6/wave.py", line 143, in initfp
    self._read_fmt_chunk(chunk)
  File "/Users/Jamoonie/anaconda3/lib/python3.6/wave.py", line 260, in _read_fmt_chunk
    raise Error('unknown format: %r' % (wFormatTag,))
wave.Error: unknown format: 3

I am unsure what to do. Any help would be much appreciated.

jamoonie94 commented 6 years ago

I actually changed the bit depth of the .wav files to 16 bit, and a different error occurred:

Resampling '/Users/Jamoonie/Desktop/IBBME/PRISM/audio-20180126/forced_alignment/BLUE_r150.wav'.
Traceback (most recent call last):
  File "/Users/Jamoonie/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/Jamoonie/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/__main__.py", line 126, in <module>
    corpus = Corpus(args.align, opts)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/corpus.py", line 91, in __init__
    self._prepare_audio(audiofiles)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/corpus.py", line 210, in _prepare_audio
    w.resample_bang(self.samplerate)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/wavfile.py", line 79, in resample_bang
    self.signal = self._resample(Fs_out)
  File "/Users/Jamoonie/Documents/Prosodylab-Aligner/aligner/wavfile.py", line 72, in _resample
    resampled_signal = resample(self.signal, ratio * len(self))
  File "/Users/Jamoonie/anaconda3/lib/python3.6/site-packages/scipy/signal/signaltools.py", line 1889, in resample
    Y = zeros(newshape, 'D')
TypeError: 'float' object cannot be interpreted as an integer

kylebgorman commented 6 years ago

The first error suggests that Python's wave file reader was unable to parse one or more of your audio files. I would check to make sure that all files labeled *.wav in your forced_alignment directory are in fact wav files first.

Please feel free to post the second error you encountered after changing the wav files' bit depth.
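One way to run the check suggested above is to try opening every file with Python's standard `wave` module, which is the call the traceback shows failing. This is only a minimal sketch and not part of the aligner; the function name `find_unreadable_wavs` is hypothetical. As background, format tag 3 in a WAV header denotes IEEE floating-point samples rather than integer PCM, which would be consistent with the error going away after a conversion to 16 bit.

```python
import wave
from pathlib import Path


def find_unreadable_wavs(directory):
    """Return (filename, error message) pairs for .wav files `wave` cannot open.

    Hypothetical helper for illustration; it only reads file headers.
    """
    problems = []
    for path in sorted(Path(directory).glob("*.wav")):
        try:
            with wave.open(str(path), "rb") as source:
                source.getframerate()  # succeeds only if the header parsed
        except (wave.Error, EOFError) as err:
            problems.append((path.name, str(err)))
    return problems
```

Calling `find_unreadable_wavs("forced_alignment")` on the data directory should list any file that would trigger the first traceback, along with the reason `wave` gave.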

damiancruse commented 5 years ago

Hi, I am having the same error as above (the one that appeared after they changed the bit depth).

The command I'm using is:

python3 -m aligner -r eng.zip -a /Users/crused/Dropbox/MATLAB/Narrative/20k\ test -d eng.dict

...which gives this output:

Resampling '/Users/crused/Dropbox/MATLAB/Narrative/20k test/20000_1.wav'.
Traceback (most recent call last):
  File "/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/crused/Dropbox/MATLAB/Narrative/Prosodylab-Aligner/aligner/__main__.py", line 126, in <module>
    corpus = Corpus(args.align, opts)
  File "/Users/crused/Dropbox/MATLAB/Narrative/Prosodylab-Aligner/aligner/corpus.py", line 91, in __init__
    self._prepare_audio(audiofiles)
  File "/Users/crused/Dropbox/MATLAB/Narrative/Prosodylab-Aligner/aligner/corpus.py", line 210, in _prepare_audio
    w.resample_bang(self.samplerate)
  File "/Users/crused/Dropbox/MATLAB/Narrative/Prosodylab-Aligner/aligner/wavfile.py", line 79, in resample_bang
    self.signal = self._resample(Fs_out)
  File "/Users/crused/Dropbox/MATLAB/Narrative/Prosodylab-Aligner/aligner/wavfile.py", line 72, in _resample
    resampled_signal = resample(self.signal, ratio * len(self))
  File "/anaconda3/lib/python3.7/site-packages/scipy/signal/signaltools.py", line 2221, in resample
    Y = zeros(newshape, 'D')
TypeError: 'float' object cannot be interpreted as an integer

Please help, thanks!

kylebgorman commented 5 years ago

I believe this is an issue with the library used for resampling, which is not mine and which is beyond my control. To avoid it, just resample ahead of time using the included script. It's documented in the README.
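The README's script is the documented route and is not reproduced here. As a rough alternative sketch, assuming the `sox` binary is installed and on the PATH, the same pre-resampling step could be driven from Python. The function names are hypothetical, and the target sample rate is left as a parameter because this thread does not state the rate the model expects.

```python
import shutil
import subprocess
from pathlib import Path


def sox_resample_command(source, destination, samplerate):
    """Build a SoX command writing `source` to `destination` at `samplerate` Hz, 16 bit."""
    # Format options placed before the output file apply to the output.
    return ["sox", str(source), "-r", str(samplerate), "-b", "16", str(destination)]


def resample_directory(in_dir, out_dir, samplerate):
    """Resample every .wav in `in_dir` into `out_dir` (hypothetical helper)."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(Path(in_dir).glob("*.wav")):
        command = sox_resample_command(source, out_dir / source.name, samplerate)
        subprocess.run(command, check=True)
```

If the files already match the rate the aligner wants, it should have no reason to resample them itself, so the failing scipy call is never reached.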

damiancruse commented 5 years ago

Thanks for the quick reply. I did just that and it solved the problem. Thanks again - very useful piece of software!

iskunk commented 4 years ago

I believe the TypeError: 'float' object cannot be interpreted as an integer error can be fixed with the following edit:

diff --git a/aligner/wavfile.py b/aligner/wavfile.py
index 50e8588..c4b869c 100644
--- a/aligner/wavfile.py
+++ b/aligner/wavfile.py
@@ -69,7 +69,7 @@ class WavFile(object):

     def _resample(self, Fs_out):
         ratio = Fs_out / self.Fs
-        resampled_signal = resample(self.signal, ratio * len(self))
+        resampled_signal = resample(self.signal, int(ratio * len(self)))
         return resampled_signal

     def resample(self, Fs_out):

The second parameter is supposed to be an int (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample.html).
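To illustrate why the unpatched line fails, here is a small standard-library sketch. In Python 3, `/` is true division and always returns a float, so `ratio * len(self)` is a float even when it happens to be a whole number. The traceback shows scipy passing that count on to `zeros(newshape, 'D')`, where it is rejected as an array dimension. `operator.index` is used here only as a stand-in that raises the same message; the sample counts and rates are made-up values.

```python
import operator

signal_length = 44100        # hypothetical: one second of audio at 44.1 kHz
ratio = 16000 / 44100        # true division: always a float in Python 3
new_length = ratio * signal_length

print(type(new_length).__name__)  # the product is a float, not an int

try:
    # operator.index() accepts only true integers, like an array dimension.
    operator.index(new_length)
except TypeError as err:
    print(err)

# Converting explicitly, as in the patch, yields a usable integer count.
fixed_length = int(new_length)
print(operator.index(fixed_length))
```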

kylebgorman commented 4 years ago

I see! Do we know for sure it should be floored and not ceilinged or rounded?

PR is welcomed; I’m busy with jury duty this week.

iskunk commented 4 years ago

You tell me :-) I was only concerned with the argument type.

(I would guess rounding is preferable, to minimize error)

kylebgorman commented 4 years ago

I usually assume flooring in signal processing; let's try that first?

iskunk commented 4 years ago

I mean, it works either way, not least because the difference is literally one sample. The question is really which way is more (theoretically) correct.

Anyway, int() is effectively floor'ing in this case. Should I submit this as a PR?
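A quick sketch of the options being discussed, using only the standard library. The helper name `candidate_lengths` and the example numbers are hypothetical; the point is just that `int()` truncates toward zero, which matches `math.floor` for non-negative values, and that floor and ceiling differ by at most one sample.

```python
import math


def candidate_lengths(n_samples, fs_in, fs_out):
    """Return the output length under each rounding rule (illustrative helper)."""
    ratio = fs_out / fs_in
    exact = ratio * n_samples  # generally not a whole number
    return {
        "exact": exact,
        "int": int(exact),           # truncates toward zero
        "floor": math.floor(exact),
        "ceil": math.ceil(exact),
        "round": round(exact),
    }


lengths = candidate_lengths(1000, 44100, 16000)
print(lengths["int"], lengths["floor"])   # 362 362
print(lengths["ceil"], lengths["round"])  # 363 363
```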

kylebgorman commented 4 years ago

Let's try that, yes please.

iskunk commented 4 years ago

All right, I've submitted PR #77 for this.