prosodylab / Prosodylab-Aligner

Python interface for forced audio alignment using HTK and SoX
http://prosodylab.org/tools/aligner/
MIT License

Aligner trying to make overlapping intervals #59

Closed Kedersha closed 7 years ago

Kedersha commented 7 years ago

Hi Kyle! My friend @mariannehuijsmans is using the aligner for data from Ayajuthem (a Coast Salish language), and we've got it working up to a certain point. We've succeeded in creating aya-mod.zip and have ourselves a big directory of data to be aligned. However, when we go to align this data, we consistently get the following (verbose) output:

Reading aligner from '/Users/mhuijs/Dropbox/QP2/Prosodylab-Aligner/aya-mod.zip'.
Preparing corpus '/Users/mhuijs/Dropbox/QP2/audiofiles'.
Aligning corpus '/Users/mhuijs/Dropbox/QP2/audiofiles'.
Writing TextGrids.
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/mhuijs/Dropbox/QP2/Prosodylab-Aligner/aligner/__main__.py", line 134, in <module>
    size = MLF(aligned).write(args.align)
  File "/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py", line 722, in __init__
    self.read(f, samplerate)
  File "/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py", line 763, in read
    phon.add(pmin, pmax, line[2])
  File "/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py", line 408, in add
    self.addInterval(Interval(minTime, maxTime, mark))
  File "/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py", line 416, in addInterval
    i = bisect_left(self.intervals, interval)
  File "/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py", line 193, in __lt__
    raise(ValueError(self, other))
ValueError: (Interval(5.8300001, 5.9000001, N), Interval(5.9, 5.97, T))

It looks like it's mad about trying to make overlapping intervals, so we tried adjusting the DEFAULT_PRECISION variable in textgrid.py, but that didn't help at all. Any advice? (It's not just one or two problem wav files causing this error, it seems to be all of them.)

kylebgorman commented 7 years ago

Oh right, this bug again, really annoying. I assume it doesn't work if you set DEFAULT_PRECISION to be, like, 4?

If this is still happening, I suppose we took the wrong approach the last time. Instead, we either want to use exact decimals or make the overlap checker weaker; the latter is simpler.
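For illustration, a weaker overlap check might look something like the sketch below. This is not the textgrid library's code: the `Interval` class, the `overlaps` helper, and the `TOLERANCE` value are all hypothetical, and the two intervals are simply the ones reported in the ValueError above.

```python
from dataclasses import dataclass

# Hypothetical tolerance in seconds: boundaries that cross by less than
# this are treated as touching rather than overlapping.
TOLERANCE = 1e-5


@dataclass
class Interval:
    """Toy stand-in for a TextGrid interval (not the library's class)."""
    min_time: float
    max_time: float
    mark: str


def overlaps(a: Interval, b: Interval, tol: float = TOLERANCE) -> bool:
    """Return True only if a and b share more than `tol` seconds."""
    shared = min(a.max_time, b.max_time) - max(a.min_time, b.min_time)
    return shared > tol


n = Interval(5.8300001, 5.9000001, "N")
t = Interval(5.9, 5.97, "T")

strict = overlaps(n, t, tol=0.0)  # exact comparison flags the tiny overlap
lenient = overlaps(n, t)          # tolerant comparison ignores it
print(strict, lenient)
```

The trade-off is that a tolerance hides real overlaps smaller than `tol`, so the value would need to sit well below the shortest interval the aligner can produce.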

Here is a quick hack that should work in the meantime. Find wherever textgrid.py lives (it'll be in one of the directories printed by import sys; print(sys.path)). Replace lines 256-257 with:

return True

(I.e., make it so the overlap test always passes.)
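To check which copy Python actually imports, as opposed to other copies lying around on disk, something like the following Python 3 sketch may help. `locate_module` is a hypothetical helper name, not part of textgrid or the aligner; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util
import sys
from typing import Optional


def locate_module(name: str) -> Optional[str]:
    """Return the file path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Directories searched on import, in order.
for directory in sys.path:
    print(directory)

# The copy of textgrid that this interpreter would actually import.
print(locate_module("textgrid"))
```

For a package, the printed path typically points at its `__init__.py`, and textgrid.py would sit in the same directory. If the module is not installed for this interpreter, the helper prints None. Run it with the same Python that runs the aligner, since each interpreter has its own search path.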

Kedersha commented 7 years ago

Yeah, DEFAULT_PRECISION didn't help us anywhere in the range between 5-50.

Thanks so much for the quick response! @mariannehuijsmans what happens when you try this?

mariannehuijsmans commented 7 years ago

Yes, thank you so much for the quick response! When I went to edit textgrid.py this afternoon, I realized that I have textgrid.py saved in more than one location on my computer. I think we must have been editing a copy saved in a different location last night (not the one that the aligner was using). I tried changing DEFAULT_PRECISION to 4 in the textgrid.py file that the aligner was using (/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py) and the aligner ran! Then I tried it with DEFAULT_PRECISION set to 8 and got the same error message. At 6, the aligner still ran.

Marianne

kylebgorman commented 7 years ago

Alright: trying something here. I just updated the TextGrid library on GitHub so that MLF precision is, by default, just 5 digits---MLFs are what the aligner spits out and TextGrids are then constructed from them. TextGrid precision itself remains higher by default---15---so this ought not to affect any users not using Prosodylab-Aligner or HTK features.

If you could, please destroy every other copy of textgrid.py, grab the module from GitHub, and confirm that takes care of your issue.

Then, if it does, I'll upload the revised textgrid to PyPI so everyone can benefit.

If it doesn't work, let's start a new issue over at the TextGrid GitHub repo (github.com/kylebgorman/textgrid) and I'll take it from there.
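To see why a lower precision helps, here is a minimal sketch of rounding boundaries before checking for overlap. It is not the library's actual implementation: `round_boundaries` and `has_overlap` are hypothetical helpers, and the data are just the two intervals from the ValueError above.

```python
def round_boundaries(intervals, digits):
    """Round every (start, end, label) boundary to `digits` decimal places."""
    return [(round(start, digits), round(end, digits), label)
            for start, end, label in intervals]


def has_overlap(intervals):
    """True if any interval starts before the previous one ends."""
    return any(prev[1] > cur[0] for prev, cur in zip(intervals, intervals[1:]))


# Boundaries as reported in the ValueError above.
raw = [(5.8300001, 5.9000001, "N"), (5.9, 5.97, "T")]

for digits in (4, 5, 6, 8):
    print(digits, has_overlap(round_boundaries(raw, digits)))
```

With these two intervals the check passes at 4, 5, and 6 digits and fails at 8, which is consistent with what Marianne reported for DEFAULT_PRECISION: the stray 1e-7 only survives rounding at 7 or more digits.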

mariannehuijsmans commented 7 years ago

Thanks Kyle! Yes, I ran it with a freshly installed version of textgrid and it worked!

Marianne

kylebgorman commented 7 years ago

Hi Michael, any chance you could update textgrid on PyPI to match HEAD on GitHub? It looks like you were the last person to upload it, so I'm guessing you're the editor of that package on PyPI?

kylebgorman commented 7 years ago

PS, please bump the version number too.

mmcauliffe commented 7 years ago

Sure, I can do that! Version number is based off that in setup.py, so I can do a PR for that, or you can update it more easily. Maybe also we can update the README to specify the easier installation of pip install textgrid with the pip+git for installing the development version? Also, I can add you as an owner to the pypi repo for the future (and you should be on there just for principle's sake regardless). Do you have a pypi account?

kylebgorman commented 7 years ago

On Wed, Apr 12, 2017 at 3:27 PM, Michael McAuliffe <notifications@github.com> wrote:

> Sure, I can do that! Version number is based off that in setup.py, so I can do a PR for that, or you can update it more easily.

Just push a commit that bumps the version number.

> Maybe also we can update the README to specify the easier installation of pip install textgrid with the pip+git for installing the development version?

Suppose we could, but this really is pretty stable: first major change in 2 years.

> Also, I can add you as an owner to the pypi repo for the future (and you should be on there just for principle's sake regardless). Do you have a pypi account?

Sure, feel free to add me too. I'm kylebgorman on pypi.

mmcauliffe commented 7 years ago

Ok, updated on pypi and added you as owner!

kylebgorman commented 7 years ago

And uploaded.
