prosodylab / Prosodylab-Aligner

Python interface for forced audio alignment using HTK and SoX
http://prosodylab.org/tools/aligner/
MIT License

TextGrids instead of Lab files #86

Closed rolandomunoz closed 4 years ago

rolandomunoz commented 4 years ago

Instead of creating TextGrids from scratch from .lab files when aligning segments, would it be possible to use already existing TextGrid files?

I mean, the TextGrids would contain the sentence or word to be aligned and the output of the aligner would be a new tier with the segmented information.

Thank you in advance, and thank you for your wonderful program!

kylebgorman commented 4 years ago

Just take the TextGrid files and make .lab files out of them. The TextGrid library (of which I'm the primary author, and which is one of the dependencies of this project) should make that relatively straightforward to do in Python.

One could code up a patch/pull request that enables that functionality, but I think it'd be a lot of work for something that can be done once offline. Generally speaking, the format of a .lab file is very straightforward, but there are a lot of open questions about how a TextGrid file should be mapped onto a list of words (e.g., which tier should be used if there's more than one?).
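The one-off conversion suggested above could be sketched roughly as follows. This is a minimal illustration, not part of the aligner: the function names `marks_to_lab` and `textgrid_to_lab` are hypothetical, and it assumes that a .lab file is a single line of space-separated words, that empty intervals are pauses to be skipped, and that the words live in the first tier (the open question raised in the comment).

```python
from pathlib import Path


def marks_to_lab(marks):
    """Join the non-empty interval labels into one space-separated line.

    Assumption: the .lab file is a single line of words; empty or
    whitespace-only labels are treated as silence and dropped.
    """
    words = [m.strip() for m in marks if m and m.strip()]
    return " ".join(words) + "\n"


def textgrid_to_lab(textgrid_path, tier_index=0):
    """Write a .lab file next to the given TextGrid file (hypothetical helper).

    Assumption: the tier at `tier_index` is an interval tier holding the words.
    """
    import textgrid  # third-party package, imported lazily

    tg = textgrid.TextGrid.fromFile(str(textgrid_path))
    tier = tg[tier_index]  # tiers are indexable by position
    lab_path = Path(textgrid_path).with_suffix(".lab")
    lab_path.write_text(marks_to_lab(iv.mark for iv in tier), encoding="utf-8")
    return lab_path
```

For a whole directory, something like `for p in Path("data").glob("*.TextGrid"): textgrid_to_lab(p)` would do the conversion in one pass. The words may also need to be adjusted to match the spelling and casing used in the pronunciation dictionary.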

rolandomunoz commented 4 years ago

Thank you for your answer. In my case, things are simple. My TextGrid files contain several labeled intervals (words from a Swadesh list) at different times, all in a single tier. Like this:

TextGrid tier: | |cat| |dog| |rabbit| |tiger| |

The last time I used the aligner, I had to follow several tricky steps to merge the aligned TextGrid files with my old TextGrids (which also contain other tiers with other types of information). I will look into the TextGrid library. It sounds interesting.

Sorry, English is not my first language, and it was not clear to me what you meant when you wrote:

but I think it'd be a lot of work for something that can be done once offline

Could you clarify this point? Thank you

kylebgorman commented 4 years ago

The textgrid library should make that merging step slightly easier.

As for "it'd be a lot of work for something that can be done once offline": I mean that it would be a great deal of work to enable the use of TextGrids as inputs to the aligner; it is better, I think, to ask users to simply turn their TextGrids into label (.lab) files themselves, once, before running the aligner.
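For the merging step, a rough sketch with the textgrid library might look like the following. The helper names `unique_name` and `merge_aligned_tiers` are hypothetical, and the sketch assumes that the aligner's output TextGrid covers the same time span as the original file, so that its tiers can simply be appended.

```python
def unique_name(name, existing):
    """Return `name`, or `name` with a numeric suffix if it is already taken."""
    if name not in existing:
        return name
    n = 2
    while f"{name}_{n}" in existing:
        n += 1
    return f"{name}_{n}"


def merge_aligned_tiers(original_path, aligned_path, output_path):
    """Append every tier of the aligned TextGrid to the original one.

    Assumption: both files describe the same recording. The library raises
    ValueError if an appended tier ends later than the target TextGrid.
    """
    import textgrid  # third-party package, imported lazily

    original = textgrid.TextGrid.fromFile(str(original_path))
    aligned = textgrid.TextGrid.fromFile(str(aligned_path))
    names = {tier.name for tier in original.tiers}
    for tier in aligned.tiers:
        # Rename on clash so existing annotation tiers stay distinguishable.
        tier.name = unique_name(tier.name, names)
        names.add(tier.name)
        original.append(tier)
    original.write(str(output_path))
```

Writing to a new `output_path` rather than overwriting the original keeps the hand-made annotations safe if something goes wrong.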

rolandomunoz commented 4 years ago

Thank you. I'll see what I can do with the textgrid library!