prosodylab / Prosodylab-Aligner

Python interface for forced audio alignment using HTK and SoX
http://prosodylab.org/tools/aligner/
MIT License

Words with apostrophes not recognized #49

Closed: majure closed this issue 8 years ago

majure commented 8 years ago

Some of my .lab transcripts have words with apostrophes, and they are being placed in the OOV list despite being in the dictionary. For example, DON'T. Do the apostrophes need to be escaped in the transcript somehow? I can't find any documentation about this.

kylebgorman commented 8 years ago

Sometimes this indicates that the transcript has a "smart quote" in it (https://en.wikipedia.org/wiki/Quotation_mark#Quotation_marks_in_English), whereas the dictionary has the ASCII apostrophe (decimal character code 39), so the dictionary lookup fails.

But if that's not it, and you're sure, then please email me a minimal workable example and I'll be glad to debug tomorrow.
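
The smart-quote check described above could be sketched roughly as follows. This is only an illustration, not part of Prosodylab-Aligner: the helper names `find_smart_apostrophes` and `normalize_apostrophes` are hypothetical, and the set of characters treated as "smart" apostrophes is an assumption that may need extending for a given corpus.

```python
# Hypothetical helpers for illustration; they are not part of Prosodylab-Aligner.
# Assumption: these three code points cover the typographic apostrophes
# most likely to appear in a transcript.
SMART_APOSTROPHES = {
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK (the usual "smart" apostrophe)
    "\u02bc": "'",  # MODIFIER LETTER APOSTROPHE
}
_TRANSLATION = str.maketrans(SMART_APOSTROPHES)


def find_smart_apostrophes(text):
    """Return (index, character) pairs for each non-ASCII apostrophe in text."""
    return [(i, ch) for i, ch in enumerate(text) if ch in SMART_APOSTROPHES]


def normalize_apostrophes(text):
    """Replace typographic apostrophes with the ASCII apostrophe (decimal 39)."""
    return text.translate(_TRANSLATION)


if __name__ == "__main__":
    sample = "I DON\u2019T KNOW"  # transcript text containing a smart quote
    print(find_smart_apostrophes(sample))
    print(normalize_apostrophes(sample))
```

Running `find_smart_apostrophes` over the contents of each .lab file before alignment would show whether a word such as DON'T differs from its dictionary entry only in the apostrophe character.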

majure commented 8 years ago

That was it. Thank you.

kylebgorman commented 8 years ago

Glad it was something relatively simple ;)