Closed benjisympa closed 4 years ago
Hi,
I have seen this sort of issue before (it's in the textgrid library here: https://github.com/kylebgorman/textgrid), but I thought we'd already dealt with it. It happens during creation of a TextGrid when two intervals appear to overlap. In this case they don't really overlap; it's just that floating-point numbers represent these values only approximately.
For this reason in textgrid.py (near the top) we define separate precisions for TextGrid inputs and for MLF (i.e., from Prosodylab-Aligner) inputs. I suspect if you set
DEFAULT_MLF_PRECISION = 5
to a lower value (say 3), the issue will go away? (You will have to do this wherever textgrid.py is installed for Python 3.) Please try it out and report back, and if this helps, I can commit the fix.
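As a generic illustration of the kind of spurious overlap described above (this is not code from textgrid.py, and the name `PRECISION` is hypothetical), two boundaries that should coincide can compare unequal as binary floats, while rounding both to a fixed number of decimal places makes them agree:

```python
# Illustration only: not taken from textgrid.py.
PRECISION = 5  # hypothetical number of decimal places to keep

end_of_first = 0.1 + 0.2   # a boundary produced by arithmetic
start_of_second = 0.3      # the "same" boundary written directly

# Compared as raw floats, the first interval appears to run past
# the start of the second one.
raw_overlap = end_of_first > start_of_second

# Compared after rounding both sides, the boundaries coincide.
rounded_overlap = (
    round(end_of_first, PRECISION) > round(start_of_second, PRECISION)
)

print(raw_overlap)      # True
print(rounded_overlap)  # False
```

Note that the rounding only helps if it is applied to both boundaries being compared.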
On Tue, Nov 28, 2017 at 1:19 PM, benjisympa notifications@github.com wrote:
Good Morning, thank you for your work.
I've launched the software but I get an error. Have you ever seen this?
(py3) ✘ maurice@client-172-18-65-151 ~/Prosodylab-Aligner master python -m aligner -a data
Traceback (most recent call last):
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/maurice/Prosodylab-Aligner/aligner/__main__.py", line 134, in <module>
    size = MLF(aligned).write(args.align)
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/site-packages/textgrid/textgrid.py", line 788, in __init__
    self.read(f, samplerate)
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/site-packages/textgrid/textgrid.py", line 830, in read
    phon.add(pmin, pmax, line[2])
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/site-packages/textgrid/textgrid.py", line 433, in add
    self.addInterval(interval)
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/site-packages/textgrid/textgrid.py", line 441, in addInterval
    i = bisect_left(self.intervals, interval)
  File "/Users/maurice/anaconda3/envs/py3/lib/python3.6/site-packages/textgrid/textgrid.py", line 208, in __lt__
    raise (ValueError(self, other))
ValueError: (Interval(0.00000, 155.7200012, sil), Interval(155.72000, 155.89000, S))
Thank you very much.
Hi, thank you for your answer. I modified the file in /Users/maurice/anaconda3/envs/py3/lib/python3.6/site-packages/textGrid, but in ValueError: (Interval(0E-10, 155.7200012000, sil), Interval(155.7200000000, 155.8900000000, S)) the second number is still bigger than the third (155.7200012000 > 155.7200000000). I tried changing the precision but it has no effect. I tried 3, 5 and 10; the output here is with 10. I don't know what sil or S are, but maybe I need to round off the sil value? I don't think I need as much precision as the trailing 12 in 155.7200012000.
Yeah, we need some way to round down the sil's second value. It really ought to be exactly 155.72 with all trailing zeros, and I'm not sure why it isn't. (Do you know what value it corresponds to in the MLF file? Those are, iirc, in units of 100 nanoseconds, and we attempt to convert from that to seconds.) Perhaps there's some issue there.
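A minimal sketch of the conversion described above, assuming MLF times are integers in units of 100 nanoseconds. The helper name `mlf_time_to_seconds` and its `precision` parameter are hypothetical and are not part of textgrid.py; whether the library converts this way is not established in this thread. The two tick values are simply the boundaries from the error message expressed in 100 ns units.

```python
from decimal import Decimal, ROUND_HALF_EVEN

TICKS_PER_SECOND = 10_000_000  # assumption: one tick = 100 ns


def mlf_time_to_seconds(ticks: int, precision: int = 5) -> Decimal:
    """Convert 100 ns ticks to seconds, rounded to `precision` decimals.

    Hypothetical helper for illustration; not the textgrid API.
    """
    quantum = Decimal(10) ** -precision  # e.g. Decimal('0.00001')
    seconds = Decimal(ticks) / TICKS_PER_SECOND
    return seconds.quantize(quantum, rounding=ROUND_HALF_EVEN)


# 155.7200012 s and 155.72 s, the two boundaries from the ValueError.
sil_end = mlf_time_to_seconds(1557200012)
s_start = mlf_time_to_seconds(1557200000)

print(sil_end, s_start)   # 155.72000 155.72000
print(sil_end > s_start)  # False
```

Using `Decimal` instead of `float` keeps the division exact, so the only loss of information is the explicit rounding step.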
Yes, I think that's the problem, but I don't know the MLF file; I just have the audio and the transcription of the first episode of Big Bang Theory. Maybe we can add rounding directly in the code?
Just FYI: the MLF file is the temporary file generated by the shell call to HVite during the final alignment. It's stored in a temporary directory. If you want, you can hack the aligner to log the location of the temporary file and to not delete it at the end; then you can inspect it.
OK thanks, I've modified the script. I have 2 empty folders and one file. Is this what you expected?
(py3) ✘ maurice@client-172-18-65-151 ~/Prosodylab-Aligner master ● more tmp/tmpkr9z6j3y/HERest.cfg
CEPLIFTER = 22
ENORMALIZE = T
NUMCEPS = 12
NUMCHANS = 20
PREEMCOEF = 0.97
TARGETKIND = MFCC_D_A_0
TARGETRATE = 100000.0
USEHAMMING = T
WINDOWSIZE = 250000.0
(py3) maurice@client-172-18-65-151 ~/Prosodylab-Aligner master ● ls tmp/tmpkr9z6j3y/
000 001 HERest.cfg
The file in question is hidden; it's called .aligned.mlf (so use ls -a to find it).
Actually, on looking more carefully, it looks like it'll be in your TextGrids output directory. Sorry for the misdirection there!
Sorry for the delay in answering. I don't know where the TextGrids output directory is, but I find nothing in the GitHub project folder where I cloned the project and put the data.
It is the same as the input directory for the .lab and .wav files.
Great, I found it!
more data/.aligned.mlf
#!MLF!#
"/var/folders/hg/gc5hq9ln10v4htbsj2_k15j40000gp/T/tmpxze4evhf/audio/TheBigBangTheory.Season01.Episode01.en.upper.lab"
0 1557200012 sil sil
1557200000 1558900000 S SO
1558900000 10409999854 OW1
10409999854 10410699854 sp
10410700000 13195299915 sil sil
.
Sorry to let this linger, but a contributor just added a new way of rounding that I hope may help with your issue. Please try it locally when you get a chance. -K
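For anyone wanting to check an MLF for this problem before it reaches the TextGrid code, here is a small sketch that scans adjacent intervals for overlaps. The rows are copied from the .aligned.mlf excerpt posted earlier in this thread (times in 100 ns units); the function name `find_overlaps` is hypothetical and not part of the aligner or textgrid.

```python
# (start, end, phone) rows copied from the .aligned.mlf excerpt in this
# thread; times are in 100 ns units.
rows = [
    (0, 1557200012, "sil"),
    (1557200000, 1558900000, "S"),
    (1558900000, 10409999854, "OW1"),
    (10409999854, 10410699854, "sp"),
    (10410700000, 13195299915, "sil"),
]


def find_overlaps(rows):
    """Return (phone_a, phone_b, overlap_ticks) for each adjacent pair
    where the first interval ends after the second one starts."""
    overlaps = []
    for (_, end_a, phone_a), (start_b, _, phone_b) in zip(rows, rows[1:]):
        if end_a > start_b:
            overlaps.append((phone_a, phone_b, end_a - start_b))
    return overlaps


overlaps = find_overlaps(rows)
print(overlaps)  # [('sil', 'S', 12)]
```

On this data the only overlap is between the initial sil and S: 12 ticks, i.e. 1.2 microseconds, which matches the 155.7200012 vs 155.72 boundaries in the error message.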
No problem. I tried gentle/kaldi and it seems to work, so that's good enough for my job. I can pull and try if you want. Thanks.