Error on ctm2tg.py - doesn't generate tier 2 and stops the process

gicraveiro commented 1 year ago

Continuing from issue "expose beam and retry-beam, G2P isolated consonants parsing #6": long utterances (more than 10 minutes) fail to align simply because they are too long, so the decoder doesn't even generate a lattice. The suggested solution there was that increasing the beam parameters seems to solve this issue, so the flags must somehow be exposed.

Increasing parameters indeed solved the problem in most cases. However, for the attached files, the code still fails when beam parameters are raised. Tests were conducted with values:

beam 40 and retry beam 100
beam 80 and retry beam 200
beam 160 and retry beam 400
beam 320 and retry beam 800

UFPAlign reports the following error:

Traceback (most recent call last):
  File "/home/giovana/kaldi/egs/UFPAlign/s5/local/ctm2tg.py", line 326, in <module>
    f.write(tg.get_itemcontent(item, tokenlist, start, finish))
  File "/home/giovana/kaldi/egs/UFPAlign/s5/local/ctm2tg.py", line 138, in get_itemcontent
    raise ValueError('ok this is odd. probably a bug')
ValueError: ok this is odd. probably a bug

The input files that trigger this error are attached, along with the textgrid that was generated. GitHub doesn't support wav or textgrid attachments, so I compressed those into zip files.

SP_EF_156_clipped_4.txt
SP_EF_156_clipped_4_audio.zip
EF_156_clipped_4_textgrid.zip

cassiotbatista commented 1 year ago

Hi,

Thanks for reaching out.

As mentioned via email, the problem is indeed with the syllabic-phonetic tier. It seems FalaBrasil's annotator has some trouble parsing multi-hyphenated words.

$ java -jar fb_nlplib.jar --g2p-s "bumba-meu-boi"
bumba-meu-boi   bu~-ba-me-'wbo

Removing the dashes seems to solve it.

$ java -jar fb_nlplib.jar --g2p-s "bumba meu boi"
bumba meu boi   'bu~-ba 'mew 'boj

Doing that directly in the *.txt transcription file you sent and using --beam 30 --retry-beam 250 with the tri2b acoustic model, I could generate the following textgrid.

EF_156_clipped_4.TextGrid.txt

Until I submit a fix for that, can you confirm whether removing some dashes works for that same file?
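
In case it helps, here is a minimal sketch of the hyphen workaround above in Python. It is not part of UFPAlign: the helper name `split_hyphenated` and the regular expression are assumptions for illustration only. It replaces every hyphen that sits between two word characters with a space, which is broader than "removing some dashes", so it may also split hyphenated words that the annotator already handles correctly.

```python
import re
import sys

# Hypothetical helper, not part of UFPAlign. Matches a hyphen only when it
# has a word character on both sides, as in "bumba-meu-boi".
_INTRAWORD_HYPHEN = re.compile(r"(?<=\w)-(?=\w)")


def split_hyphenated(text: str) -> str:
    """Replace intra-word hyphens with spaces; other dashes are kept."""
    return _INTRAWORD_HYPHEN.sub(" ", text)


if __name__ == "__main__":
    # Prints the cleaned transcription of each file given on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            sys.stdout.write(split_hyphenated(f.read()))
```

Saved as, say, split_hyphens.py (a made-up file name), it could be run on the transcription and its output redirected to a new *.txt file before calling ufpalign.sh.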

cassiotbatista commented 1 year ago

Hi @gicraveiro ,

Can you check if the version of this branch works without removing hyphens? A flag --no-syllphones can now bypass that tier on ctm2tg.

$ KALDI_ROOT=$HOME/work/git/kaldi bash ufpalign.sh \
    --beam 20 --retry-beam 120 --no-syllphones true \
    /tmp/SP_EF_156_clipped_4.wav /tmp/SP_EF_156_clipped_4.txt mono

utils/check_dependencies.sh: success!
[2023-04-25 13:30:06] ufpalign.sh: downloading models
utils/download_model.sh: file '/opt/UFPAlign/data.tar.gz' exists. skipping download
utils/download_model.sh: file '/opt/UFPAlign/mono.tar.gz' exists. skipping download
[2023-04-25 13:30:07] ufpalign.sh: preparing data
...
[2023-04-25 13:30:53] ufpalign.sh: creating textgrid with *no* syllphones tier
ctm2tg_nosyllphones.py     INFO textgrid files will be stored under "/l/disk0/ctbatista/work/git/kaldi/egs/UFPAlign/s5/data/tg" dir
ctm2tg_nosyllphones.py     INFO loading lex from data/dict/lexicon.txt
ctm2tg_nosyllphones.py     INFO loading syll from data/dict/syllphones.txt
ctm2tg_nosyllphones.py     INFO processing .grapheme file
ctm2tg_nosyllphones.py     INFO writing data/tg/EF_156_clipped_4 textgrid file
ctm2tg_nosyllphones.py     INFO processing file EF_156_clipped_4
ctm2tg_nosyllphones.py     INFO writing tier 0 (fonemeas)
ctm2tg_nosyllphones.py     INFO writing tier 1 (palavras-grafemas)
ctm2tg_nosyllphones.py     INFO writing tier 2 (frase-fonemas)
ctm2tg_nosyllphones.py     INFO writing tier 3 (frase-grafemas)
ctm2tg_nosyllphones.py     INFO done!
[2023-04-25 13:30:53] ufpalign.sh: success!

EF_156_clipped_4.TextGrid.txt

gicraveiro commented 1 year ago

Hello @cassiotbatista!

Thank you very much!

I tested the bypass syllphones flag with the EF_156_clipped_4 files with hyphens, beam 20, retry beam 120, exactly as you ran it and it worked perfectly.

Unfortunately, I got stuck when trying a different group of files. It doesn't seem to be the same issue, because the code doesn't even get as far as before. I extracted the following message from the log file; it seems to capture every log message after things started to go wrong:

I hoped that increasing beam and retry-beam would solve this issue, but every combination I have tried so far failed. I tried:

beam 10 retry-beam 40 mono --no-syllphones true
beam 20 retry-beam 80 mono --no-syllphones true
beam 40 retry-beam 160 mono --no-syllphones true
beam 80 retry-beam 320 mono --no-syllphones true
beam 160 retry-beam 640 mono --no-syllphones true
beam 20 retry-beam 120 mono --no-syllphones true
beam 30 retry-beam 250 mono --no-syllphones true
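
For reference, the sweep above could be automated with something like the sketch below. It is only an illustration: the function names are made up, the command line copies the flags from the ufpalign.sh call earlier in this thread, and it assumes that ufpalign.sh exits with a non-zero status when alignment fails, which I have not verified.

```python
import subprocess

# (beam, retry_beam) pairs taken from the list above.
BEAM_PAIRS = [(10, 40), (20, 80), (40, 160), (80, 320),
              (160, 640), (20, 120), (30, 250)]


def build_command(wav, txt, beam, retry_beam, model="mono", script="ufpalign.sh"):
    """Build the ufpalign.sh argument list for one (beam, retry_beam) pair."""
    return ["bash", script,
            "--beam", str(beam), "--retry-beam", str(retry_beam),
            "--no-syllphones", "true",
            wav, txt, model]


def first_working_pair(wav, txt, pairs=BEAM_PAIRS, run=subprocess.run):
    """Return the first pair whose run exits with status 0, or None.

    Assumption: a failed alignment makes ufpalign.sh exit non-zero.
    """
    for beam, retry_beam in pairs:
        result = run(build_command(wav, txt, beam, retry_beam))
        if result.returncode == 0:
            return beam, retry_beam
    return None
```

It would need to be run from the directory that contains ufpalign.sh, with KALDI_ROOT set in the environment as in the example call above.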

Do you have any idea what might help in this case? I attached the text file that I used as input. The audio file was too big to attach here, so I will send it by email.

Thanks again!

SP_DID_111_1_clipped.txt

gicraveiro commented 1 year ago

Hello!

As you suggested by email, removing the noise from the audio files solved the problem! I used Adobe's enhancement feature, and the textgrids for the files I needed were generated correctly with all tiers, using the mono acoustic model and without modifying the beam and retry-beam parameters.

Thank you so much for the support! I'm closing the issue.

falabrasil / ufpalign

Error on ctm2tg.py - doesn't generate tier 2 and stops the process #11