Closed DLZML001 closed 4 years ago
I believe you can use transcripts where each file consists of just "sil".
But I'm not sure the warning necessarily means "you need more silence". It may simply be a note that no silence "phones" were seen at some stage during training.
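The silence-only transcripts suggested above could be generated with something like the following. This is a minimal sketch, not part of Prosodylab-Aligner: the function name `write_silence_labels` and the directory layout (one `.lab` next to each `.wav`) are assumptions for illustration, and whether the aligner accepts a transcript containing only "sil" with a given dictionary has not been verified here.

```python
from pathlib import Path


def write_silence_labels(wav_dir, token="sil"):
    """Write a one-token .lab transcript next to each .wav in wav_dir.

    Hypothetical helper: existing .lab files are left untouched so that
    real transcripts are never overwritten.
    """
    written = []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        lab = wav.with_suffix(".lab")
        if lab.exists():
            continue  # keep any transcript that is already there
        lab.write_text(token + "\n", encoding="utf-8")
        written.append(lab)
    return written
```

Point it at a directory that holds only the speech-free recordings, for example `write_silence_labels("noise_clips")`, where `noise_clips` is a placeholder name.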
On Thu, Apr 26, 2018 at 3:54 AM, ziadah notifications@github.com wrote:
I am retraining the model on my data (several hours of audio-text pairs). However, I get a warning that there is not enough 'sil' and 'sp':
WARNING [-2331] UpdateModels: sp[25] copied: only 0 egs in HERest
WARNING [-2331] UpdateModels: sil[70] copied: only 0 egs in HERest
How can I train the model with more silence? Is it possible to add audio noise files without speech and keep the corresponding .lab file blank?
Sincerely,
OK, thanks, I'll try.
Also, how can I get rid of the rounding issue?
ValueError: (Interval(0.00000, 1.4400001, sil), Interval(1.44000, 1.48000, d))
I found the file causing the issue in the .aligned.mlf and deleted it.
But is there another way to get rid of the problem?
We're a little confused about why it happens, but it's possible to turn off strict checking for rounding by editing the installed textgrid module:
/usr/local/lib/python3.6/site-packages/textgrid/textgrid.py
On lines 192 and 410, you will see:
self.strict = True
If you change those to read:
self.strict = False
It should at least ignore this error, I believe.
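An alternative to editing the installed library would be to clean up the boundaries before they are checked. The following is only a sketch under the assumption that the overlap is pure floating-point noise, as in the reported error (1.4400001 vs. 1.44000). The helper `snap_intervals` and the `(start, end, label)` tuple layout are hypothetical and not part of textgrid or Prosodylab-Aligner.

```python
from typing import List, Tuple

Interval = Tuple[float, float, str]  # (start, end, label); assumed layout


def snap_intervals(intervals: List[Interval],
                   ndigits: int = 5,
                   tol: float = 1e-4) -> List[Interval]:
    """Round boundaries and close tiny gaps or overlaps between neighbours."""
    snapped: List[Interval] = []
    for start, end, label in intervals:
        start, end = round(start, ndigits), round(end, ndigits)
        if snapped and abs(start - snapped[-1][1]) <= tol:
            # A difference below the tolerance is treated as rounding noise:
            # start exactly where the previous interval ended.
            start = snapped[-1][1]
        snapped.append((start, end, label))
    return snapped


raw = [(0.0, 1.4400001, "sil"), (1.44, 1.48, "d")]
print(snap_intervals(raw))  # [(0.0, 1.44, 'sil'), (1.44, 1.48, 'd')]
```

The tolerance is deliberately small, so a genuine overlap of several milliseconds would still be left in place and surface as an error.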
Thanks, the self.strict trick works for the rounding errors. However, I was not able to add any sil files.
For faster training, is it possible to train on a GPU?
It's quite possible in theory!
But HTK, the backend here, is ancient: 32-bit, Windows-oriented C written decades ago, so it'd be a very heavy lift to make this library GPU-ready.
OK, thanks for your answer.
When training the model on a large number of files (around 500,000), I got the following error:
ERROR [+7031] PutTransMat: Row 2 of transition mat sum = 0.954956
FATAL ERROR - Terminating program HERest
What solution would you suggest for training with a large number of files?
This can happen due to a known deficiency in the underlying HTK library when using large amounts of data. I believe it's discussed in one of the closed issues on the repo.
It's a "do not fix" for us because it's an issue in HTK, not in this module, and would have to be fixed upstream.
Closing as a "do not fix".