cmusphinx / sphinxtrain

Acoustic model trainer for CMU Sphinx

SphinxTrain::Util::TiedWaitForConvergence() is broken in many ways #33

Closed by dhdaines 2 years ago

dhdaines commented 2 years ago

With a large number of Queue::POSIX parts, iteration 3 never runs, because norm_and_launch_bw.pl gets stuck waiting on zombie processes. This may be due to a difference between WaitForConvergence(), used in iterations 2..N, and TiedWaitForConvergence(), used in iteration 1, which effectively waits for all of the subsequent iterations to complete.
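For readers unfamiliar with the symptom, here is a minimal, generic sketch of one way a parent process can end up "waiting on zombies" on a POSIX system. It is written in Python rather than Perl, it does not use any SphinxTrain or Queue::POSIX code, and it is not a claim about where the actual bug lies; the helper names is_alive_by_signal() and demo() are hypothetical. The idea: a child that has exited but has not yet been reaped with waitpid() still occupies its PID, so a liveness probe based on signal 0 keeps reporting it as alive.

```python
import os
import time


def is_alive_by_signal(pid):
    """Hypothetical liveness probe: signal 0 checks only that the PID exists.

    A zombie (exited but not yet reaped) still passes this check.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def demo():
    """Fork a child that exits at once; probe it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers

    time.sleep(0.2)  # give the child time to exit and become a zombie
    alive_before_reap = is_alive_by_signal(pid)

    os.waitpid(pid, 0)  # reap the child, releasing its process-table entry
    alive_after_reap = is_alive_by_signal(pid)
    return alive_before_reap, alive_after_reap


if __name__ == "__main__":
    print(demo())
```

On a typical Linux system the probe should report the child as alive before the waitpid() call and as gone after it. A loop that polls with such a probe but never reaps its children would therefore never see them finish, which is one possible (unconfirmed) shape for the hang described above.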

TiedWaitForConvergence() also doesn't correctly retrieve the loss value in step 30, at least not when the script is run manually.

It's pretty much a mess (for which I am entirely responsible, 10 years ago).