Can you make sure there is an example of the use of the --bypass-metaparameter-optimization option somewhere in the egs?
I'll also wait for Ke to test this. I plan to merge soon but please don't let the wait block you from working on other issues.
The swbd test is done and the perplexities match those I got before. The swbd_fisher script is still running. So far there have been no errors, except for an extra slash in the log files, as below (not a big issue):
format_arpa_lm.py: succeeded formatting ARPA lm from data/lm//20000_5_prune0.25.pocolm
Ke
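The double slash most likely comes from a directory argument that already ends in "/" being joined with another "/". A minimal sketch of one possible fix in the calling script; the variable name lm_dir is assumed for illustration, not taken from the actual code:

```bash
# Strip a single trailing slash from the directory argument before joining
# paths, so "data/lm/" and "data/lm" both produce "data/lm/<name>.pocolm".
lm_dir=data/lm/
lm_dir=${lm_dir%/}   # bash parameter expansion: drop one trailing "/"
echo "${lm_dir}/20000_5_prune0.25.pocolm"   # -> data/lm/20000_5_prune0.25.pocolm
```

Equivalently, if the join happens inside the Python script, normalizing the assembled path (e.g. with os.path.normpath) would remove the duplicate slash.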
Dan,
Sure, I'll add --bypass-metaparameter-optimization at the end of run.sh and diff the perplexity. And I'm working on the new prune-by-size method.
Ke,
format_arpa_lm.py is a separate call outside train_lm.py, so I didn't change that line in run.sh. I think it outputs the same log as before.
Added an example of the use of --bypass-metaparameter-optimization. Ke, would you please figure out the corresponding numbers for swbd_fisher, if Dan is satisfied with this form of example?
Actually, I'll go ahead and merge this; you can deal with the --fold-dev-into option later.
Added the --fold-dev-into option.
Another question is whether to change the code in egs/tedlium/run.sh: its parameters are not the same as swbd's (e.g. --progress-tolerance=2.0e-04), and there is a lot of time-profiling code, so I'm not sure how to modify it.
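For reference, a hedged sketch of how the new option might be wired into run.sh, following the commented-out pattern described in the next message; the variable name fold_dev_opt and the set name "swbd1" are illustrative assumptions:

```bash
# Uncomment to fold the dev data back into the named training set once
# the metaparameters have been estimated on it ("swbd1" is a placeholder).
# fold_dev_opt="--fold-dev-into=swbd1"
fold_dev_opt=
```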
Thanks, merged the change. Don't bother copying all the details of the tedlium setup or messing too much with the tolerances. These aren't super critical; they affect the tradeoff between speed and accuracy, and they will matter less once we have the --bypass-metaparameter-optimization option. Actually, if you added the appropriate --bypass-metaparameter-optimization flag as a commented-out thing in the scripts, just like 'fold_dev_into', it would make it easy to run quickly. It would also make it easier to test that the --bypass-metaparameter-optimization flag is working.
OK, I added a bypass_metaparam_optim_opt variable in run.sh, but the default values for swbd_fisher still need to be set to the appropriate ones.
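A sketch of what that variable might look like in run.sh, mirroring the fold_dev_into pattern; the value string here is a placeholder, since the real one has to come from the log of a completed training run:

```bash
# Uncomment and fill in to reuse previously optimized metaparameters
# instead of re-running the optimization; copy the formatted string
# from the log of an earlier train_lm.py run.
# bypass_metaparam_optim_opt="--bypass-metaparameter-optimization=<values-from-log>"
bypass_metaparam_optim_opt=
```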
Sorry for the late reply. The metaparameters for swbd_fisher are as below:
count_scale_1 0.0910712699104
count_scale_2 0.866887317645
order2_D1 0.75343260715
order2_D2 0.274543317257
order2_D3 0.0997520780102
order2_D4 0.0178910004304
order3_D1 0.90243712073
order3_D2 0.370544230191
order3_D3 0.183010891246
order3_D4 0.0701774016371
Do I need to update them in run.sh?
Ke
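Assuming the option takes the metaparameters as a comma-separated list in the order printed above (the two count scales first, then the order-2 and order-3 discounts), the corresponding run.sh line would presumably look like:

```bash
# swbd_fisher trigram metaparameters from the message above, in order:
# count_scale_1, count_scale_2, order2_D1..D4, order3_D1..D4
bypass_metaparam_optim_opt="--bypass-metaparameter-optimization=\
0.0910712699104,0.866887317645,0.75343260715,0.274543317257,\
0.0997520780102,0.0178910004304,0.90243712073,0.370544230191,\
0.183010891246,0.0701774016371"
```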
Thank you. There should be a line in the output log of train_lm.py giving you the formatted string for the --bypass-metaparameter-optimization option; you can copy that directly into run.sh. And please test whether it works correctly (the ppl should be almost the same as the original LM's).
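For example, something like the following should pull that ready-made line out of the log; the log path here is a guess, so adjust it to wherever train_lm.py writes its output:

```bash
# Find the formatted option string in the training log and paste it into run.sh.
grep 'bypass-metaparameter-optimization' data/lm/train_lm.log
```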
I tested it with a trigram model on swbd_fisher. The ppl using these parameters is almost the same as before.
Ke
OK. I think you can update them.
Add top-level train_lm.py for training a language model.
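For context, a hypothetical top-level invocation tying the pieces of this thread together. Only the flags are taken from the discussion above; the positional arguments (text directory, n-gram order, output directory) are illustrative placeholders, not the script's actual signature:

```bash
# Hypothetical sketch: train an LM, optionally folding dev data back in
# and bypassing metaparameter optimization with previously found values.
fold_dev_opt=                 # e.g. "--fold-dev-into=swbd1"
bypass_metaparam_optim_opt=   # e.g. "--bypass-metaparameter-optimization=0.091,0.867,..."
train_lm.py ${fold_dev_opt} ${bypass_metaparam_optim_opt} data/text 3 data/lm
```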