danpovey / pocolm

Small language toolkit for creation, interpolation and pruning of ARPA language models
Other
91 stars 48 forks

WIP: add prune_size_model #45

Closed wantee closed 8 years ago

wantee commented 8 years ago

Dan, currently I have only implemented the core internal/prune_size_model.py. If you have time, could you please check whether the overall logic is correct? I'll continue modifying prune_lm_dir.py.

danpovey commented 8 years ago

Is this partial work? If so, please adjust the name to 'WIP: add prune_size_model'.

danpovey commented 8 years ago

It generally looks plausible. Maybe some of the wording in the comments could be changed, but I can help with that later.

wantee commented 8 years ago

OK, I'll also try to write more useful comments.

BTW, can you suggest a suitable quasi-prune function for testing the class? I don't like the simple one in the current code.
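[Editor's note: as an illustration, any monotonically decreasing map from threshold to surviving n-gram count would serve as a quasi-prune function for testing; the power-law stand-in below is hypothetical, not from the pocolm code.]

```python
def quasi_prune(threshold, initial_num_xgrams=1828213, alpha=1.5):
    """Hypothetical stand-in for real pruning, for unit-testing a size
    model: returns a fake surviving n-gram count that decays as a power
    law of the threshold.  Monotonically decreasing in `threshold`."""
    return int(initial_num_xgrams * (1.0 + threshold) ** -alpha)
```

A test driver can feed this to the size model in place of a real pruning step, since only monotonicity matters for exercising the search logic.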

danpovey commented 8 years ago

Hard to test - maybe first write the real program. Going to bed now.

wantee commented 8 years ago

Finished the initial version of PruneSizeModel.

With the default settings on the swbd 3-gram model, it needs 7 iterations to converge:

PruneSizeModel: Iter 0: threshold=0.0, num_xgrams=1828213
PruneSizeModel: Iter 1: threshold=0.25, num_xgrams=608455
PruneSizeModel: Iter 1: target_num=374092.580059, modeled_next_num=374111.142571
PruneSizeModel: Iter 2: threshold=0.281774520874, num_xgrams=418717
PruneSizeModel: Iter 2: target_num=310330.99864, modeled_next_num=310346.677311
PruneSizeModel: Iter 3: threshold=0.335639132652, num_xgrams=361765
PruneSizeModel: Iter 3: target_num=288455.042884, modeled_next_num=288463.573352
PruneSizeModel: Iter 4: threshold=0.400906849732, num_xgrams=316690
PruneSizeModel: Iter 4: target_num=255931.574962, modeled_next_num=255939.211858
PruneSizeModel: Iter 5: threshold=0.474544487117, num_xgrams=276858
PruneSizeModel: Iter 5: target_num=244695.430977, modeled_next_num=244698.711088
PruneSizeModel: Iter 6: threshold=0.513385072885, num_xgrams=259890
PruneSizeModel: Iter 6: target_num=230001, modeled_next_num=230004.706081
PruneSizeModel: Iter 7: threshold=0.567989422411, num_xgrams=239591

I think the guess of modeled_next_num_xgrams is not so bad; maybe we should take a larger step toward the intermediate target_num_xgrams?
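[Editor's note: the size model implied by these logs can be sketched as a power law linking threshold and n-gram count. Assuming num_xgrams is proportional to threshold ** xgrams_change_power (the functional form and names here are an illustration, not necessarily pocolm's exact model), solving for the next threshold given a target size looks like:]

```python
def solve_threshold(cur_threshold, cur_num, target_num,
                    xgrams_change_power=-1.0):
    """Given the current (threshold, num_xgrams) point and a target
    size, solve for the next threshold under the assumed power-law
    model num_xgrams ~ threshold ** xgrams_change_power.
    A sketch of the idea, not pocolm's actual PruneSizeModel."""
    ratio = target_num / cur_num
    # Invert the model: target/cur = (new_thr/cur_thr) ** power
    return cur_threshold * ratio ** (1.0 / xgrams_change_power)
```

With power -1.0, halving the target count doubles the threshold, which matches the rough shape of the iterations above.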

Changing xgrams_change_power from -1.0 to -0.2 triggers the backtracking mechanism:

PruneSizeModel: Iter 0: threshold=0.0, num_xgrams=1828213
PruneSizeModel: Iter 1: threshold=0.25, num_xgrams=608455
PruneSizeModel: Iter 1: target_num=374092.580059, modeled_next_num=374094.421565
PruneSizeModel: Iter 2: threshold=0.45482635498, num_xgrams=310551
PruneSizeModel: Iter 2: target_num=254263.719425, modeled_next_num=254263.719425
PruneSizeModel: Iter 3: threshold=0.45482635498, num_xgrams=289744
PruneSizeModel: Iter 3: target_num=248441.889239, modeled_next_num=248442.641158
PruneSizeModel: Iter 4: threshold=0.874157571525, num_xgrams=183880
PruneSizeModel: Iter 4: target_num=248441.889239, modeled_next_num=248443.164648
PruneSizeModel: Iter 5: threshold=0.737649286166, num_xgrams=206130
PruneSizeModel: Iter 5: target_num=248441.889239, modeled_next_num=248442.217727
PruneSizeModel: Iter 6: threshold=0.679342040967, num_xgrams=217906

As can be seen, backtracking happened in iters 4 and 5. I checked the other stuff in the log, and it seems to work correctly.

danpovey commented 8 years ago

Cool. Can you change the way the logs are printed so that instead of

PruneSizeModel: Iter 6: target_num=230001, modeled_next_num=230004.706081
PruneSizeModel: Iter 7: threshold=0.567989422411, num_xgrams=239591

it says:

PruneSizeModel: Iter 7: threshold=0.567989422411, num_xgrams=239591 [vs. intermediate target 230001]

[note: no need to print out modeled_next_num unless it's far from the target_num; then you can print it as a warning.]

This will make it easier for us to judge how to tune the parameters of the model. Try to tune the parameters (those constants, -1.0 and 1.5) so that sometimes the observed number is higher, and sometimes lower, than the intermediate target. Do this on a couple of setups.

Dan
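[Editor's note: the requested log format could look roughly like the sketch below; the function name and the warning tolerance are made up for illustration.]

```python
def log_iter(iter_num, threshold, num_xgrams, target_num,
             modeled_next_num, warn_tolerance=0.01):
    """Illustrative sketch of the requested format: report the observed
    size against the intermediate target on one line, and warn only
    when the model's prediction strayed far from the target."""
    msg = ("PruneSizeModel: Iter %d: threshold=%.12g, num_xgrams=%d "
           "[vs. intermediate target %d]"
           % (iter_num, threshold, num_xgrams, target_num))
    if abs(modeled_next_num - target_num) > warn_tolerance * target_num:
        msg += ("\nPruneSizeModel: Warning: modeled_next_num=%.1f is far "
                "from target_num=%.1f" % (modeled_next_num, target_num))
    return msg
```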

danpovey commented 8 years ago

I am not reviewing this in detail. Let me know when you think it's stable and ready to merge.

wantee commented 8 years ago

Sure, I still need to run some tests on different setups and tune the parameters.

wantee commented 8 years ago

For high-order models, the resulting num_xgrams in the first several iterations seems only weakly related to the threshold.

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=0.250, num_xgrams=3796942 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=1.889, num_xgrams=1665042 [vs. intermediate_target=949235]
PruneSizeModel: Iter 3: threshold=10.007, num_xgrams=398859 [vs. intermediate_target=465249]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=0.500, num_xgrams=3789398 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=3.774, num_xgrams=1657940 [vs. intermediate_target=947349]
PruneSizeModel: Iter 3: threshold=19.895, num_xgrams=392239 [vs. intermediate_target=464256]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=1.000, num_xgrams=3783774 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=7.542, num_xgrams=1654208 [vs. intermediate_target=945943]
PruneSizeModel: Iter 3: threshold=39.664, num_xgrams=388978 [vs. intermediate_target=463733]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=1.500, num_xgrams=3781468 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=11.310, num_xgrams=1653113 [vs. intermediate_target=945367]
PruneSizeModel: Iter 3: threshold=59.442, num_xgrams=387952 [vs. intermediate_target=463579]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=2.500, num_xgrams=3779375 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=18.843, num_xgrams=1652204 [vs. intermediate_target=944843]
PruneSizeModel: Iter 3: threshold=98.995, num_xgrams=387181 [vs. intermediate_target=463452]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=3.500, num_xgrams=3778248 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=26.376, num_xgrams=1651831 [vs. intermediate_target=944562]
PruneSizeModel: Iter 3: threshold=138.548, num_xgrams=386877 [vs. intermediate_target=463400]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=4.500, num_xgrams=3777633 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=33.909, num_xgrams=1651673 [vs. intermediate_target=944408]
PruneSizeModel: Iter 3: threshold=178.110, num_xgrams=386725 [vs. intermediate_target=463377]

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=5.000, num_xgrams=3777391 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=37.676, num_xgrams=1651605 [vs. intermediate_target=944347]
PruneSizeModel: Iter 3: threshold=197.888, num_xgrams=386660 [vs. intermediate_target=463368]

It is hard to predict next_num_xgrams from the threshold and prev_num_xgrams. Any suggestions?

danpovey commented 8 years ago

I think what's going on here is that many of the n-grams are 'protected', meaning that they cannot be pruned because they lead to a state that still has n-grams coming out of it. That means that regardless of the threshold, they will not be pruned. This means that the model is inaccurate, but it may not matter in terms of the success of the search strategy overall.

Dan
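[Editor's note: the 'protected' notion Dan describes can be illustrated with a toy check; the set-of-tuples representation and names below are hypothetical, and pocolm's actual data structures differ.]

```python
def protected_ngrams(ngrams):
    """An n-gram is 'protected' if it is itself the history of some
    higher-order n-gram, i.e. it leads to a state that still has
    n-grams coming out of it, so no threshold can prune it.
    `ngrams` is a set of word tuples."""
    histories = {ng[:-1] for ng in ngrams}      # states with outgoing n-grams
    return {ng for ng in ngrams if ng in histories}
```

Under this view, a large threshold-insensitive core of the model is exactly the protected set, which explains the flat num_xgrams curves above.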

danpovey commented 8 years ago

It might be a good idea to introduce a max-threshold-change-factor parameter (have it set to 4 for now) that will limit how much we allow the threshold to increase from iteration to iteration. Then when you do the binary search for the threshold, you can limit that search to the range [current-threshold, max_threshold_change_factor * current-threshold]. I think this will limit how much it overshoots.

Dan
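[Editor's note: the bounded search Dan describes might be sketched as below; the `count_at` callback and other names are illustrative, not pocolm's API.]

```python
def bounded_threshold_search(cur_threshold, target_num, count_at,
                             max_threshold_change_factor=4.0,
                             num_steps=30):
    """Binary search for the threshold that hits target_num surviving
    n-grams, restricted to [cur_threshold, factor * cur_threshold] as
    suggested, to limit overshoot.  `count_at` maps a threshold to the
    (monotonically non-increasing) surviving n-gram count."""
    lo, hi = cur_threshold, max_threshold_change_factor * cur_threshold
    for _ in range(num_steps):
        mid = 0.5 * (lo + hi)
        if count_at(mid) > target_num:
            lo = mid   # still too many n-grams: raise the threshold
        else:
            hi = mid
    return hi
```

If the target is unreachable within the bounded range, the search simply returns the capped upper bound, so the threshold grows by at most the factor per iteration.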

danpovey commented 8 years ago

BTW, if you do this, you'll have to change the messages so that they print the intermediate target and the model's prediction separately, because now they might be different.

wantee commented 8 years ago

I ran many tests over the past few days, but couldn't find a set of parameters that works well on all models. I often encounter a situation like this:

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=0.250, num_xgrams=3796942 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=0.841, num_xgrams=1679931 [vs. intermediate_target=949235]
PruneSizeModel: Iter 3: threshold=1.732, num_xgrams=443029 [vs. intermediate_target=621599]
PruneSizeModel: Iter 4: threshold=1.732, num_xgrams=126822 [vs. intermediate_target=319213]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 5: threshold=1.121, num_xgrams=467478 [vs. intermediate_target=621599]
PruneSizeModel: Iter 6: threshold=1.121, num_xgrams=174346 [vs. intermediate_target=327903]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 7: threshold=0.980, num_xgrams=477142 [vs. intermediate_target=621599]
PruneSizeModel: Iter 8: threshold=0.980, num_xgrams=193547 [vs. intermediate_target=331275]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 9: threshold=0.920, num_xgrams=482047 [vs. intermediate_target=621599]
PruneSizeModel: Iter 10: threshold=0.920, num_xgrams=203109 [vs. intermediate_target=332973]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 11: threshold=0.888, num_xgrams=484734 [vs. intermediate_target=621599]
PruneSizeModel: Iter 12: threshold=0.888, num_xgrams=208708 [vs. intermediate_target=333900]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 13: threshold=0.870, num_xgrams=486409 [vs. intermediate_target=621599]
PruneSizeModel: Iter 14: threshold=0.870, num_xgrams=211983 [vs. intermediate_target=334476]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 15: threshold=0.860, num_xgrams=487418 [vs. intermediate_target=621599]
PruneSizeModel: Iter 16: threshold=0.860, num_xgrams=213984 [vs. intermediate_target=334823]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 17: threshold=0.853, num_xgrams=488088 [vs. intermediate_target=621599]
PruneSizeModel: Iter 18: threshold=0.853, num_xgrams=215235 [vs. intermediate_target=335053]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 19: threshold=0.849, num_xgrams=488496 [vs. intermediate_target=621599]
PruneSizeModel: Iter 20: threshold=0.849, num_xgrams=216076 [vs. intermediate_target=335193]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 21: threshold=0.846, num_xgrams=488786 [vs. intermediate_target=621599]
PruneSizeModel: Iter 22: threshold=0.846, num_xgrams=216630 [vs. intermediate_target=335292]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 23: threshold=0.844, num_xgrams=488937 [vs. intermediate_target=621599]
PruneSizeModel: Iter 24: threshold=0.844, num_xgrams=216961 [vs. intermediate_target=335344]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 25: threshold=0.843, num_xgrams=489046 [vs. intermediate_target=621599]
PruneSizeModel: Iter 26: threshold=0.843, num_xgrams=217178 [vs. intermediate_target=335381]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 27: threshold=0.843, num_xgrams=489110 [vs. intermediate_target=621599]
PruneSizeModel: Iter 28: threshold=0.843, num_xgrams=217307 [vs. intermediate_target=335403]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 29: threshold=0.842, num_xgrams=489162 [vs. intermediate_target=621599]
PruneSizeModel: Iter 30: threshold=0.842, num_xgrams=217390 [vs. intermediate_target=335421]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 31: threshold=0.842, num_xgrams=489188 [vs. intermediate_target=621599]
PruneSizeModel: Iter 32: threshold=0.842, num_xgrams=217445 [vs. intermediate_target=335430]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 33: threshold=0.842, num_xgrams=489203 [vs. intermediate_target=621599]
PruneSizeModel: Iter 34: threshold=0.842, num_xgrams=217476 [vs. intermediate_target=335435]
PruneSizeModel: Backtrack to iter 2
PruneSizeModel: Iter 35: threshold=0.841, num_xgrams=489224 [vs. intermediate_target=621599]
PruneSizeModel: Iter 36: threshold=0.841, num_xgrams=217518 [vs. intermediate_target=335443]

It's like getting stuck in an infinite loop. I think the solution here is to let it backtrack to iter 1. I tried to adjust the backtracking mechanism, but failed. Do you have any suggestions?

wantee commented 8 years ago

After adjusting the prev_change_power, it converged much faster than before!

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=0.250, num_xgrams=3796942 [vs. intermediate_target=0]
PruneSizeModel: Iter 2: threshold=0.772, num_xgrams=1682526 [vs. intermediate_target=949235]
PruneSizeModel: Iter 3: threshold=1.389, num_xgrams=454390 [vs. intermediate_target=622079]
PruneSizeModel: Iter 4: threshold=1.389, num_xgrams=148462 [vs. intermediate_target=323280]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.2, prev_change_power=0.6
PruneSizeModel: Iter 5: threshold=1.177, num_xgrams=464558 [vs. intermediate_target=622079]
PruneSizeModel: Iter 6: threshold=1.177, num_xgrams=168175 [vs. intermediate_target=326877]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.44, prev_change_power=0.72
PruneSizeModel: Iter 7: threshold=1.025, num_xgrams=474046 [vs. intermediate_target=622079]
PruneSizeModel: Iter 8: threshold=1.025, num_xgrams=186924 [vs. intermediate_target=330198]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.728, prev_change_power=0.864
PruneSizeModel: Iter 9: threshold=0.914, num_xgrams=482782 [vs. intermediate_target=622079]
PruneSizeModel: Iter 10: threshold=0.914, num_xgrams=204184 [vs. intermediate_target=333227]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-2.0736, prev_change_power=1.0
PruneSizeModel: Iter 11: threshold=0.842, num_xgrams=489507 [vs. intermediate_target=622079]
PruneSizeModel: Iter 12: threshold=0.842, num_xgrams=217593 [vs. intermediate_target=335540]
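[Editor's note: the adjustment visible in these logs (both exponents scaled by 1.2 on each backtrack, with prev_change_power capped at 1.0) can be sketched as below; the factor is inferred from the printed values, and the exact rule in prune_size_model.py may differ.]

```python
def backtrack_powers(xgrams_change_power, prev_change_power, factor=1.2):
    """On each backtrack, scale both exponents by `factor` to make the
    size model more conservative, capping prev_change_power at 1.0.
    Sketch of the behavior shown in the logs, not the exact code."""
    return (xgrams_change_power * factor,
            min(prev_change_power * factor, 1.0))
```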

wantee commented 8 years ago

After I added self.max_threshold_change_factor and changed some parameters, it can sometimes converge very slowly again.

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=0.250, num_xgrams=3796942 [vs. modeled_next_num_xgrams=0, intermediate_target=0]
PruneSizeModel: Iter 2: threshold=1.000, num_xgrams=1675370 [vs. modeled_next_num_xgrams=1072121, intermediate_target=949235]
PruneSizeModel: Iter 3: threshold=2.545, num_xgrams=427284 [vs. modeled_next_num_xgrams=620772, intermediate_target=620755]
PruneSizeModel: Iter 4: threshold=2.545, num_xgrams=96274 [vs. modeled_next_num_xgrams=313489, intermediate_target=313489]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-0.96, prev_change_power=0.36
PruneSizeModel: Iter 5: threshold=2.070, num_xgrams=434964 [vs. modeled_next_num_xgrams=620758, intermediate_target=620755]
PruneSizeModel: Iter 6: threshold=2.070, num_xgrams=111855 [vs. modeled_next_num_xgrams=316294, intermediate_target=316294]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.152, prev_change_power=0.432
PruneSizeModel: Iter 7: threshold=1.742, num_xgrams=442250 [vs. modeled_next_num_xgrams=620783, intermediate_target=620755]
PruneSizeModel: Iter 8: threshold=1.742, num_xgrams=126191 [vs. modeled_next_num_xgrams=318932, intermediate_target=318932]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.3824, prev_change_power=0.5184
PruneSizeModel: Iter 9: threshold=1.509, num_xgrams=449143 [vs. modeled_next_num_xgrams=620798, intermediate_target=620755]
PruneSizeModel: Iter 10: threshold=1.509, num_xgrams=139666 [vs. modeled_next_num_xgrams=321408, intermediate_target=321408]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.65888, prev_change_power=0.62208
PruneSizeModel: Iter 11: threshold=1.339, num_xgrams=455818 [vs. modeled_next_num_xgrams=620788, intermediate_target=620755]
PruneSizeModel: Iter 12: threshold=1.339, num_xgrams=152439 [vs. modeled_next_num_xgrams=323787, intermediate_target=323787]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.990656, prev_change_power=0.746496
PruneSizeModel: Iter 13: threshold=1.212, num_xgrams=461844 [vs. modeled_next_num_xgrams=620782, intermediate_target=620755]
PruneSizeModel: Iter 14: threshold=1.212, num_xgrams=164323 [vs. modeled_next_num_xgrams=325921, intermediate_target=325921]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-2.3887872, prev_change_power=0.8957952
PruneSizeModel: Iter 15: threshold=1.115, num_xgrams=467329 [vs. modeled_next_num_xgrams=620838, intermediate_target=620755]
PruneSizeModel: Iter 16: threshold=1.115, num_xgrams=174885 [vs. modeled_next_num_xgrams=327850, intermediate_target=327850]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-2.86654464, prev_change_power=1.0
PruneSizeModel: Iter 17: threshold=1.063, num_xgrams=470658 [vs. modeled_next_num_xgrams=620807, intermediate_target=620755]
PruneSizeModel: Iter 18: threshold=1.063, num_xgrams=181415 [vs. modeled_next_num_xgrams=329016, intermediate_target=329016]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-3.439853568, prev_change_power=1.0
PruneSizeModel: Iter 19: threshold=1.052, num_xgrams=471335 [vs. modeled_next_num_xgrams=620760, intermediate_target=620755]
PruneSizeModel: Iter 20: threshold=1.052, num_xgrams=182837 [vs. modeled_next_num_xgrams=329252, intermediate_target=329252]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-4.1278242816, prev_change_power=1.0
PruneSizeModel: Iter 21: threshold=1.043, num_xgrams=471925 [vs. modeled_next_num_xgrams=620793, intermediate_target=620755]
PruneSizeModel: Iter 22: threshold=1.043, num_xgrams=184058 [vs. modeled_next_num_xgrams=329458, intermediate_target=329458]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-4.95338913792, prev_change_power=1.0
PruneSizeModel: Iter 23: threshold=1.036, num_xgrams=472438 [vs. modeled_next_num_xgrams=620777, intermediate_target=620755]
PruneSizeModel: Iter 24: threshold=1.036, num_xgrams=185083 [vs. modeled_next_num_xgrams=329638, intermediate_target=329638]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-5.9440669655, prev_change_power=1.0
PruneSizeModel: Iter 25: threshold=1.030, num_xgrams=472875 [vs. modeled_next_num_xgrams=621008, intermediate_target=620755]
PruneSizeModel: Iter 26: threshold=1.030, num_xgrams=185954 [vs. modeled_next_num_xgrams=329790, intermediate_target=329790]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-7.1328803586, prev_change_power=1.0
PruneSizeModel: Iter 27: threshold=1.025, num_xgrams=473258 [vs. modeled_next_num_xgrams=621075, intermediate_target=620755]
PruneSizeModel: Iter 28: threshold=1.025, num_xgrams=186724 [vs. modeled_next_num_xgrams=329923, intermediate_target=329923]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-8.55945643033, prev_change_power=1.0
PruneSizeModel: Iter 29: threshold=1.021, num_xgrams=473527 [vs. modeled_next_num_xgrams=620856, intermediate_target=620755]
PruneSizeModel: Iter 30: threshold=1.021, num_xgrams=187304 [vs. modeled_next_num_xgrams=330017, intermediate_target=330017]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-10.2713477164, prev_change_power=1.0
PruneSizeModel: Iter 31: threshold=1.017, num_xgrams=473791 [vs. modeled_next_num_xgrams=620959, intermediate_target=620755]
PruneSizeModel: Iter 32: threshold=1.017, num_xgrams=187834 [vs. modeled_next_num_xgrams=330109, intermediate_target=330109]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-12.3256172597, prev_change_power=1.0
PruneSizeModel: Iter 33: threshold=1.014, num_xgrams=473978 [vs. modeled_next_num_xgrams=621383, intermediate_target=620755]
PruneSizeModel: Iter 34: threshold=1.014, num_xgrams=188238 [vs. modeled_next_num_xgrams=330174, intermediate_target=330174]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-14.7907407116, prev_change_power=1.0
PruneSizeModel: Iter 35: threshold=1.012, num_xgrams=474147 [vs. modeled_next_num_xgrams=621395, intermediate_target=620755]
PruneSizeModel: Iter 36: threshold=1.012, num_xgrams=188567 [vs. modeled_next_num_xgrams=330233, intermediate_target=330233]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-17.7488888539, prev_change_power=1.0
PruneSizeModel: Iter 37: threshold=1.010, num_xgrams=474282 [vs. modeled_next_num_xgrams=620790, intermediate_target=620755]
PruneSizeModel: Iter 38: threshold=1.010, num_xgrams=188835 [vs. modeled_next_num_xgrams=330280, intermediate_target=330280]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-21.2986666247, prev_change_power=1.0
PruneSizeModel: Iter 39: threshold=1.008, num_xgrams=474415 [vs. modeled_next_num_xgrams=621903, intermediate_target=620755]
PruneSizeModel: Iter 40: threshold=1.008, num_xgrams=189083 [vs. modeled_next_num_xgrams=330326, intermediate_target=330326]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-25.5583999496, prev_change_power=1.0
PruneSizeModel: Iter 41: threshold=1.007, num_xgrams=474522 [vs. modeled_next_num_xgrams=622071, intermediate_target=620755]
PruneSizeModel: Iter 42: threshold=1.007, num_xgrams=189303 [vs. modeled_next_num_xgrams=330364, intermediate_target=330364]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-30.6700799396, prev_change_power=1.0
PruneSizeModel: Iter 43: threshold=1.006, num_xgrams=474609 [vs. modeled_next_num_xgrams=621432, intermediate_target=620755]
PruneSizeModel: Iter 44: threshold=1.006, num_xgrams=189464 [vs. modeled_next_num_xgrams=330394, intermediate_target=330394]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-36.8040959275, prev_change_power=1.0
PruneSizeModel: Iter 45: threshold=1.005, num_xgrams=474676 [vs. modeled_next_num_xgrams=622772, intermediate_target=620755]
PruneSizeModel: Iter 46: threshold=1.005, num_xgrams=189591 [vs. modeled_next_num_xgrams=330417, intermediate_target=330417]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-44.164915113, prev_change_power=1.0
PruneSizeModel: Iter 47: threshold=1.004, num_xgrams=474730 [vs. modeled_next_num_xgrams=621478, intermediate_target=620755]
PruneSizeModel: Iter 48: threshold=1.004, num_xgrams=189709 [vs. modeled_next_num_xgrams=330436, intermediate_target=330436]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-52.9978981356, prev_change_power=1.0
PruneSizeModel: Iter 49: threshold=1.003, num_xgrams=474778 [vs. modeled_next_num_xgrams=620942, intermediate_target=620755]
PruneSizeModel: Iter 50: threshold=1.003, num_xgrams=189803 [vs. modeled_next_num_xgrams=330453, intermediate_target=330453]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-63.5974777627, prev_change_power=1.0
PruneSizeModel: Iter 51: threshold=1.003, num_xgrams=474819 [vs. modeled_next_num_xgrams=620913, intermediate_target=620755]

Do we need to relax the limit on prev_change_power?
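From the printed values, the backtrack update looks like a geometric scaling of both powers by 1.2, with prev_change_power clamped at 1.0 — that clamp is the limit in question. The following is my reading of the logs as a small sketch, not the actual prune_size_model.py code:

```python
def backtrack_update(xgrams_change_power, prev_change_power,
                     factor=1.2, cap=True):
    """One backtrack step as the logged values suggest: both powers are
    scaled by `factor`, and prev_change_power is optionally clamped to
    1.0 (the limit under discussion)."""
    xgrams_change_power *= factor
    prev_change_power *= factor
    if cap:
        prev_change_power = min(prev_change_power, 1.0)
    return xgrams_change_power, prev_change_power

# Starting from the values at the first backtrack (-0.96, 0.36), six
# further backtracks reproduce the logged sequence.
x, p = -0.96, 0.36
for _ in range(6):
    x, p = backtrack_update(x, p)                   # capped: p saturates at 1.0
x2, p2 = -0.96, 0.36
for _ in range(6):
    x2, p2 = backtrack_update(x2, p2, cap=False)    # uncapped: p2 ~ 1.0750
```

With the cap, prev_change_power gets pinned at 1.0 from the sixth backtrack on, which matches the long run of `prev_change_power=1.0` lines above; without it, the powers keep growing geometrically.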

wantee commented 8 years ago

The following log was produced by removing the 1.0 upper bound on prev_change_power, with all other parameters the same as in the last post.

PruneSizeModel: Iter 0: threshold=0.000, num_xgrams=6376767
PruneSizeModel: Iter 1: threshold=0.250, num_xgrams=3796942 [vs. modeled_next_num_xgrams=0, intermediate_target=0]
PruneSizeModel: Iter 2: threshold=1.000, num_xgrams=1675370 [vs. modeled_next_num_xgrams=1072121, intermediate_target=949235]
PruneSizeModel: Iter 3: threshold=2.545, num_xgrams=427284 [vs. modeled_next_num_xgrams=620772, intermediate_target=620755]
PruneSizeModel: Iter 4: threshold=2.545, num_xgrams=96274 [vs. modeled_next_num_xgrams=313489, intermediate_target=313489]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-0.96, prev_change_power=0.36
PruneSizeModel: Iter 5: threshold=2.070, num_xgrams=434964 [vs. modeled_next_num_xgrams=620758, intermediate_target=620755]
PruneSizeModel: Iter 6: threshold=2.070, num_xgrams=111855 [vs. modeled_next_num_xgrams=316294, intermediate_target=316294]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.152, prev_change_power=0.432
PruneSizeModel: Iter 7: threshold=1.742, num_xgrams=442250 [vs. modeled_next_num_xgrams=620783, intermediate_target=620755]
PruneSizeModel: Iter 8: threshold=1.742, num_xgrams=126191 [vs. modeled_next_num_xgrams=318932, intermediate_target=318932]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.3824, prev_change_power=0.5184
PruneSizeModel: Iter 9: threshold=1.509, num_xgrams=449143 [vs. modeled_next_num_xgrams=620798, intermediate_target=620755]
PruneSizeModel: Iter 10: threshold=1.509, num_xgrams=139666 [vs. modeled_next_num_xgrams=321408, intermediate_target=321408]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.65888, prev_change_power=0.62208
PruneSizeModel: Iter 11: threshold=1.339, num_xgrams=455818 [vs. modeled_next_num_xgrams=620788, intermediate_target=620755]
PruneSizeModel: Iter 12: threshold=1.339, num_xgrams=152439 [vs. modeled_next_num_xgrams=323787, intermediate_target=323787]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-1.990656, prev_change_power=0.746496
PruneSizeModel: Iter 13: threshold=1.212, num_xgrams=461844 [vs. modeled_next_num_xgrams=620782, intermediate_target=620755]
PruneSizeModel: Iter 14: threshold=1.212, num_xgrams=164323 [vs. modeled_next_num_xgrams=325921, intermediate_target=325921]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-2.3887872, prev_change_power=0.8957952
PruneSizeModel: Iter 15: threshold=1.115, num_xgrams=467329 [vs. modeled_next_num_xgrams=620838, intermediate_target=620755]
PruneSizeModel: Iter 16: threshold=1.115, num_xgrams=174885 [vs. modeled_next_num_xgrams=327850, intermediate_target=327850]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-2.86654464, prev_change_power=1.07495424
PruneSizeModel: Iter 17: threshold=1.040, num_xgrams=472115 [vs. modeled_next_num_xgrams=620852, intermediate_target=620755]
PruneSizeModel: Iter 18: threshold=1.040, num_xgrams=184466 [vs. modeled_next_num_xgrams=329525, intermediate_target=329525]
PruneSizeModel: Backtrack to iter: 2, xgrams_change_power=-3.439853568, prev_change_power=1.289945088
PruneSizeModel: Iter 19: threshold=1.000, num_xgrams=475024 [vs. modeled_next_num_xgrams=620755, intermediate_target=620755]
PruneSizeModel: Iter 20: threshold=1.000, num_xgrams=190313 [vs. modeled_next_num_xgrams=330538, intermediate_target=330538]
PruneSizeModel: Backtrack to iter: 1, xgrams_change_power=-4.1278242816, prev_change_power=1.5479341056
PruneSizeModel: Iter 21: threshold=0.288, num_xgrams=1729129 [vs. modeled_next_num_xgrams=949299, intermediate_target=949235]
PruneSizeModel: Iter 22: threshold=0.288, num_xgrams=653202 [vs. modeled_next_num_xgrams=630635, intermediate_target=630635]
PruneSizeModel: Iter 23: threshold=0.288, num_xgrams=522828 [vs. modeled_next_num_xgrams=387604, intermediate_target=387604]
PruneSizeModel: Iter 24: threshold=0.293, num_xgrams=509862 [vs. modeled_next_num_xgrams=346894, intermediate_target=346772]
PruneSizeModel: Iter 25: threshold=0.319, num_xgrams=472758 [vs. modeled_next_num_xgrams=342556, intermediate_target=342445]
PruneSizeModel: Iter 26: threshold=0.339, num_xgrams=443892 [vs. modeled_next_num_xgrams=329765, intermediate_target=329749]
PruneSizeModel: Iter 27: threshold=0.358, num_xgrams=421931 [vs. modeled_next_num_xgrams=319559, intermediate_target=319524]
PruneSizeModel: Iter 28: threshold=0.378, num_xgrams=396844 [vs. modeled_next_num_xgrams=311569, intermediate_target=311519]
PruneSizeModel: Iter 29: threshold=0.395, num_xgrams=380831 [vs. modeled_next_num_xgrams=302212, intermediate_target=302116]
PruneSizeModel: Iter 30: threshold=0.413, num_xgrams=365176 [vs. modeled_next_num_xgrams=295961, intermediate_target=295958]
PruneSizeModel: Iter 31: threshold=0.430, num_xgrams=351830 [vs. modeled_next_num_xgrams=289883, intermediate_target=289811]
PruneSizeModel: Iter 32: threshold=0.446, num_xgrams=339534 [vs. modeled_next_num_xgrams=284536, intermediate_target=284466]
PruneSizeModel: Iter 33: threshold=0.469, num_xgrams=325012 [vs. modeled_next_num_xgrams=261983, intermediate_target=261955]
PruneSizeModel: Iter 34: threshold=0.488, num_xgrams=309461 [vs. modeled_next_num_xgrams=258244, intermediate_target=258158]
PruneSizeModel: Iter 35: threshold=0.502, num_xgrams=299869 [vs. modeled_next_num_xgrams=254052, intermediate_target=253965]
PruneSizeModel: Iter 36: threshold=0.518, num_xgrams=291636 [vs. modeled_next_num_xgrams=251382, intermediate_target=251308]
PruneSizeModel: Iter 37: threshold=0.533, num_xgrams=284547 [vs. modeled_next_num_xgrams=249043, intermediate_target=248982]
PruneSizeModel: Iter 38: threshold=0.546, num_xgrams=277904 [vs. modeled_next_num_xgrams=247010, intermediate_target=246944]
PruneSizeModel: Iter 39: threshold=0.558, num_xgrams=272634 [vs. modeled_next_num_xgrams=245025, intermediate_target=245003]
PruneSizeModel: Iter 40: threshold=0.570, num_xgrams=267758 [vs. modeled_next_num_xgrams=243488, intermediate_target=243442]
PruneSizeModel: Iter 41: threshold=0.580, num_xgrams=263776 [vs. modeled_next_num_xgrams=242040, intermediate_target=241979]
PruneSizeModel: Iter 42: threshold=0.596, num_xgrams=257986 [vs. modeled_next_num_xgrams=230033, intermediate_target=230001]
PruneSizeModel: Iter 43: threshold=0.608, num_xgrams=253621 [vs. modeled_next_num_xgrams=230001, intermediate_target=230001]
PruneSizeModel: Iter 44: threshold=0.618, num_xgrams=249879 [vs. modeled_next_num_xgrams=230037, intermediate_target=230001]
PruneSizeModel: Iter 45: threshold=0.627, num_xgrams=246430 [vs. modeled_next_num_xgrams=230053, intermediate_target=230001]
PruneSizeModel: Iter 46: threshold=0.635, num_xgrams=243949 [vs. modeled_next_num_xgrams=230032, intermediate_target=230001]
PruneSizeModel: Iter 47: threshold=0.641, num_xgrams=242016 [vs. modeled_next_num_xgrams=230082, intermediate_target=230001]

I paste this as an example to illustrate my idea of making the model parameters local variables for each iteration instead of global variables. As can be seen, we backtrack to iter 2 eight times, and then backtrack to iter 1. At that point, the model has been adjusted to be much less aggressive, which leads to very slow convergence from iter 1.

It should not be a big problem if we never backtrack too many times, but I think it is hard to guarantee that our hard-coded parameters would work well on all models.
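To make the backtracking behaviour concrete, here is a toy version of such a search loop — my own simplified reconstruction, where `count_fn` and the power-law "model" below are stand-ins, not pocolm internals:

```python
def toy_threshold_search(count_fn, target, t=0.25, step=2.0,
                         tol=0.005, max_iters=100):
    """Toy threshold search: raise the threshold while too many x-grams
    survive; when a step overshoots (the count drops below the target),
    back up to the last good threshold and soften the step, analogous
    to the backtracking in the logs above."""
    last_good = t
    for _ in range(max_iters):
        n = count_fn(t)
        if abs(n - target) <= tol * target:
            return t
        if n < target:            # overshot the target size: backtrack
            step = step ** 0.5    # take a less aggressive step next time
            t = last_good * step
        else:                     # not pruned enough yet: accept and push on
            last_good = t
            t *= step
    return t

# A crude stand-in "model": the x-gram count decays as the threshold grows.
count = lambda t: int(2_000_000 / (1.0 + 10.0 * t))
t = toy_threshold_search(count, target=330_000)
```

Because the step factor only shrinks here, the toy loop cannot oscillate the way the real search does; the point is just to show the accept/backtrack/soften pattern.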

danpovey commented 8 years ago

@wantee, am I right that you are still working on this and it's not ready for review? The mechanism I mentioned, where once you backtrack you always repeat the previous threshold (unless it has been repeated before), should, I think, make it converge much faster.
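A minimal sketch of that repeat-the-previous-threshold rule, assuming a hypothetical `propose` callback standing in for the size model's update:

```python
def choose_threshold(prev_threshold, backtracked, already_repeated, propose):
    """After a backtrack, try the previous threshold once more (unless it
    has already been repeated); otherwise ask the size model, represented
    here by the `propose` callback, for a new threshold."""
    if backtracked and not already_repeated:
        return prev_threshold, True           # repeat the old threshold once
    return propose(prev_threshold), already_repeated

halve = lambda t: t / 2                       # hypothetical size-model update
t1, rep = choose_threshold(1.5, backtracked=True, already_repeated=False,
                           propose=halve)     # -> (1.5, True)
t2, _ = choose_threshold(1.5, backtracked=True, already_repeated=True,
                         propose=halve)       # -> (0.75, True)
```

Repeating the old threshold once gives the (re-fitted, less aggressive) model a chance to confirm or correct its prediction at a known point before it proposes a new threshold, which is presumably why it cuts down the number of iterations.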

wantee commented 8 years ago

Yes, I'm still working on this. I haven't tried the mechanism because I couldn't access the code in my private branch during the weekend. I will try it later today.

wantee commented 8 years ago

Hi Dan, after I added the mechanism you proposed last time, it converges much faster. I tried six models (the default 3-, 4- and 5-gram models plus three others I trained with half of the training data) with several different target sizes on swbd; they all converge in fewer than 10 iterations. If this is acceptable, I think you can review this PR now.

danpovey commented 8 years ago

Yes that sounds good. I'll review it now.


danpovey commented 8 years ago

Actually, I don't think I have time to look at the code in detail right now, but it's clear that this is better than what was there before, so I'll merge it; if I look at the code later, I may make detailed comments then.