redpony / cdec

Decoder, aligner, and model optimizer for statistical machine translation and other structured prediction models based on (mostly) context-free formalisms
http://cdec-decoder.org/
Apache License 2.0

mira: exit of parallelize.pl causes weight averaging to run on incomplete models if kbest_cut_mira is still saving weights to disk #39

Closed fhieber closed 10 years ago

fhieber commented 10 years ago

I discovered a minor issue when running mira.py with lots of features on a heavily loaded server. The problem occurs when kbest_cut_mira instances write large models to disk after they have finished processing their stdin and have already returned translations. The pass.X/weights.X files are still being written, but parallelize.pl already exits, because sentserver has received as many lines as it sent out to the kbest_cut_mira clients. When training with lots of sparse features, the call to 'Weights::ShowLargestFeatures(dense_weights);' in kbest_cut_mira.cc is costly and delays model saving. As a result, mira.py continues and calls the average_weights() method on a list of weight files that are not yet fully written to disk. The final model of that iteration is then sometimes missing weights.
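One common way to avoid reading half-written files in a race like this is to write each weights file under a temporary name and rename it into place only once it is complete. The sketch below illustrates that pattern in Python; it is not the fix that was pushed to cdec. The function names and the one-"name value"-pair-per-line file format are assumptions made for the example.

```python
import os
import tempfile


def save_weights_atomically(path, weights):
    """Write weights to a temp file in the target directory, then rename it.

    Hypothetical helper: a reader either sees no file at `path` or a
    complete one, never a partially written one.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            for name, value in sorted(weights.items()):
                # Assumed format: one "feature_name value" pair per line.
                f.write(f"{name} {value}\n")
            f.flush()
            os.fsync(f.fileno())
        # The rename is atomic because the temp file is on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_weights(path):
    """Read a weights file in the assumed "name value" format."""
    weights = {}
    with open(path) as f:
        for line in f:
            name, value = line.split()
            weights[name] = float(value)
    return weights


def average_weight_files(paths):
    """Average weights over files; a feature absent from a file counts as 0."""
    totals = {}
    for path in paths:
        for name, value in load_weights(path).items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: total / len(paths) for name, total in totals.items()}
```

With this pattern, the averaging step still has to wait until every expected file exists, but the presence of a file then implies that it is complete.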

redpony commented 10 years ago

I've pushed a fix for this. Sorry for the confusion! Also, I identified a number of bugs in this mira code (some of which are fairly subtle since they are due to uninitialized variables).