julianser / hed-dlg-truncated

Hierarchical Encoder Decoder RNN (HRED) with Truncated Backpropagation Through Time (Truncated BPTT)
GNU General Public License v3.0
308 stars 129 forks

patience = -1 and cost_mean is always about 3.5 #16

Closed WenjieWWJ closed 7 years ago

WenjieWWJ commented 7 years ago

I tried to train the VHRED model on the Ubuntu dataset. I have trained it on a GPU for four days, but cost_mean is always 3.5. Today I found patience = -1 and "main: DEBUG: All done, exiting...". Can you give some help? Thanks!

2017-07-19 01:45:53,546: main: DEBUG: [VALID] - Got batch 80,81 valid_cost 23692979.0231 valid_kl_divergence_cost sample 0.0 posterior_mean_variance 0.0
2017-07-19 01:45:53,782: main: DEBUG: [VALID] - Got batch 80,81 valid_cost 23694412.2425 valid_kl_divergence_cost sample 0.0 posterior_mean_variance 0.00710910232738
2017-07-19 01:45:54,011: main: DEBUG: [VALID] - Got batch 80,45 valid_cost 23695200.6772 valid_kl_divergence_cost sample 0.0 posterior_mean_variance 0.0
2017-07-19 01:45:54,147: main: DEBUG: [VALIDATION END] valid cost (NLL) = 18.9642, valid word-perplexity = 172211119.1154, valid kldiv cost (per word) = 0.00000000, valid mean posterior variance (per word) = 0.00017112, patience = -1

julianser commented 7 years ago

Please do not post issues that are not directly related to the GitHub repo itself.

It looks like your training has finished and that your model has reached a validation set perplexity of 18.96. This seems reasonable to me (for example, on Ubuntu our perplexities at convergence were in the range 30-40, depending on preprocessing).
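For readers comparing these numbers: the [VALIDATION END] line in the log reports both a valid cost (NLL) of 18.9642 and a valid word-perplexity of 172211119.1154. A minimal sketch for checking how the two relate is below. It assumes that word-perplexity is the exponential of the per-word NLL measured in nats, which is the usual definition but is not confirmed here against the repository's code. The variable names are illustrative only.

```python
import math

# Values copied from the [VALIDATION END] log line in this thread.
valid_nll = 18.9642
logged_word_perplexity = 172211119.1154

# Assumption: word-perplexity = exp(per-word NLL), with NLL in nats.
perplexity_from_nll = math.exp(valid_nll)
nll_from_perplexity = math.log(logged_word_perplexity)

print(f"exp(NLL)             = {perplexity_from_nll:.4e}")
print(f"log(word-perplexity) = {nll_from_perplexity:.4f}")
```

Under the same assumption, a perplexity in the 30-40 range corresponds to an NLL of roughly 3.4 to 3.7, since log(30) ≈ 3.40 and log(40) ≈ 3.69.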