Open yuxiaoleipeng opened 5 years ago
I ran into this problem, too. Is it related to reproducibility?
As I've explained here, the loss seems to plateau because the reconstruction term has already reached the perfect means. After that point the only thing still being optimized is the KL term, which is of a different order of magnitude, so changes in it barely show up in the total loss. I admit that this loss setup is suboptimal and a bit confusing, so I'm happy to hear any suggestions.
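To make the two terms concrete, here is a minimal NumPy sketch of how a plateau like this can arise. It is not the code from this repository. It assumes a Gaussian likelihood with fixed unit variance and a standard-normal prior, and the image size, latent size, and function names are all made up for illustration. Under those assumptions the reconstruction term cannot go below a constant floor once the predicted means match the data, and the KL term is far smaller than that floor.

```python
import numpy as np

def gaussian_nll(x, mean, sigma=1.0):
    """Negative log-likelihood of x under N(mean, sigma^2), summed over all dimensions."""
    var = sigma ** 2
    return float(np.sum(0.5 * np.log(2 * np.pi * var) + 0.5 * (x - mean) ** 2 / var))

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return float(0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))

rng = np.random.default_rng(0)

# Hypothetical sizes: a flattened 64x64 image and a 16-dimensional latent.
x = rng.random(64 * 64)
mu = rng.normal(size=16) * 0.5
log_var = np.full(16, -1.0)

# "Perfect means": the squared-error part is zero, only the constant remains.
recon_floor = gaussian_nll(x, mean=x)
kl = kl_to_standard_normal(mu, log_var)

print(f"reconstruction floor: {recon_floor:.1f}")
print(f"KL term:              {kl:.1f}")
print(f"total loss:           {recon_floor + kl:.1f}")
```

Logging the two terms separately, as the last three lines do, would show whether the KL term is still improving even though the total looks flat.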
When I train the model, it runs fine, but once training reaches step 200000 the loss drops to 6900 and never goes any lower. I trained it for 370000 steps and it is still at 6900. Have you run into this before?