Closed: Henry-E closed this issue 6 years ago.
You could simply do random sampling according to the output distribution from the generator.
Thanks for the advice. @sebastianGehrmann also recommended implementing temperature at this line, sampling rather than taking the top-k: https://github.com/OpenNMT/OpenNMT-py/blob/9f4b4d77caf553d317af9b7bba4e7a9634253bf0/onmt/Beam.py#L80 The temperature code from Karpathy's char-rnn was simple enough to adapt: https://github.com/karpathy/char-rnn/blob/6f9487a6fe5b420b7ca9afb0d7c078e37c1d1b4e/sample.lua#L145-L149
It's not ready for a pull request or anything yet, but if anyone is interested, this is the code:

```python
if self.temperature:
    # Rescale the log-probabilities by the temperature, convert back to a
    # normalised probability distribution, and sample instead of taking top-k.
    prediction = flatBeamLk.div(self.temperature)
    probs = prediction.exp()
    norm_probs = probs.div(probs.sum())
    bestScoresId = norm_probs.multinomial(self.size)
    # TODO try the temperature-weighted scores instead
    bestScores = flatBeamLk[bestScoresId]
else:
    bestScores, bestScoresId = flatBeamLk.topk(self.size, 0, True, True)
```
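For anyone who wants to try the idea outside of OpenNMT-py, here is a framework-agnostic sketch of the same sampling step using only the standard library. The function name, the max-subtraction for numerical stability, and the fixed seed are my additions, not part of the snippet above:

```python
import math
import random


def sample_with_temperature(log_scores, k, temperature, rng=random.Random(0)):
    """Sample k indices from a list of log-scores after temperature scaling.

    Mirrors the snippet above: divide the log-scores by the temperature,
    exponentiate, renormalise, then draw k samples (with replacement)
    from the resulting distribution.
    """
    scaled = [s / temperature for s in log_scores]
    # Subtract the max before exp() for numerical stability.
    m = max(scaled)
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    return rng.choices(range(len(log_scores)), weights=probs, k=k)


# A low temperature concentrates mass on the best-scoring index,
# while a high temperature flattens the distribution.
log_scores = [-0.1, -2.0, -3.0, -5.0]
print(sample_with_temperature(log_scores, k=5, temperature=0.2))
print(sample_with_temperature(log_scores, k=5, temperature=1.5))
```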
@Henry-E and @sebastianGehrmann: I tried implementing temperature with different values, but there is no variation in the output. This is where I made the change, in OpenNMT-py/onmt/translate/Beam.py:
```python
if self.temperature:
    prediction = flat_beam_scores.div(self.temperature)
    probs = prediction.exp()
    norm_probs = probs.div(probs.sum())
    best_scores_id = norm_probs.multinomial(self.size)
    best_scores = flat_beam_scores[best_scores_id]
else:
    best_scores, best_scores_id = flat_beam_scores.topk(self.size, 0,
                                                        True, True)

self.all_scores.append(self.scores)
self.scores = best_scores
```
Am I missing a step?
What temperature values did you try? I usually used values between 0.2 and 1.5. It's possible that you've got a really, really confident model. Aside from that, your implementation looks the same as mine. Have you tried stepping through it with a debugger to see what the probabilities for each of the top-k are before and after exponentiation?
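To illustrate the "really confident model" point: if one candidate carries almost all of the log-probability mass, temperature scaling in the usual 0.2–1.5 range barely moves the distribution, so sampling keeps returning the same token. A small self-contained illustration (the numbers are made up for the example):

```python
import math


def softmax_with_temperature(log_scores, temperature):
    """Turn log-scores into a probability distribution at a given temperature."""
    scaled = [s / temperature for s in log_scores]
    m = max(scaled)  # subtract the max before exp() for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]


# A very confident model: the top token carries ~20 nats more
# log-probability than any alternative.
confident = [-0.001, -20.0, -21.0, -22.0]
for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(confident, t)
    # Even at the high end of the usual range, probs[0] stays close to 1,
    # so multinomial sampling almost always picks the same token.
    print(t, round(probs[0], 6))
```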
@Henry-E Thank you for your response. I tried values from as low as 0.1 to as high as 1.8, and I still don't see any variation. I will run it through a debugger.
Also, I trained my model with the parameters mentioned here: http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model, but I still don't see results that are as good. Any thoughts?
You should change the beam_search.py file, not beam.py; that is how I implemented the feature. I will open a PR for the temperature feature in beam search when I have time.
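Until such a PR lands, the switch between deterministic top-k and temperature sampling could look roughly like the sketch below. This is an illustrative, framework-agnostic function, not OpenNMT-py's actual beam_search.py API; all names and the fixed seed are my assumptions:

```python
import math
import random


def pick_beam(flat_beam_scores, beam_size, temperature=None, rng=random.Random(0)):
    """Choose beam_size candidate indices from flattened beam log-scores.

    temperature=None keeps the usual deterministic top-k behaviour;
    otherwise candidates are sampled (with replacement) from the
    temperature-scaled distribution over the log-scores.
    """
    if temperature:
        m = max(flat_beam_scores)  # stabilise exp() by shifting by the max
        weights = [math.exp((s - m) / temperature) for s in flat_beam_scores]
        ids = rng.choices(range(len(flat_beam_scores)),
                          weights=weights, k=beam_size)
    else:
        # Deterministic top-k: indices of the beam_size largest scores.
        ids = sorted(range(len(flat_beam_scores)),
                     key=lambda i: flat_beam_scores[i], reverse=True)[:beam_size]
    scores = [flat_beam_scores[i] for i in ids]
    return scores, ids
```

Keeping the original (untempered) log-scores for the selected candidates, as the earlier snippet does with `flatBeamLk[bestScoresId]`, means the temperature only affects which hypotheses are kept, not how they are ranked afterwards.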
Just looking for a little guidance on whether there's an easy-to-modify part of the code that I could use to generate more random sentences, e.g. by using temperature or by adding randomness to the beam search.