Open vanechu opened 8 years ago

Implements temperature (#22) with a default value of 1. The following description is cited from karpathy/char-rnn.
Looks good!
I think this line is undesirable:
self.probs = tf.nn.softmax(tf.div(self.logits, temperature))
This overwrites self.probs, which looks to be used later (when using the network as a language model), meaning it is impossible to do generation and get probs at the same time without trickery. It may make sense to add a new placeholder for temperature in the model's __init__, which can be fed, and a new temperature-scaled probs tensor, which can be fetched (1). Or it may make sense to do the temperature sampling outside of the TensorFlow graph (2).
(1)

# add new placeholder
def __init__(...):
    ...
    self.temperature = tf.placeholder_with_default(tf.constant(1, dtype=tf.float32), None)
    self.temp_probs = tf.nn.softmax(tf.div(self.logits, self.temperature))
    ...

def sample(...):
    # same as this PR but use self.temp_probs when appropriate
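As a usage illustration (my sketch, not part of the PR; the value 0.5 is arbitrary and `feed` is assumed to be the feed dict sample() already builds), the placeholder could be fed like this:

```python
# hypothetical call site inside sample(); `feed` is built as elsewhere in sample()
feed[self.temperature] = 0.5  # omit this key to fall back to the default of 1.0
probs, state = sess.run([self.temp_probs, self.final_state], feed)
```

Because of placeholder_with_default, existing language-model code that fetches self.probs without feeding a temperature keeps working unchanged.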
(2)

# do temp sampling outside tf graph
# (assumes numpy is imported as np, as elsewhere in the repo)
def __init__(...):
    # same as before

def sample(...):
    ...
    # when appropriate, run the graph to get self.logits
    logits, state = sess.run([self.logits, self.final_state], feed)
    logits = logits[0]
    if temperature == 0.0:
        # temperature 0 means greedy decoding: always take the argmax
        sample = np.argmax(logits)
    else:
        # softmax of temperature-scaled logits; subtract the max
        # before exponentiating for numerical stability
        scale = logits / temperature
        exp = np.exp(scale - np.max(scale))
        soft = exp / np.sum(exp)
        sample = np.random.choice(len(soft), p=soft)
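To make the effect of the scaling concrete, here is a self-contained NumPy sketch (the logits values are illustrative, not from the model) of the same computation as the else branch above; low temperature sharpens the distribution toward the argmax, high temperature flattens it toward uniform:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    # identical to the else branch above
    scale = np.asarray(logits) / temperature
    exp = np.exp(scale - np.max(scale))
    return exp / np.sum(exp)

logits = [2.0, 1.0, 0.1]                        # illustrative values
print(softmax_with_temperature(logits, 1.0))    # ~[0.66, 0.24, 0.10]
print(softmax_with_temperature(logits, 0.5))    # sharper: ~[0.86, 0.12, 0.02]
print(softmax_with_temperature(logits, 2.0))    # flatter: ~[0.50, 0.30, 0.19]
```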
@vanechu This PR has merge conflicts.
@fujimotomh good implementation