Sorry for my late reply. For 1, we will release a version with this parameter in the next release. For 2, the notation h is not explicitly explained in the paper: in the figures it denotes the hidden layer, while in the equations we think it may denote the output layer (as in your question). Since there is only a linear transformation between the two,
```python
def create_output_unit(self, params):
    # Output projection: maps the LSTM hidden state to vocabulary logits.
    self.Wo = tf.Variable(self.init_matrix([self.hidden_dim, self.num_vocabulary]))
    self.bo = tf.Variable(self.init_matrix([self.num_vocabulary]))
    params.extend([self.Wo, self.bo])

    def unit(hidden_memory_tuple):
        # hidden_memory_tuple packs the LSTM's (hidden_state, cell_state).
        hidden_state, c_prev = tf.unstack(hidden_memory_tuple)
        # hidden_state: [batch, hidden_dim] -> logits: [batch, num_vocabulary]
        logits = tf.matmul(hidden_state, self.Wo) + self.bo
        return logits

    return unit
```
we think it will not have a huge impact on the final results. For 3, as we explained in our paper, this model cannot generate meaningful sentences in the real-data experiment. The authors of the original paper also did not conduct experiments on natural language.
Hello, I have a few questions about GSGAN.
In your code, the inverse temperature parameter τ (self.tau in GsganGenerator.py) is kept at 10. However, in the original paper, the authors suggest starting with a relatively large τ and then annealing it toward zero during training. What's more, I also don't understand why you add Gumbel-distributed noise before computing the output logits. Could you explain this function in more detail?
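(For concreteness, here is a minimal NumPy sketch of the Gumbel-Softmax sampling and annealing being asked about. This is not the repository's code, and the schedule constants tau_init, tau_min, and decay are illustrative assumptions.)

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, eps=1e-20):
    """Differentiable relaxation of a categorical sample.

    Gumbel noise g = -log(-log(u)), u ~ Uniform(0, 1), is added to the
    logits; argmax(logits + g) would be an exact categorical sample, and
    softmax((logits + g) / tau) relaxes it, sharpening toward one-hot
    as tau -> 0.
    """
    u = np.random.uniform(size=logits.shape)
    g = -np.log(-np.log(u + eps) + eps)
    y = (logits + g) / tau
    y = y - y.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical exponential annealing schedule: start with a relatively
# large tau and decay it toward a small floor during training.
tau_init, tau_min, decay = 5.0, 0.5, 0.9995
logits = np.random.randn(4, 10)  # e.g. [batch, vocab] logits from unit()
for step in range(1000):
    tau = max(tau_min, tau_init * decay ** step)
    sample = gumbel_softmax_sample(logits, tau)
```

As τ shrinks, the softmax output approaches a one-hot vector, which is why the paper suggests annealing it toward zero rather than fixing it.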
Finally, why don't you show the performance of GSGAN?