google-research / electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Sampling step? #87

Open anshulsamar opened 3 years ago

anshulsamar commented 3 years ago

Thanks for your documentation and transparency here.

Quick question: in `sample_from_softmax(logits, disallow=None)`, you return:

```python
tf.one_hot(tf.argmax(tf.nn.softmax(logits + gumbel_noise), -1,
                     output_type=tf.int32), logits.shape[-1])
```

Wondering why the `tf.nn.softmax` is needed here if the result is passed through `tf.argmax` anyway. Perhaps it's a holdover from another experiment?

Thanks!
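
For context, the sampling here appears to be the Gumbel-max trick: adding Gumbel(0, 1) noise to the logits and taking the argmax yields an exact sample from the categorical distribution softmax(logits). Since softmax is strictly monotonic, applying it before the argmax cannot change which index wins, so mathematically it is redundant for the sampled index. A minimal TF2 sketch illustrating this (the Gumbel noise construction below follows the standard formula and is an assumption, not copied from the repo):

```python
import tensorflow as tf

# Gumbel-max trick: argmax(logits + g), with g ~ Gumbel(0, 1), is an
# exact sample from the categorical distribution softmax(logits).
logits = tf.constant([[1.0, 2.0, 0.5]])
uniform = tf.random.uniform(tf.shape(logits), minval=1e-9, maxval=1.0)
gumbel_noise = -tf.math.log(-tf.math.log(uniform))

# softmax is strictly monotonic, so inserting it before argmax cannot
# change the winning index; the two results below are always identical.
with_softmax = tf.argmax(
    tf.nn.softmax(logits + gumbel_noise), -1, output_type=tf.int32)
without_softmax = tf.argmax(logits + gumbel_noise, -1, output_type=tf.int32)
print(bool(tf.reduce_all(with_softmax == without_softmax)))  # True
```

So the softmax is a functional no-op for sampling; whether it was left in for readability or as a holdover, as suggested above, would need confirmation from the authors.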

Acejoy commented 9 months ago

Hey, did you find the answer to this?

anshulsamar commented 9 months ago

Hi, I didn't, sorry.