Thanks a lot for releasing such high-quality work. I'm a bit confused about why the softmax_with_temperature operation is performed twice in the code shown in the figure. Why is it necessary to apply it twice, and with different temperature coefficients? Looking forward to your reply.
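For context, this is my understanding of what a single softmax-with-temperature call does, written as a minimal sketch. The function body and the temperature values below are my own assumptions for illustration, not the code from the figure:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Hypothetical re-implementation; the repository's version may differ."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide the logits by the temperature before normalizing.
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up example values
sharp = softmax_with_temperature(logits, temperature=0.5)  # lower temperature
soft = softmax_with_temperature(logits, temperature=5.0)   # higher temperature
```

As far as I can tell, a lower temperature gives a more peaked distribution and a higher temperature gives a flatter one, so I would like to understand what each of the two calls is meant to achieve.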