Below, self.temperature is used in two different ways: the logits are divided by it, while the targets are multiplied by it. Is this intentional, or is it a mathematical mistake?
Yes, this is a mistake. Good catch. I have now fixed the code. Luckily, this shouldn't affect the results, since the temperature was set to 1 for all experiments. Thanks for this.
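For anyone reading later, here is a minimal sketch of why the mismatch is harmless at temperature 1 but not otherwise. It is not the repository's code: the function names and the example scores are hypothetical, and it assumes both the logits and the targets are passed through a softmax, with dividing by the temperature as the consistent form.

```python
import math


def softmax(scores, temperature=1.0):
    """Softmax of scores / temperature, shifted by the max for stability."""
    scaled = [s / temperature for s in scores]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_mismatched(scores, temperature=1.0):
    """Softmax of scores * temperature: the inconsistent scaling asked about."""
    return softmax([s * temperature for s in scores], 1.0)


# Hypothetical scores, used only for illustration.
scores = [2.0, 1.0, 0.1]

# At temperature 1, dividing and multiplying both leave the scores unchanged.
same_at_one = softmax(scores, 1.0) == softmax_mismatched(scores, 1.0)

# At any other temperature, one version softens the distribution
# while the other sharpens it.
differ_at_two = softmax(scores, 2.0) != softmax_mismatched(scores, 2.0)
```

With these definitions, same_at_one and differ_at_two both evaluate to True, which matches the point above that results obtained with temperature 1 are unaffected.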