Open dy1998 opened 2 years ago
Hello, I wonder why output and soft_feat_aug are divided by args.temp when computing the CE loss?