Open lddsdu opened 3 years ago
It seems that initializing the parameters with randn (https://github.com/graykode/xlnet-Pytorch/blob/cb793a1c75bdc59e3360f04ec641af726719811f/xlnet.py#L119) leads to poor performance. I tried xavier_normal and kaiming_uniform instead, and both reach a much higher AUC and F1 score on my task.
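For anyone who wants to try the same change, here is a minimal sketch of swapping the initialization in place using `torch.nn.init`. The post does not say which parameters or settings were used, so everything here is an assumption: `ToyAttention` is a hypothetical stand-in for a module that creates its weights with `torch.randn`, `reinit_parameters` is an illustrative helper name, and both initializers are called with their default arguments.

```python
import torch
import torch.nn as nn


class ToyAttention(nn.Module):
    """Hypothetical stand-in for a module whose weights are created with randn."""

    def __init__(self, d_model: int = 64, n_head: int = 4, d_head: int = 16):
        super().__init__()
        # randn gives unit variance regardless of the tensor's shape
        self.q_proj_weight = nn.Parameter(torch.randn(d_model, n_head, d_head))
        self.bias = nn.Parameter(torch.zeros(n_head, d_head).sum(dim=0))  # 1-D


def reinit_parameters(module: nn.Module, scheme: str = "xavier_normal") -> None:
    """Re-initialize, in place, every parameter with at least 2 dimensions."""
    for _, param in module.named_parameters():
        if param.dim() < 2:
            # fan-in/fan-out is undefined for 1-D tensors, so leave them alone
            continue
        if scheme == "xavier_normal":
            nn.init.xavier_normal_(param)
        elif scheme == "kaiming_uniform":
            nn.init.kaiming_uniform_(param)
        else:
            raise ValueError(f"unknown scheme: {scheme}")


model = ToyAttention()
reinit_parameters(model, scheme="xavier_normal")
```

Both schemes scale the weights by the tensor's fan-in (and, for Xavier, fan-out), so the initial values are much smaller than the unit-variance values from `randn`. Whether that accounts for the metric difference reported above is not something this sketch can confirm.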