graykode / xlnet-Pytorch

Simple XLNet implementation with Pytorch Wrapper
https://arxiv.org/pdf/1906.08237.pdf
Apache License 2.0
577 stars 107 forks

Parameters initialized with torch.randn may not be a good choice #17

Open lddsdu opened 3 years ago

lddsdu commented 3 years ago

It seems that initializing the parameters with `torch.randn` (https://github.com/graykode/xlnet-Pytorch/blob/cb793a1c75bdc59e3360f04ec641af726719811f/xlnet.py#L119) leads to poor performance. I tried Xavier normal (`xavier_normal_`) and Kaiming uniform (`kaiming_uniform_`) initialization instead, and both reached a much higher AUC and F1 score on my task.
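For reference, here is a minimal sketch of the kind of change described above. It is not the repository's code: the helper `make_param`, its `init` argument, and the 256×256 shape are hypothetical names and values chosen for illustration. Only the `torch.nn.init` functions are real PyTorch calls. The sketch shows how a weight that would otherwise be drawn from a unit-variance normal can be created with a fan-scaled initializer instead.

```python
import torch
import torch.nn as nn


def make_param(*shape, init="xavier_normal"):
    """Hypothetical helper: build a trainable parameter with the chosen init."""
    param = nn.Parameter(torch.empty(*shape))
    if init == "randn":
        # Same distribution as torch.randn: mean 0, std 1, independent of shape.
        nn.init.normal_(param, mean=0.0, std=1.0)
    elif init == "xavier_normal":
        # Normal with std = sqrt(2 / (fan_in + fan_out)).
        nn.init.xavier_normal_(param)
    elif init == "kaiming_uniform":
        # Uniform in [-bound, bound]; with PyTorch's defaults (a=0, mode="fan_in")
        # bound = sqrt(6 / fan_in).
        nn.init.kaiming_uniform_(param)
    else:
        raise ValueError(f"unknown init scheme: {init}")
    return param


if __name__ == "__main__":
    torch.manual_seed(0)
    # Illustrative shape only; not taken from xlnet.py.
    for scheme in ("randn", "xavier_normal", "kaiming_uniform"):
        weight = make_param(256, 256, init=scheme)
        print(f"{scheme:16s} std = {weight.std().item():.4f}")
```

With `randn` the standard deviation stays near 1 no matter how large the layer is, while the two fan-scaled schemes shrink it as the layer grows. That difference in scale is one plausible reason for the gap I saw, though I have not isolated it as the cause.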