lonePatient / NeZha_Chinese_PyTorch

NEZHA: Neural Contextualized Representation for Chinese Language Understanding
MIT License
262 stars 54 forks

The two self.relative_positions_encoding[:to_seq_length, :to_seq_length, :].to(hidden_states.device) calls hurt performance too much #10

Open huangyc0618 opened 3 years ago

huangyc0618 commented 3 years ago

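These calls consume a lot of CPU resources and time. I suggest moving the tensor to the device once, right after it is created in init, instead of on every forward pass.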
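
One possible way to do this, as a minimal sketch rather than the repository's actual code: register the precomputed table as a module buffer, so that `model.to(device)` moves it once and the forward pass only slices it. The class name `RelativePositionsEncoding`, the buffer name `positions_encoding`, the constructor parameters, and the sinusoidal table used as filler are all assumptions for illustration; `depth` is assumed to be even.

```python
import math

import torch
import torch.nn as nn


class RelativePositionsEncoding(nn.Module):
    """Hypothetical stand-in for the module that owns the encoding table."""

    def __init__(self, length: int, depth: int, max_relative_position: int = 127):
        super().__init__()
        vocab_size = 2 * max_relative_position + 1

        # Clipped relative distance (j - i) for every position pair, shifted to be >= 0.
        positions = torch.arange(length)
        distance = positions[None, :] - positions[:, None]
        index = distance.clamp(-max_relative_position, max_relative_position)
        index = index + max_relative_position

        # Placeholder sinusoidal table of shape (vocab_size, depth); depth assumed even.
        position = torch.arange(vocab_size, dtype=torch.float32).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, depth, 2, dtype=torch.float32) * (-math.log(10000.0) / depth)
        )
        table = torch.zeros(vocab_size, depth)
        table[:, 0::2] = torch.sin(position * div_term)
        table[:, 1::2] = torch.cos(position * div_term)

        # A buffer follows the module on .to(device) / .cuda(), so no per-call copy
        # is needed. persistent=False keeps it out of state_dict, so existing
        # checkpoints still load without an unexpected or missing key.
        self.register_buffer("positions_encoding", table[index], persistent=False)

    def forward(self, seq_length: int) -> torch.Tensor:
        # Slicing a tensor that already lives on the right device is cheap.
        return self.positions_encoding[:seq_length, :seq_length, :]
```

With this layout, calling `.to(device)` on the parent model transfers the table a single time, and the attention layer can use the sliced result directly without a per-step `.to(hidden_states.device)`.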