hiyoung123 / SoftMaskedBert

Soft-Masked Bert: a reproduction of the paper https://arxiv.org/pdf/2005.07421.pdf
255 stars 47 forks

Why is retain_graph set to True when calling backward? Won't GPU memory blow up? #15

Open pmouren opened 4 years ago