-
## Paper links
https://arxiv.org/abs/1909.11764
https://openreview.net/forum?id=BygzbyHFvB
## Publication date (yyyy/mm/dd)
2019/09/25
ICLR2020
## Summary
-
FreeLB: Enhanced Adversarial Training for Language Understanding
Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu, ICLR 2020
- https://arxiv.org/abs/1909.11764
- [openreview (8-8…
-
Could you upload your dataset, please?
I can't run your code because I have no datasets.
I want to run experiments with AGNews and IMDb.
I tested your FreeLB++ code on IMDb by importing it via Hugg…
-
I found that the code in `search_ticket.py` uses FreeLB adversarial training, so it needs to call loss.backward() more than once (`adv_steps` steps in total), but the code doesn't use the parameter retai…
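
For readers unfamiliar with the pattern, here is a minimal sketch of the FreeLB-style loop the issue refers to: one gradient computation per adversarial step, with the parameter gradients averaged over `adv_steps` before a single weight update. It is not the code from `search_ticket.py`. It uses plain Python with hand-derived gradients on a toy logistic model, and every name in it (`freelb_step`, `adv_lr`, `eps`, `lr`) is an assumption made for illustration. The perturbation starts at zero to keep the example deterministic, whereas the paper initializes it randomly.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def loss(w, x, y):
    """Binary cross-entropy of a logistic model on a single example."""
    p = sigmoid(dot(w, x))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def freelb_step(w, x, y, adv_steps=3, adv_lr=0.1, eps=0.3, lr=0.5):
    """One FreeLB-style update (hypothetical helper, toy model only)."""
    delta = [0.0] * len(x)   # perturbation on the input (assumption: zero init)
    g_acc = [0.0] * len(w)   # parameter gradient averaged over adv_steps
    for _ in range(adv_steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        err = sigmoid(dot(w, x_adv)) - y  # dLoss/dLogit for cross-entropy
        # "Backward" pass number t: accumulate the parameter gradient.
        g_acc = [g + err * xa / adv_steps for g, xa in zip(g_acc, x_adv)]
        # Gradient with respect to the perturbation, used for the ascent step.
        g_delta = [err * wi for wi in w]
        g_norm = norm(g_delta)
        if g_norm > 0:
            delta = [d + adv_lr * g / g_norm for d, g in zip(delta, g_delta)]
        # Project the perturbation back onto the L2 ball of radius eps.
        d_norm = norm(delta)
        if d_norm > eps:
            delta = [d * eps / d_norm for d in delta]
    # Single descent step on the weights using the averaged gradient.
    new_w = [wi - lr * g for wi, g in zip(w, g_acc)]
    return new_w, delta
```

Each iteration recomputes the forward quantities from the current `delta`, so nothing from a previous iteration has to be kept around; only the running gradient `g_acc` carries over between steps.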
-
```
class FreeAT(tf.keras.Model):
    def train_step(self, data):
        x, y = data
        last_r = 0.0
        last_r_slice = 0.0
        K = 3
        ep = 1e-3
        for t in range(K…
```
-
http://fyubang.com/2019/10/15/adversarial-train/
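This post shares a "works on anything" implementation of adversarial training for NLP that can be called with just four lines of code. You deserve to have it. Recently, Microsoft's FreeLB-Roberta [1] overtook Facebook's original Roberta on the GLUE leaderboard thanks to adversarial training (Adversarial Training), and Zhuiyi Technology also used this method, with only a single model [2…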
-
The code is hard to understand, including the bash shell scripts.
-
Hello author,
with tf.control_dependencies([init_op]): # fix perturbation
# Scale randomly initialized permutation, to make sure norm
# of r is smaller than epsilon.
…
-
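When sitdown is used to convert a Zhihu article, formulas are converted to images, which makes them extremely small when copied into WeChat.
Zhihu article link: https://zhuanlan.zhihu.com/p/103593948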
![image](https://user-images.githubusercontent.com/23072928/76674071-74855500-65e6-11ea-9819-b3444e5be0ee.pn…
-
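Hello~
While using unif, I ran into a question about the function below; please take a look when you have time~
When the function below averages the gradients, if grad is of type IndexedSlices, it averages the values but takes the indices from the first grad only;
It seems to me that each grad's indices are different. With four GPUs, for example, one batch is split into four parts with different data, so what gets looked up should be different… of the embedding_table matrix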
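
The concern can be illustrated with a toy example. The sketch below is not unif's code and does not use TensorFlow: each sparse gradient is modeled as a plain `(indices, values)` pair with one scalar per row, and both function names are hypothetical. It contrasts reusing the first replica's indices with summing values per row index before dividing by the replica count.

```python
from collections import defaultdict

def average_first_indices(grads):
    """Average values position by position, reusing the first replica's indices.

    This mirrors the behavior questioned in the issue (toy model, assumption).
    """
    indices = grads[0][0]
    n = len(grads)
    values = [sum(g[1][i] for g in grads) / n for i in range(len(indices))]
    return list(indices), values

def average_by_index(grads):
    """Sum values per row index across replicas, then divide by replica count."""
    acc = defaultdict(float)
    for indices, values in grads:
        for idx, val in zip(indices, values):
            acc[idx] += val
    n = len(grads)
    merged = sorted(acc)
    return merged, [acc[i] / n for i in merged]

# Two replicas that touched different rows of the embedding table.
replica_a = ([0, 2], [1.0, 2.0])
replica_b = ([2, 5], [4.0, 6.0])

print(average_first_indices([replica_a, replica_b]))
print(average_by_index([replica_a, replica_b]))
```

In this toy case the first approach assigns replica B's update for row 5 to row 2 and drops row 5 entirely, while the per-index version keeps rows 0, 2, and 5 separate.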