-
## Environment info
- `transformers` version: 4.6.1
- Platform: Linux-5.4.109+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.7.10
- PyTorch version (GPU?): 1.8.1+cu101 (False)
- Tensorflo…
-
![image](https://user-images.githubusercontent.com/30308731/99517286-253fc180-29ca-11eb-9619-9438605c9ec9.png)
In the paper's description, this part is LConv. I'm a bit confused about it and would appreciate an explanation. Thanks.
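For context, LConv presumably refers to the lightweight convolution of Wu et al. (2019), "Pay Less Attention with Lightweight and Dynamic Convolutions", which the ConvBERT paper's span-based dynamic convolution builds on. A minimal sketch of that operator follows; the module name and hyperparameters are illustrative, not taken from the ConvBERT code.

```python
# A minimal sketch of lightweight convolution (LConv) from Wu et al. (2019).
# All names and default hyperparameters here are illustrative assumptions.
import torch
import torch.nn.functional as F

class LightweightConv(torch.nn.Module):
    def __init__(self, hidden_size, kernel_size=9, num_heads=8):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.kernel_size = kernel_size
        self.num_heads = num_heads
        # One kernel per head, shared by every channel within that head.
        self.weight = torch.nn.Parameter(torch.randn(num_heads, kernel_size))

    def forward(self, x):
        # x: (batch, seq_len, hidden_size)
        B, T, C = x.shape
        H, K = self.num_heads, self.kernel_size
        # Softmax over the kernel dimension: each output position is a
        # convex combination of its K neighbors (the "lightweight" part).
        w = F.softmax(self.weight, dim=-1)                    # (H, K)
        # Broadcast each head's kernel to all of its channels: (C, 1, K).
        w = w.repeat_interleave(C // H, dim=0).unsqueeze(1)
        # Depthwise 1D convolution (groups=C) with 'same' padding.
        out = F.conv1d(x.transpose(1, 2), w, padding=K // 2, groups=C)
        return out.transpose(1, 2)                            # (B, T, C)
```

Dynamic convolution replaces the static `self.weight` above with kernels predicted from the current token, and ConvBERT's span-based variant predicts them from a local span of tokens instead.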
-
## Environment info
- `transformers` version: 4.3.2
- Platform: Linux-5.9.16-1-MANJARO-x86_64-with-glibc2.10
- Python version: 3.8.5
- PyTorch version (GPU?): 1.7.1 (True)
- Tensorflow versio…
-
I downloaded your pretrained models convbert_base, convbert_medium, and convbert_small from the README. None of the three model folders contains a vocabulary file. Judging from the vocab.txt in your project (30522 entries), I take these to be English pretrained models. Is my understanding correct? (Going by ELECTRA, the English model vocabulary has 30522 entries and the Chinese pretrained model vocabulary has 21128.) Thanks for answering.
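In case it helps, the vocabulary size can be checked directly by counting the lines of the vocab file; a minimal sketch (the path is a placeholder for wherever vocab.txt actually lives):

```python
# Quick check of whether a vocab file matches the English BERT/ELECTRA
# WordPiece vocabulary (30522 entries) or the Chinese one (21128 entries).
with open("vocab.txt", encoding="utf-8") as f:
    vocab = [line.rstrip("\n") for line in f]

print(len(vocab))  # 30522 -> English uncased; 21128 -> Chinese
```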
-
LSRA: Lite Transformer with Long-Short Range Attention.
LSRA also integrates convolution operations into transformer blocks. I'm just wondering what makes ConvBert differ from LSRA.
-
First of all, thanks for implementing the ConvBert code in tf2.x; I have also briefly studied tf2.x code before. I'd like to ask why the error seems a bit large:
Maximum output error of the small model: 1.169890e-04
Maximum output error of the base model: 2.193451e-04
I feel that the error within the same framework shouldn't be on the order of 1e-4; the implementation may differ slightly from the original.
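A sketch of how such a maximum output error is presumably computed, i.e. the largest element-wise absolute difference between the two implementations' outputs on identical inputs; `original_out` and `reimpl_out` are placeholders, not names from the repo:

```python
import numpy as np

def max_abs_error(a, b):
    """Largest element-wise absolute difference between two output tensors."""
    return np.max(np.abs(np.asarray(a, np.float64) - np.asarray(b, np.float64)))

# Hypothetical usage: original_out and reimpl_out would be the final hidden
# states of the original and tf2.x models on the same input batch.
# print(max_abs_error(original_out, reimpl_out))  # e.g. 1.169890e-04
```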
-
When running
```
python run.py \
--train_data_path data_utils/dialoglue/hwu/train.csv \
--val_data_path data_utils/dialoglue/hwu/val.csv \
--test_data_path data_utils/dial…
```
-
```
root@iZ8vb4fe2fj7uy9gqp0tihZ:/data/ConvBert# python3 modify_onnx_gs.py
electra/embeddings_1/one_hot (onehotEncoder)
Inputs: [Variable (electra/embeddings_1/Reshape:0): (shape=None, dtype=None)…
```
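For reference, the printout above looks like onnx-graphsurgeon node output; a minimal sketch, assuming that library, of locating the OneHot node in the embeddings (the model path is a placeholder):

```python
# Locate OneHot nodes in an exported ONNX graph with onnx-graphsurgeon.
# "convbert.onnx" is a placeholder path, not a file from the repo.
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("convbert.onnx"))

for node in graph.nodes:
    if node.op == "OneHot":
        print(node.name, [inp.name for inp in node.inputs])
```

A common workaround when the target runtime lacks a OneHot kernel is to replace the OneHot-plus-MatMul embedding lookup with a single Gather over the embedding table.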
-
As mentioned in the README:
> To train/evaluate the model using our modifications (i.e., MLM pre-training), you can use trippy/DO.example.advanced.
But `trippy/DO.example.advanced` uses BERT inst…
-
I have several questions:
1. The original performance of TripPy is 55.3% on MultiWOZ 2.1 (in the paper). Your bert-base DST achieves 56.3%, so where does the improvement come from? I notice that the or…