-
## Environment info
- `transformers` version: 3.5.0
- Platform: Linux
- Python version: 3.7
- PyTorch version (GPU?):
- Tensorflow version (GPU?): 2.3.1
- Using GPU in script?: Yes
- U…
-
## Environment info
- `transformers` version: 4.1.1 (stable)
- Platform: Google Colab
- Python version: 3.6.9
- PyTorch version (GPU?): 1.7.0+cu101
- Using GPU in script?: yes
- Using distri…
-
TensorFlow 1.15 has a function "tf.contrib.seq2seq.AttentionWrapper", but as we know the tf.contrib module is no longer available in TensorFlow 2.0. I find that "tfa.seq2seq.AttentionWrapper" may replace "tf.c…
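For anyone hitting the same migration issue, a minimal sketch of the TensorFlow Addons replacement under TF 2.x (layer sizes and inputs below are arbitrary, not taken from any particular model):

```python
# Minimal sketch: tfa.seq2seq.AttentionWrapper as the TF 2.x stand-in for
# the removed tf.contrib.seq2seq.AttentionWrapper.
import tensorflow as tf
import tensorflow_addons as tfa

batch_size, max_time, units = 4, 10, 128
memory = tf.random.normal([batch_size, max_time, units])  # e.g. encoder outputs

attention_mechanism = tfa.seq2seq.BahdanauAttention(units=units, memory=memory)
decoder_cell = tfa.seq2seq.AttentionWrapper(
    tf.keras.layers.LSTMCell(units),
    attention_mechanism,
    attention_layer_size=units,
)

# One decoder step with a dummy input, just to show the cell is usable.
state = decoder_cell.get_initial_state(batch_size=batch_size, dtype=tf.float32)
output, state = decoder_cell(tf.random.normal([batch_size, units]), state)
```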
-
Hello,
I'm trying to load GMM attention. I modified tacotron_gmm.py and changed the header to the following, but it fails to run. Could you please advise on a solution? Thank you!
import tensorflow as tf
from tacotron.utils.symbols import symbols
from tacotron.utils.infolog import log
from tacotron.…
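Not an official fix, but if the failure comes from imports further down in the (truncated) header still pointing at `tf.contrib.seq2seq`, a hypothetical mapping to the TensorFlow Addons equivalents under TF 2.x could look like the sketch below; the project-local `tacotron.*` imports would stay as in the original header.

```python
# Hypothetical replacement for old tf.contrib.seq2seq imports, assuming the
# truncated header lines referenced them (tensorflow_addons must be installed).
import tensorflow as tf
import tensorflow_addons as tfa

# old: from tensorflow.contrib.seq2seq import AttentionWrapper, BahdanauAttention
AttentionWrapper = tfa.seq2seq.AttentionWrapper
BahdanauAttention = tfa.seq2seq.BahdanauAttention
```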
-
- `transformers` version: 4.1.1
- mBART: @patrickvonplaten
## Information
Model I am using (Bert, XLNet ...): mBART
The problem arises when using:
* [x] the official example scripts: (give deta…
-
I need to train a Chinese address error-correction model and have a few questions:
1. For Chinese address correction, jieba word segmentation may have a fairly high error rate. Can I use character-level segmentation directly instead?
2. If character-level segmentation is used, what should the dataset format be? (Following the README: segment first, separate tokens with spaces, and separate the erroneous and the corrected text with a tab?) How should it be prepared, and what annotation tool is usually used? (See the sketch after this list.)
3. The readme mentions "conv_seq2seq、seq2s…
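A minimal sketch of what character-level preparation could look like, assuming the README's format of space-separated tokens with a tab between the erroneous and the corrected text; the file name and the example pair below are made up:

```python
# Minimal sketch: write character-level (noisy, clean) address pairs in the
# "space-separated tokens, tab between source and target" format.

def to_char_tokens(text: str) -> str:
    """Split a Chinese string into space-separated characters."""
    return " ".join(ch for ch in text.strip() if not ch.isspace())

pairs = [
    # (erroneous address, corrected address) – toy example only
    ("北京市潮阳区建国路89号", "北京市朝阳区建国路89号"),
]

with open("train.tsv", "w", encoding="utf-8") as f:
    for noisy, clean in pairs:
        f.write(f"{to_char_tokens(noisy)}\t{to_char_tokens(clean)}\n")
```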
-
# 🚀 Feature request
Please could we have the ability to return attention weights from the decoded generated tokens to the encoded source?
## Motivation
To attribute the decoded text, e.g. in t…
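A minimal sketch of what the requested behaviour could look like, assuming a `transformers` version where `generate` exposes cross-attention weights via `return_dict_in_generate` and `output_attentions` (the checkpoint below is only an example):

```python
# Minimal sketch: retrieve decoder-to-encoder (cross) attention weights for
# each generated token from generate(), assuming a version that supports it.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

inputs = tokenizer("The tower is 324 metres tall.", return_tensors="pt")
out = model.generate(
    **inputs,
    return_dict_in_generate=True,
    output_attentions=True,
    max_length=20,
)

print(out.sequences)
# out.cross_attentions: one entry per generation step; each entry is a
# per-layer tuple of attention tensors from the new decoder token to the
# encoded source tokens.
print(len(out.cross_attentions))
```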
-
### Question
I'm trying to fine-tune a seq2seq model using the fork command, and I got this error message:
target contains elements out of valid range [0, num_categories) in categorical cross entropy
####…
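Not specific to any one framework, but a minimal diagnostic sketch for this kind of error (the vocabulary size and target array below are hypothetical): the message usually means some target token id falls outside the range covered by the output layer, e.g. because the tokenizer vocabulary is larger than the final softmax.

```python
# Minimal sketch: check that every target id lies in [0, num_categories).
import numpy as np

num_categories = 32000  # size of the decoder's output (softmax) layer – adjust
targets = np.array([[2, 15, 31999], [2, 40000, 1]])  # toy target id matrix

bad = (targets < 0) | (targets >= num_categories)
if bad.any():
    print("out-of-range target ids:", np.unique(targets[bad]))
    print("likely cause: tokenizer vocabulary larger than the output layer")
else:
    print("all target ids are inside [0, num_categories)")
```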
-
So, I tried to follow the procedure you described here for a Bengali dataset:
https://github.com/pkouris/abtextsum/issues/3
I have tried to follow this on 3 different machines (because I thought i…
-
I'm training Longformer2Roberta; the encoder part of this Seq2Seq model is Longformer. The one feature Longformer brings is global attention. I found the use of it during training, but it is never use…
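A minimal sketch of how the global attention mask is normally supplied to a Longformer encoder, assuming a Longformer2Roberta `EncoderDecoderModel`; whether `generate` forwards `global_attention_mask` to the encoder may depend on the `transformers` version, so it is passed explicitly on a forward call here:

```python
# Minimal sketch: give the first token global attention and pass the mask
# through an EncoderDecoderModel forward call (checkpoints are examples).
import torch
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "allenai/longformer-base-4096", "roberta-base"
)

inputs = tokenizer("A long input document ...", return_tensors="pt")
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1  # global attention on the <s> token

outputs = model(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,  # routed to the encoder
    decoder_input_ids=inputs["input_ids"][:, :1],
)
print(outputs.logits.shape)
```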