-
Hi, I was wondering why the maximum batch size is ~100 when using a GPU with ~11GB of RAM, whereas in [tensor2tensor](https://github.com/tensorflow/tensor2tensor) the maximum batch size is 1024?
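If the difference comes down to how batch size is counted, a rough sanity check may help. The numbers below are only an assumed illustration (average sentence length is made up), based on the fact that tensor2tensor's `batch_size` hparam counts tokens rather than sentences for variable-length problems:

```python
# Illustrative arithmetic only: if batch_size is a token budget
# (as in tensor2tensor's hparams for variable-length problems),
# far fewer whole sentences fit into one batch.
tokens_per_batch = 1024          # hparams.batch_size in tensor2tensor
avg_tokens_per_sentence = 10     # assumed average sequence length
sentences_per_batch = tokens_per_batch // avg_tokens_per_sentence
print(sentences_per_batch)       # ~100 sentences from a 1024-token budget
```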
-
Hi all,
I'm working on the languagemodel_wiki_noref_v8k_l1k problem.
The configuration for datagen, train, and decode is:
```
PROBLEM=languagemodel_wiki_noref_v8k_l1k
MODEL=transformer
HPARA…
-
### Description
It seems that the fc layer of the moe type has not been implemented.
In tensor2tensor/layers/common_attention.py:289, I can't find the fc layer of the moe type.
``…
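For reference only (this is not the tensor2tensor implementation), a minimal sketch of what a mixture-of-experts feed-forward ("fc of the moe type") layer does: a gating network scores the experts and each token is routed to its top-scoring expert(s). All names and shapes here are illustrative.

```python
import numpy as np

def moe_ffn(x, expert_weights, gate_weights, top_k=1):
    """Minimal mixture-of-experts feed-forward sketch (illustrative only).

    x:              [tokens, d_model] input activations
    expert_weights: list of (W1, W2) pairs, one relu-FFN per expert
    gate_weights:   [d_model, num_experts] gating matrix
    """
    gate_logits = x @ gate_weights                      # [tokens, num_experts]
    gates = np.exp(gate_logits - gate_logits.max(-1, keepdims=True))
    gates /= gates.sum(-1, keepdims=True)               # softmax over experts

    out = np.zeros_like(x)
    chosen = np.argsort(-gates, axis=-1)[:, :top_k]     # top-k experts per token
    for t in range(x.shape[0]):
        for e in chosen[t]:
            W1, W2 = expert_weights[e]
            h = np.maximum(x[t] @ W1, 0.0)              # expert = relu FFN
            out[t] += gates[t, e] * (h @ W2)            # gate-weighted combine
    return out
```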
-
I started an experiment with `audio_preproc_in_bottom` set to `True` for a `SpeechRecognition` problem (LibriSpeech) using the transformer model.
However, I am noticing that the word error rate sta…
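As a side note on the metric being monitored here: word error rate is just word-level edit distance divided by reference length. A minimal sketch (not the tensor2tensor metric code):

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i ref words into the first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / max(len(ref), 1)

print(word_error_rate("the cat sat", "the cat sat down"))  # ~0.33
```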
-
### Description
I cannot train a transformer_base model using the bfloat16 type for both activations and weights on a GPU (GTX 1080Ti).
The error I got is:
```
ValueError: Tensor conversion reques…
-
### Description
I have trained a `text_cnn` model on a custom `Text2Class` problem without any issues using `t2t-trainer` (I've also continued training with `t2t-trainer` by restoring from a checkp…
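For context, a minimal sketch of what a custom `Text2Class` problem definition typically looks like. The class name, labels, and samples below are made up, and the exact overrides should be checked against the tensor2tensor `Text2ClassProblem` docs:

```python
# Illustrative only; names, labels, and data are hypothetical.
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


@registry.register_problem
class MySentimentProblem(text_problems.Text2ClassProblem):
    """Toy two-class text classification problem."""

    @property
    def approx_vocab_size(self):
        return 2**13  # ~8k subword vocabulary

    @property
    def is_generate_per_split(self):
        return False  # let t2t split train/eval from one generator

    def class_labels(self, data_dir):
        del data_dir
        return ["negative", "positive"]

    def generate_samples(self, data_dir, tmp_dir, dataset_split):
        del data_dir, tmp_dir, dataset_split
        for text, label in [("great movie", 1), ("terrible movie", 0)]:
            yield {"inputs": text, "label": label}
```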
-
I want to improve the transformer model. I think tensor2tensor is too big to change, so I chose this code. First, I have to reproduce the previous result. I added beam search myself (before adding the be…
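Since reproducing the result hinges on the beam search being right, here is a minimal generic beam search sketch for comparison. It is illustrative only and not taken from this repo; `score_next` stands in for whatever function returns log-probabilities over the next token given a prefix:

```python
def beam_search(score_next, start_token, end_token, beam_size=4, max_len=50):
    """Generic beam search sketch (illustrative only).

    score_next(prefix) -> dict mapping next token -> log-probability.
    """
    beams = [([start_token], 0.0)]          # (prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            for tok, tok_logp in score_next(prefix).items():
                candidates.append((prefix + [tok], logp + tok_logp))
        # keep only the best `beam_size` partial hypotheses
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, logp in candidates[:beam_size]:
            if prefix[-1] == end_token:
                finished.append((prefix, logp))
            else:
                beams.append((prefix, logp))
        if not beams:
            break
    finished.extend(beams)                  # fall back to unfinished hypotheses
    return max(finished, key=lambda c: c[1])[0] if finished else []
```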
-
Hello, why does the start-of-prediction marker use the pad id 0? Also, the l2r and r2l markers use ids 2 and 3; don't those overlap with the ids of some words? Does your vocab_file contain the l2r and r2l markers?
Also, when computing the encoder input, why is an extra value added to every embedding? I don't understand what the "remove" is doing here.
def transformer_prepare_encoder(inputs, hpar…
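On the question of why a value is added to every embedding: in the standard Transformer, the encoder input is the token embedding plus a sinusoidal timing (positional) signal. A minimal generic sketch of that signal, not the exact code from this repo:

```python
import numpy as np

def add_timing_signal(embeddings, min_timescale=1.0, max_timescale=1.0e4):
    """Add a sinusoidal position signal to [length, channels] embeddings."""
    length, channels = embeddings.shape
    num_timescales = channels // 2
    log_timescale_increment = (
        np.log(max_timescale / min_timescale) / max(num_timescales - 1, 1))
    inv_timescales = min_timescale * np.exp(
        -np.arange(num_timescales) * log_timescale_increment)
    position = np.arange(length, dtype=np.float32)
    scaled_time = position[:, None] * inv_timescales[None, :]
    signal = np.concatenate([np.sin(scaled_time), np.cos(scaled_time)], axis=1)
    signal = np.pad(signal, [[0, 0], [0, channels % 2]])  # handle odd channels
    return embeddings + signal
```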
-
### Description
I'm trying to use t2t-datagen to generate my own data following the example [here](https://github.com/tensorflow/tensor2tensor/blob/master/docs/new_problem.md).
I'm running the progr…
-
### Description
I have been trying to train ASR on LibriSpeech using the transformer model, and when I try to see the results using t2t-decoder, specifically using these commands:
`t2t-decoder --d…