-
Hi,
Do you plan to release the trained baseline models that you benchmarked in the paper?
Regards,
Daksh
-
### Description
When training a Transformer-XL (model: transformer_memory, hyperparameters: transformer_wikitext103_l4k_memory_v0), if the transformer encounters an unexpected batch size, it halts t…
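If the root cause is that the memory-augmented model requires a fixed batch size, one possible workaround is to pad any short final batch before feeding it. A minimal sketch, assuming a `[n, seq_len]` integer batch and a `pad_id` padding token; the helper and the fixed size are illustrative, not from the report above:

```python
import numpy as np

def pad_batch(batch, batch_size, pad_id=0):
    """Pad a short final batch up to the fixed batch size the model expects.

    `batch` is a [n, seq_len] int array with n <= batch_size; rows of
    `pad_id` tokens are appended so the shape becomes [batch_size, seq_len].
    """
    n, seq_len = batch.shape
    if n == batch_size:
        return batch
    padding = np.full((batch_size - n, seq_len), pad_id, dtype=batch.dtype)
    return np.concatenate([batch, padding], axis=0)
```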
-
In the paper [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf), the queries, keys, and values are linearly transformed without a bias term in multi-head attention.
However, the variables in your code…
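For reference, a minimal sketch of the bias-free projections the paper describes, written with plain TensorFlow 1.x layers; the function name, variable names, and `num_units` are illustrative, not taken from this repository's code:

```python
import tensorflow as tf

def qkv_projections(queries, keys, values, num_units):
    # Per the paper, Q, K, and V come from linear maps with no bias term.
    Q = tf.layers.dense(queries, num_units, use_bias=False, name="q_proj")
    K = tf.layers.dense(keys, num_units, use_bias=False, name="k_proj")
    V = tf.layers.dense(values, num_units, use_bias=False, name="v_proj")
    return Q, K, V
```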
-
If I understood correctly, at evaluation time you run

```python
preds = np.zeros((hp.batch_size, hp.maxlen), np.int32)
for j in range(hp.maxlen):
    _preds = sess.run(g.preds, {g.x: x, g.y: preds})
    …
```
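For context, a guess at how the loop continues; the final line is my assumption about what the elided code does (writing the newly decoded column back into `preds` so later iterations condition on it), not a quote from the repository:

```python
preds = np.zeros((hp.batch_size, hp.maxlen), np.int32)
for j in range(hp.maxlen):
    # hp, g, sess, and x are assumed from the setup above.
    _preds = sess.run(g.preds, {g.x: x, g.y: preds})
    # Assumed continuation: keep only position j from this pass, so the
    # next pass sees every token decoded so far.
    preds[:, j] = _preds[:, j]
```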
-
Can a new release be pushed to https://pypi.org/ any time soon?
-
### Description
Followed the steps exactly as given in the official tutorial for running the Transformer on Cloud TPU - https://cloud.google.com/tpu/docs/tutorials/transformer - except using PROBLEM=translate_enfr…
-
I have exported a serving model as the back end of a RESTful server.
I have tested the performance of Transformer and GNMT, with transformer_big and beam_size 10.
The Tensor2Tensor version is 1.4.4. Source tokens…
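For context, a minimal sketch of how such a back end might be benchmarked over TensorFlow Serving's REST API; the host, port, model name, and input encoding are all assumptions, not details from this report:

```python
import time
import requests

# Hypothetical TF Serving REST endpoint; the model name and the input
# format depend on how the model was actually exported.
URL = "http://localhost:8501/v1/models/transformer:predict"

def time_request(token_ids):
    """Send one predict request and return (predictions, latency in s)."""
    payload = {"instances": [{"inputs": token_ids}]}
    start = time.time()
    response = requests.post(URL, json=payload)
    response.raise_for_status()
    return response.json()["predictions"], time.time() - start

preds, latency = time_request([17, 942, 3, 1])  # dummy token ids
print("latency: %.3fs" % latency)
```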
-
I installed t2t with the command below:
`pip install tensor2tensor && t2t-trainer --generate_data --data_dir=~/t2t_data --output_dir=~/t2t_train/mnist --problem=image_mnist --model=shake_shake …
-
The last commit was made more than a year ago, pull requests are never merged, and no one answers open issues. Is this project dead? If so, it would be nice to signal it in the README and on the w…
-
Hello,
I have trained the model given in the walkthrough for English-German translation. I was trying to access the attention weights, but I found there is no direct command or way to access them. So I tri…
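One generic way to get at them in TensorFlow 1.x is to locate the attention-weight tensors in the graph by name and fetch them alongside the outputs; the name pattern below is a guess at how such ops might be named, not a documented T2T interface:

```python
import tensorflow as tf

def find_attention_tensors(graph, pattern="attention_weights"):
    """Collect tensors whose op name suggests they hold attention weights.

    `pattern` is a hypothetical substring; inspect the trained graph to
    find the actual naming your model uses.
    """
    tensors = []
    for op in graph.get_operations():
        if pattern in op.name:
            tensors.extend(op.outputs)
    return tensors

# Usage (assuming an active session `sess` and a feed dict `feeds`):
# attn = sess.run(find_attention_tensors(tf.get_default_graph()), feeds)
```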