-
## Paper link
https://arxiv.org/abs/1508.07909
## Publication date (yyyy/mm/dd)
2015/08/31
## Summary
A paper proposing that, to handle rare words in machine translation models, the minimal unit of processing be sub-word units smaller than whole words.
Rare words include, for example, compound words; viewed at the word level these are indeed ra…
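To make the idea concrete, here is a minimal sketch of the BPE merge-learning loop, close in spirit to the code listing in the paper: words are split into characters, and the most frequent adjacent symbol pair is repeatedly merged into a new symbol. The toy vocabulary and the number of merges are illustrative assumptions.

```
import re
import collections

def get_stats(vocab):
    """Count frequencies of adjacent symbol pairs in the vocabulary."""
    pairs = collections.defaultdict(int)
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[symbols[i], symbols[i + 1]] += freq
    return pairs

def merge_vocab(pair, v_in):
    """Merge the most frequent pair into a single symbol everywhere."""
    v_out = {}
    bigram = re.escape(' '.join(pair))
    p = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    for word in v_in:
        v_out[p.sub(''.join(pair), word)] = v_in[word]
    return v_out

# Toy vocabulary: words split into characters, '</w>' marks end of word.
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
         'n e w e s t </w>': 6, 'w i d e s t </w>': 3}
num_merges = 10  # illustrative; real systems use tens of thousands
for _ in range(num_merges):
    pairs = get_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_vocab(best, vocab)
    print(best)
```

Because merges are learned from frequency statistics, frequent words survive as single symbols while rare words decompose into known sub-word units.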
-
# BPE as input tokens of the transformer model
The Transformer model proposed in "_Attention Is All You Need_" encodes the 4.5M-sentence input data into a small vocabulary generated by learning sha…
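At tokenization time, the learned merges are replayed in order on each new word so that test-time segmentation matches training-time segmentation. A minimal sketch, assuming a hypothetical merge list (in practice the list would be learned jointly on source and target text, giving the shared vocabulary mentioned above):

```
def apply_bpe(word, merges):
    """Segment one word using a learned, ordered list of BPE merges."""
    symbols = list(word) + ['</w>']
    for a, b in merges:
        i = 0
        while i < len(symbols) - 1:
            if symbols[i] == a and symbols[i + 1] == b:
                symbols[i:i + 2] = [a + b]  # apply this merge in place
            else:
                i += 1
    return symbols

# Hypothetical merges learned from a shared source-target corpus:
merges = [('e', 's'), ('es', 't'), ('est', '</w>'), ('l', 'o'), ('lo', 'w')]
print(apply_bpe('lowest', merges))  # ['low', 'est</w>']
```

The resulting sub-word tokens are what the Transformer consumes as its input symbols.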
-
I've noticed in several of my experiments that xnmt is generating "s" even when using sentencepiece, which is strange. This issue is a reminder to myself to check why this is happening and see if it's…
-
Neural Machine Translation of Rare Words with Subword Units
dropout
batch normalization
layer norm
-
### Feature Description
As of today, the standard way to build a vocabulary of (usually subword) units for text inputs is to pretrain a model capable of generating a list of the most adequate subword units f…
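A minimal sketch of that pretrain-then-tokenize workflow using the sentencepiece library (mentioned elsewhere in these notes); the corpus file, model prefix, and vocabulary size are illustrative placeholders:

```
import sentencepiece as spm

# Train a subword model on raw text, one sentence per line (placeholder file).
spm.SentencePieceTrainer.train(
    input='corpus.txt',
    model_prefix='subword',  # writes subword.model and subword.vocab
    vocab_size=8000,         # illustrative size
    model_type='bpe',        # BPE as in the paper; 'unigram' is the default
)

# Load the pretrained model and segment new text into subword units.
sp = spm.SentencePieceProcessor(model_file='subword.model')
print(sp.encode('the widest river', out_type=str))
```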
-
link: http://repository.cmu.edu/lti/48/
referenced from:
- Neural Machine Translation of Rare Words with Subword Units #79 (for a bilingual dictionary based on fast-align)
-
# Requirements
- [ ] Read about the input embedding technique (byte-pair encoding) used by Google's team in the "Attention Is All You Need" paper.
- [ ] Design the input embedding pipeline for **wmt 2014 e…
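For the second checklist item, a hypothetical sketch of the pipeline's core step, BPE tokens mapped to integer ids and then to embedding vectors; the toy vocabulary, dimensions, and tokens are illustrative (the real pipeline would use the WMT 2014 data, and the paper uses d_model = 512):

```
import numpy as np

vocab = {'<pad>': 0, '<unk>': 1, 'low': 2, 'est</w>': 3}  # toy BPE vocab
d_model = 8
rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(vocab), d_model))

tokens = ['low', 'est</w>']
ids = [vocab.get(t, vocab['<unk>']) for t in tokens]
vectors = embedding[ids] * np.sqrt(d_model)  # embedding scaling from the paper
print(vectors.shape)  # (2, 8)
```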
-
Hi,
I am trying to use subword units with the Kaldi librispeech recipe. I have used the code snippet mentioned in the README in stage 3 of the librispeech recipe.
```
if [ $stage -le 3 ]; then
…
```
-
## 0. Paper
@inproceedings{xu-etal-2019-treat,
title = "Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology",
author = "Xu, Yang and
Zhan…
-
Everyone, please pick a small topic of your own~