Closed. gpengzhi closed this pull request 4 years ago.
```python
from texar.torch.data.tokenizers import GPT2Tokenizer

tokenizer = GPT2Tokenizer(pretrained_model_name='gpt2-small')
example = 'BART is a seq2seq model.'
ids = tokenizer.map_text_to_id(text=example)
print('original text:\n', example)
print('text -> ids -> text:\n', tokenizer.map_id_to_text(ids))
```
```
original text:
 BART is a seq2seq model.
text -> ids -> text:
 BART is a seq2seq model.
```
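For completeness, here is a minimal sketch of the same roundtrip that also prints the intermediate ids. It reuses only the tokenizer methods shown above (`map_text_to_id`, `map_id_to_text`); the `assert` and the extra prints are additions for illustration, not part of the original snippet.

```python
from texar.torch.data.tokenizers import GPT2Tokenizer

# Load the pre-trained GPT-2 (small) BPE tokenizer.
tokenizer = GPT2Tokenizer(pretrained_model_name='gpt2-small')

example = 'BART is a seq2seq model.'

# text -> ids: encode the string into GPT-2 BPE token ids.
ids = tokenizer.map_text_to_id(text=example)
print('ids:', ids)

# ids -> text: decoding should recover the input string exactly.
recovered = tokenizer.map_id_to_text(ids)
assert recovered == example  # roundtrip is lossless for this example
print('recovered:', recovered)
```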
Merging #315 into master will not change coverage. The diff coverage is 100.00%.
```
@@           Coverage Diff           @@
##           master     #315   +/-   ##
=======================================
  Coverage   79.91%   79.91%
=======================================
  Files         133      133
  Lines       11135    11135
=======================================
  Hits         8899     8899
  Misses       2236     2236
=======================================
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| texar/torch/data/tokenizers/gpt2_tokenizer.py | 89.36% <100.00%> (ø) | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data.
Resolves #313.