Closed yongyi-wu closed 3 years ago
Merging #1474 (d36a92f) into master (def0d70) will decrease coverage by 0.17%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #1474 +/- ##
==========================================
- Coverage 85.86% 85.68% -0.18%
==========================================
Files 52 52
Lines 6911 8266 +1355
==========================================
+ Hits 5934 7083 +1149
- Misses 977 1183 +206
Impacted Files | Coverage Δ | |
---|---|---|
src/gluonnlp/op.py | 95.78% <ø> (+0.70%) | :arrow_up: |
src/gluonnlp/data/tokenizers/sentencepiece.py | 78.44% <100.00%> (ø) | |
src/gluonnlp/models/electra.py | 68.90% <0.00%> (-7.94%) | :arrow_down: |
src/gluonnlp/models/roberta.py | 90.47% <0.00%> (-3.15%) | :arrow_down: |
src/gluonnlp/models/albert.py | 92.38% <0.00%> (-3.06%) | :arrow_down: |
src/gluonnlp/models/gpt2.py | 95.38% <0.00%> (-2.89%) | :arrow_down: |
src/gluonnlp/models/bert.py | 91.97% <0.00%> (-2.83%) | :arrow_down: |
src/gluonnlp/models/bart.py | 91.44% <0.00%> (-2.31%) | :arrow_down: |
src/gluonnlp/data/filtering.py | 78.03% <0.00%> (-0.24%) | :arrow_down: |
src/gluonnlp/models/transformer.py | 98.89% <0.00%> (-0.05%) | :arrow_down: |
... and 13 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update def0d70...d36a92f.
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1474/t5/index.html
Description
This PR modifies the sanity check in SentencepieceTokenizer to make it easier to insert additional special tokens. This will later help add the tokens corresponding to noise span sentinels, as in the T5 tokenizer. Accordingly, the model and vocab for T5-base have been uploaded to S3 for use in some new test cases.
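For context, a minimal sketch of what adding sentinel tokens to a vocabulary involves. It does not use the GluonNLP API: `add_sentinel_tokens` is a hypothetical helper operating on a plain token-to-id dict, and the id assignment is illustrative only and may not match the ordering used in this PR. The `<extra_id_N>` naming follows the convention used by T5 for noise span sentinels.

```python
def add_sentinel_tokens(vocab, num_sentinels=100):
    """Return a copy of `vocab` extended with T5-style sentinel tokens.

    Hypothetical helper for illustration; `vocab` is a plain dict
    mapping token strings to integer ids.
    """
    extended = dict(vocab)
    for i in range(num_sentinels):
        token = f"<extra_id_{i}>"
        # A sanity check of the kind a tokenizer needs: refuse to
        # silently overwrite a token that already exists.
        if token in extended:
            raise ValueError(f"{token} is already in the vocabulary")
        # Assumption: new tokens simply take the next free id.
        extended[token] = len(extended)
    return extended


# Toy base vocabulary (not the real T5-base vocab).
base_vocab = {"<pad>": 0, "</s>": 1, "<unk>": 2, "▁hello": 3}
extended_vocab = add_sentinel_tokens(base_vocab, num_sentinels=3)
```

The original mapping is left untouched, and a repeated insertion of the same sentinel raises an error instead of producing duplicate entries.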
Checklist
Essentials
Changes
cc @dmlc/gluon-nlp-team