jellying commented 5 years ago

717

bwanglzu commented 5 years ago

This would break the code structure of the whole project.

jellying commented 5 years ago

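> This would break the code structure of the whole project.

The key problem is that using BERT means loading BERT's own vocab rather than having it generated by the vocabulary unit, so we need to work out a design that stays compatible with BERT's vocabulary and preprocessing.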
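To make the constraint concrete, here is a minimal sketch of a fixed, pre-built vocabulary. It assumes the usual BERT `vocab.txt` layout (one token per line, with the line index as the token id). The helpers `load_bert_vocab` and `tokens_to_ids` are hypothetical names for illustration and are not part of MatchZoo or of this PR.

```python
import io
from typing import Dict, Iterable, List


def load_bert_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """Map each token to its line index, as in a BERT vocab.txt file."""
    term_index = {}
    for index, line in enumerate(lines):
        token = line.rstrip("\n")
        term_index[token] = index
    return term_index


def tokens_to_ids(tokens: List[str], term_index: Dict[str, int],
                  unk_token: str = "[UNK]") -> List[int]:
    """Look up ids in the fixed vocab; unseen tokens fall back to unk_token."""
    unk_id = term_index[unk_token]
    return [term_index.get(token, unk_id) for token in tokens]


# Tiny in-memory stand-in for a real vocab.txt (assumption: illustration only).
sample_vocab = io.StringIO("[PAD]\n[UNK]\n[CLS]\n[SEP]\nhello\nworld\n")
term_index = load_bert_vocab(sample_vocab)
ids = tokens_to_ids(["[CLS]", "hello", "matchzoo", "[SEP]"], term_index)
```

Unlike a vocabulary fitted on the training corpus, the mapping here is fixed before any data is seen, which is why it cannot simply be produced by fitting a vocabulary unit.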

codecov-io commented 5 years ago

Codecov Report

Merging #722 into 2.2-dev will increase coverage by 0.13%. The diff coverage is 96.87%.

@@             Coverage Diff             @@
##           2.2-dev     #722      +/-   ##
===========================================
+ Coverage    94.31%   94.45%   +0.13%     
===========================================
  Files           98      101       +3     
  Lines         3378     3570     +192     
===========================================
+ Hits          3186     3372     +186     
- Misses         192      198       +6

| Impacted Files | Coverage Δ | |
|---|---|---|
| matchzoo/preprocessors/units/vocabulary.py | `100% <100%> (ø)` | :arrow_up: |
| matchzoo/preprocessors/build_vocab_unit.py | `100% <100%> (ø)` | :arrow_up: |
| matchzoo/preprocessors/units/bert_clean.py | `94.11% <94.11%> (ø)` | |
| matchzoo/preprocessors/bert_preprocessor.py | `96.22% <96.22%> (ø)` | |
| matchzoo/preprocessors/units/tokenize.py | `96.72% <96.42%> (-3.28%)` | :arrow_down: |
| matchzoo/utils/bert_utils.py | `97.67% <97.67%> (ø)` | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 6b09cde...6693bd9.

bwanglzu commented 5 years ago

@uduse can you review?

NTMC-Community / MatchZoo

Add berttokenize unit. #722