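BERT embedding uses all zeros as the token type by default. For tasks such as QA, is this changed to an encoding where the first sentence is 0 and the second sentence is 1?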

fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

https://gitee.com/fastnlp/fastNLP

Apache License 2.0


Closed: onebula closed this issue 4 years ago

onebula commented 4 years ago

xuyige commented 4 years ago

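For each input text, if it contains the [SEP] token, token_type_ids are set to alternate between 0 and 1 across the segments that [SEP] delimits. If there is no [SEP] token, the all-zero sentence encoding is used by default.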
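To illustrate the rule described above, here is a minimal pure-Python sketch. It is not the fastNLP implementation: the function name `alternating_token_type_ids` is hypothetical, and the choice to give each [SEP] the same type as the segment it closes is an assumption (it matches the usual BERT convention).

```python
from typing import List


def alternating_token_type_ids(tokens: List[str], sep_token: str = "[SEP]") -> List[int]:
    """Hypothetical helper, not part of fastNLP.

    Assigns 0 to the first segment, 1 to the second, 0 to the third, and so on.
    Assumption: each [SEP] gets the type of the segment it closes.
    """
    token_type_ids = []
    segment_index = 0
    for token in tokens:
        # Even-numbered segments get 0, odd-numbered segments get 1.
        token_type_ids.append(segment_index % 2)
        if token == sep_token:
            # The next token starts a new segment.
            segment_index += 1
    return token_type_ids


# A QA-style pair: question and context separated by [SEP].
pair = ["[CLS]", "who", "is", "he", "[SEP]", "he", "is", "bob", "[SEP]"]
print(alternating_token_type_ids(pair))         # [0, 0, 0, 0, 0, 1, 1, 1, 1]

# No [SEP] in the input: every position stays 0.
print(alternating_token_type_ids(["a", "b"]))   # [0, 0]
```

With more than two segments the types keep alternating, so a third segment would be typed 0 again.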

onebula commented 4 years ago

Thanks. Could you point out which part of the code does this?

xuyige commented 4 years ago

Thanks. Could you point out which part of the code does this?

onebula commented 4 years ago

thx