Code for ACL2020 paper: Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
I want to build a model for Chinese; can you explain the masking rule? #10
Closed
zilinly closed 4 years ago
We mask tokens according to BERT's word-piece rule: the first word piece is used as the representation of the whole word.
For Chinese, you can simply treat every character as an effective word piece.
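A minimal sketch of this rule, assuming the input is a list of word pieces from a BERT-style tokenizer that marks continuation pieces with a leading `##` (the function name `first_piece_mask` is illustrative, not from the repo):

```python
def first_piece_mask(pieces):
    """Return a 0/1 mask selecting the first word piece of each whole word.

    BERT's word-piece tokenizer prefixes continuation pieces with "##",
    so a piece without that prefix starts a new word. For Chinese BERT,
    text is tokenized per character, so every piece starts a "word" and
    the mask is all ones.
    """
    return [0 if p.startswith("##") else 1 for p in pieces]


# English: "slot tagging" -> ["slot", "tag", "##ging"]
print(first_piece_mask(["slot", "tag", "##ging"]))  # [1, 1, 0]

# Chinese: each character is its own effective word piece
print(first_piece_mask(["我", "爱", "北", "京"]))  # [1, 1, 1, 1]
```

The all-ones result for the Chinese example is exactly why no special handling is needed: every character's representation is kept.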