What tokenizer do the LSTM pre-trained models use?

dbiir / UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

https://github.com/dbiir/UER-py/wiki

Apache License 2.0

2.97k stars, 528 forks

What tokenizer do the LSTM pre-trained models use? #346

Open. wanyuks opened this issue 1 year ago

zhezhaoa commented 1 year ago

Unless otherwise noted, the Chinese pre-trained models use the BERT tokenizer with models/google_zh_vocab.txt as the vocabulary (the same vocabulary file used in the original BERT project).
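For readers unfamiliar with what "BERT tokenizer" means for Chinese text, here is a minimal sketch of the idea, not UER-py's actual code. It assumes the usual BERT behaviour: each CJK character becomes its own token, and any remaining text is split by greedy longest-match WordPiece against the vocabulary. The function names and `toy_vocab` are hypothetical stand-ins for illustration; in practice the vocabulary would be loaded from models/google_zh_vocab.txt, and lowercasing the input is also an assumption here.

```python
def is_cjk(ch):
    """Return True for characters in the main CJK ideograph ranges."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF) or (0x3400 <= cp <= 0x4DBF) or (0xF900 <= cp <= 0xFAFF)


def basic_tokenize(text):
    """Put spaces around every CJK character, then split on whitespace."""
    spaced = "".join(f" {ch} " if is_cjk(ch) else ch for ch in text)
    return spaced.split()


def wordpiece(token, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one token into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(token):
        end, match = len(token), None
        while start < end:
            sub = token[start:end]
            if start > 0:
                sub = "##" + sub  # continuation pieces carry the ## prefix
            if sub in vocab:
                match = sub
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole token is unknown
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab):
    """Character-level split for CJK, WordPiece for everything else."""
    tokens = []
    for tok in basic_tokenize(text.lower()):  # lowercasing is an assumption
        tokens.extend(wordpiece(tok, vocab))
    return tokens


# Hypothetical toy vocabulary; the real one would come from google_zh_vocab.txt.
toy_vocab = {"[UNK]", "ls", "##tm", "预", "训", "练", "模", "型"}

print(tokenize("LSTM预训练模型", toy_vocab))
```

The practical consequence is that Chinese input is tokenized at the character level, so no separate word segmentation step is needed before feeding text to these models.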