Tencent / NeuralNLP-NeuralClassifier

An Open-source Neural Hierarchical Multi-label Text Classification Toolkit

What is the doc_char dict? #95

Closed. wuyou521 closed this issue 3 years ago

wuyou521 commented 3 years ago

Use dataset to generate dict.
Size of doc_label dict is 7
Size of doc_token dict is 25028
Size of doc_char dict is 34
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 6242
Size of doc_topic dict is 0
Shrink dict over.
Size of doc_label dict is 7
Size of doc_token dict is 10880
Size of doc_char dict is 33
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 6242
Size of doc_topic dict is 0

The doc_char dict size differs between the two reports above: 34 after generation, 33 after shrinking.

wuyou521 commented 3 years ago

This content can be found in dict_rcv1. It is mainly there to help you look up the statistics for each kind of token.
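
The thread does not say why the size drops from 34 to 33. One plausible explanation, offered here as an assumption rather than something confirmed above, is that the shrink step discards entries whose count falls below a minimum threshold. The sketch below illustrates that idea in plain Python. The names `build_count_dict`, `shrink_dict`, `min_count` and `max_size` are hypothetical and are not taken from the toolkit's code.

```python
from collections import Counter


def build_count_dict(texts):
    """Count how often each character occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text)  # iterating a str yields its characters
    return counts


def shrink_dict(counts, min_count=2, max_size=None):
    """Keep entries seen at least min_count times, optionally capped at max_size.

    Hypothetical helper: it only illustrates frequency-based pruning.
    """
    kept = [(key, n) for key, n in counts.most_common() if n >= min_count]
    if max_size is not None:
        kept = kept[:max_size]
    return dict(kept)


# Toy corpus: 'c' and 'z' each occur once, so they are pruned.
texts = ["aab", "abc", "abz"]
char_counts = build_count_dict(texts)
shrunk = shrink_dict(char_counts, min_count=2)

print("Size of char dict is", len(char_counts))
print("Size of char dict after shrink is", len(shrunk))
```

If pruning of this kind is what happens, a single rare character in the dataset would be enough to account for a difference of one entry. Checking the counts stored in the generated dict files would confirm or rule this out.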