tongpi / synthtext100kCH

The Tongpi Chinese synthetic text dataset is a dataset for training natural scene text recognition models.
https://tongpi.github.io/synthtext100kCH/

How can I verify that the generated char_freq.cp file is correct if I use your code to count Arabic character frequencies? #5

Closed: yingning closed this issue 6 years ago

yingning commented 7 years ago

Dear friend, thanks for sharing your code. How can I verify that the generated char_freq.cp file is correct if I use your code to count Arabic character frequencies? @doudoubean @gitter-badger @zj463261929 @wulivicte @ten2net

doudoubean commented 7 years ago

I think the principle is the same, but my code is designed for Chinese. You should take into account that Arabic has its own characteristics. The char_freq.cp file is serialized with cPickle; if you want to validate that the generated char_freq.cp file is correct, you can deserialize it.
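
A minimal sketch of that check, written for Python 3 (where `pickle` replaces `cPickle`). It assumes, without confirmation from the repository, that char_freq.cp holds a dict mapping each character to a raw integer count and that whitespace is not counted. The helper names `load_char_freq` and `mismatches` are hypothetical, and the demo writes its own tiny file instead of reading a real one.

```python
import os
import pickle
import tempfile
from collections import Counter


def load_char_freq(path):
    """Deserialize a pickled character-frequency table."""
    # Only unpickle files you generated yourself: pickle can run arbitrary code.
    with open(path, "rb") as f:
        # encoding="latin1" lets Python 3 load 8-bit strings pickled by Python 2.
        return pickle.load(f, encoding="latin1")


def mismatches(freq, text):
    """Return {char: (stored, recounted)} for every count that disagrees."""
    # ASSUMPTION: the table stores raw counts and skips whitespace.
    recount = Counter(ch for ch in text if not ch.isspace())
    keys = set(freq) | set(recount)
    return {
        k: (freq.get(k, 0), recount.get(k, 0))
        for k in keys
        if freq.get(k, 0) != recount.get(k, 0)
    }


# Demo on a small Arabic sample (seen, lam, alef, meem, repeated twice).
sample = "\u0633\u0644\u0627\u0645 \u0633\u0644\u0627\u0645"
path = os.path.join(tempfile.mkdtemp(), "char_freq.cp")
with open(path, "wb") as f:
    pickle.dump(dict(Counter(ch for ch in sample if not ch.isspace())), f)

freq = load_char_freq(path)

# Print the most frequent characters with their code points for inspection.
for ch, count in sorted(freq.items(), key=lambda kv: -kv[1])[:10]:
    print("U+%04X" % ord(ch), ch, count)

# An empty dict means the stored table matches an independent recount.
print(mismatches(freq, sample))
```

For Arabic, it is worth checking in the printed code points whether diacritics and contextual letter forms appear as separate entries, since that depends on how the source corpus was encoded and normalized.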