Open DonaldTsang opened 4 years ago
Where is the source text dataset for the n-grams of those 55 languages? I would like to see whether it differs from the use of UDHR in https://github.com/wooorm/franc/issues/78, and whether it is more accurate.
Apparently it uses Wikipedia, but it does not say how.