Open bakhbyergyen opened 2 years ago
Hi, I wanted to ask why the zh and ja datasets are split by character rather than word by word. When building a dataset, couldn't the sentences be split by words instead of characters? Thank you.