Open c01o opened 2 years ago
Some of this can be addressed in locales.json, but it needs additional attention in the Block.js file as well. I've got an open issue for Persian that will clean up most of that logic to make it more flexible, and I suspect Japanese will be easier to address after that. I'll definitely need contribution help for it, though. Determining how to split based on kanji vs hiragana/katakana will be tricky.
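To illustrate why splitting on kanji vs hiragana/katakana is tricky, here is a minimal sketch that groups consecutive characters of the same script into runs. The helpers `scriptOf` and `splitByScript` are hypothetical names used only for this illustration and are not part of stutter; the sketch assumes a JavaScript engine that supports Unicode property escapes in regular expressions.

```javascript
// Hypothetical helper: classify a single character by its Unicode script.
function scriptOf(ch) {
  if (/\p{Script=Han}/u.test(ch)) return "kanji";
  if (/\p{Script=Hiragana}/u.test(ch)) return "hiragana";
  if (/\p{Script=Katakana}/u.test(ch)) return "katakana";
  return "other";
}

// Hypothetical helper: start a new chunk whenever the script changes.
function splitByScript(text) {
  const runs = [];
  let previous = null;
  for (const ch of text) {
    const script = scriptOf(ch);
    if (script === previous) {
      runs[runs.length - 1] += ch; // same script: extend the current run
    } else {
      runs.push(ch); // script changed: begin a new run
    }
    previous = script;
  }
  return runs;
}

console.log(splitByScript("今日はいい天気"));
// [ '今日', 'はいい', '天気' ]
```

Note that the particle は and the adjective いい end up merged into one run because both are hiragana, so script transitions alone do not give usable word or phrase boundaries.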
TBH, I highly doubt that implementing a Japanese phrase (文節) detector will pay off, and I suggest using an existing library instead.
Currently stutter uses
/[\n\r\s]+/
as a delimiter, so languages that do not separate words with whitespace, such as Japanese, are unusable. It seems google/budoux would work for at least Japanese, but since stutter finds word boundaries dynamically, adopting it would require some breaking changes.
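A small sketch of the problem, using the delimiter quoted above. The `splitWords` helper is a hypothetical stand-in for illustration and is not stutter's actual code:

```javascript
// The whitespace delimiter quoted in this issue.
const DELIMITER = /[\n\r\s]+/;

// Hypothetical helper: split text on the delimiter and drop empty pieces.
function splitWords(text) {
  return text.split(DELIMITER).filter((word) => word.length > 0);
}

const english = splitWords("The quick brown fox");
const japanese = splitWords("今日はいい天気ですね。");

console.log(english.length);  // 4
console.log(japanese.length); // 1 (the whole sentence is treated as one word)
```

Because the Japanese sentence contains no whitespace, it comes back as a single token, so it cannot be displayed word by word.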