Open dlinx opened 2 years ago
This issue occurs when converting a sentence that contains one or more ・
(U+30FB, katakana middle dot) characters. As a workaround, I replace this character with ·
(U+00B7, middle dot) for the duration of the conversion, and I no longer hit the problem.
Here is a minimal example that reproduces the problem (I encountered it in furigana
mode, but it might occur in other modes too):
const Kuroshiro = require("kuroshiro");
const KuromojiAnalyzer = require("kuroshiro-analyzer-kuromoji");

const sample = async () => {
  // sentence1 uses · (U+00B7); sentence2 uses ・ (U+30FB).
  const sentence1 = "映画『ジュラシック·パーク』の恐竜は本物そっくりだ。";
  const sentence2 = "映画『ジュラシック・パーク』の恐竜は本物そっくりだ。";
  const kuroshiro = new Kuroshiro();
  await kuroshiro.init(new KuromojiAnalyzer());
  // convert() returns a Promise, so await it to surface any error here.
  await kuroshiro.convert(sentence1, { mode: "furigana", to: "hiragana" }); // Does not throw
  await kuroshiro.convert(sentence2, { mode: "furigana", to: "hiragana" }); // Throws
};

// Print the error instead of leaving an unhandled rejection.
sample().catch(console.error);
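The two sentences look nearly identical on screen, so it can help to confirm that they differ only in the dot character. The following is a small sketch for that check; `diffCodePoints` and `toHex` are hypothetical helper names, not part of kuroshiro.

```javascript
// The two dot characters discussed above, written as escapes to avoid ambiguity.
const MIDDLE_DOT = "\u00B7";
const KATAKANA_MIDDLE_DOT = "\u30FB";

const sentence1 = `映画『ジュラシック${MIDDLE_DOT}パーク』の恐竜は本物そっくりだ。`;
const sentence2 = `映画『ジュラシック${KATAKANA_MIDDLE_DOT}パーク』の恐竜は本物そっくりだ。`;

// Format a single character as U+XXXX (hypothetical helper).
const toHex = (ch) =>
  "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");

// Return every position where the two strings differ, with both code points.
function diffCodePoints(a, b) {
  const charsA = Array.from(a);
  const charsB = Array.from(b);
  const diffs = [];
  const length = Math.max(charsA.length, charsB.length);
  for (let i = 0; i < length; i++) {
    if (charsA[i] !== charsB[i]) {
      diffs.push({
        index: i,
        a: charsA[i] === undefined ? null : toHex(charsA[i]),
        b: charsB[i] === undefined ? null : toHex(charsB[i]),
      });
    }
  }
  return diffs;
}

console.log(diffCodePoints(sentence1, sentence2));
```

If the only difference reported is U+00B7 versus U+30FB, the dot character is the only thing that distinguishes the working input from the failing one.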
You could use two helper functions to convert back and forth:
// Swap ・ (U+30FB) for · (U+00B7) before conversion, then swap back afterwards.
const sanitizeJapaneseSentence = (sentence: string) => sentence.replace(/・/g, '·');
const unsanitizeJapaneseSentence = (sentence: string) => sentence.replace(/·/g, '・');
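As a rough sketch of how the two helpers might be combined, the wrapper below sanitizes the input, runs the conversion, and then restores the original dot. The name `convertWithMiddleDotWorkaround` is hypothetical, and `convert` is assumed to be any async function taking `(sentence, options)`, for example `(s, o) => kuroshiro.convert(s, o)`.

```javascript
const MIDDLE_DOT = "\u00B7"; // ·
const KATAKANA_MIDDLE_DOT = "\u30FB"; // ・

// Plain JavaScript versions of the two helpers.
const sanitizeJapaneseSentence = (sentence) =>
  sentence.replace(/\u30FB/g, MIDDLE_DOT);
const unsanitizeJapaneseSentence = (sentence) =>
  sentence.replace(/\u00B7/g, KATAKANA_MIDDLE_DOT);

// Hypothetical wrapper: `convert` is supplied by the caller, so this
// sketch does not depend on any particular kuroshiro version.
async function convertWithMiddleDotWorkaround(convert, sentence, options) {
  const converted = await convert(sanitizeJapaneseSentence(sentence), options);
  return unsanitizeJapaneseSentence(converted);
}

// Usage, assuming an initialized kuroshiro instance:
// const result = await convertWithMiddleDotWorkaround(
//   (s, o) => kuroshiro.convert(s, o),
//   sentence2,
//   { mode: "furigana", to: "hiragana" }
// );
```

One caveat with this approach: the reverse step turns every · in the output into ・, including any · that was already present in the original input.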
Hope this can help!
I am getting the following error while using kuroshiro, but only in some cases. About 90% of the time it does not throw any error. I do not have the input needed to reproduce this case.
Stacktrace