Open yhown589 opened 1 month ago
Yes, I've known about this issue for several months, and it's included in my task list (the first issue).
It occurs when the recognizer produces a different word segmentation than the one that's used to convert a word timeline to a sentence / segment timeline (which typically uses `cldr-segmentation` by default, `jieba-wasm` for Chinese, `kuromoji` for Japanese).
It most often occurs when a dot character in the middle of a word is parsed as a sentence separator in one approach, and as a native part of the word in the other. For example, `EH216.S` is parsed as a single word by the recognizer, but as two words in the other approach, where the `.` is treated as a sentence separator.
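To make the mismatch concrete, here is a minimal sketch. Both functions are hypothetical stand-ins written for illustration: `recognizerWords` is a plain whitespace tokenizer and `naiveSentences` is a deliberately simplistic splitter. Neither is the recognizer's or `cldr-segmentation`'s actual logic. It only shows how a dot inside `EH216.S` can make the two word lists diverge.

```typescript
// Hypothetical stand-in for the recognizer's word segmentation:
// split on whitespace and strip trailing punctuation, keeping inner dots.
function recognizerWords(text: string): string[] {
  return text
    .split(/\s+/)
    .map((token) => token.replace(/[.,!?]+$/, ""))
    .filter((token) => token.length > 0);
}

// Hypothetical stand-in for a sentence segmenter (an assumption, not the real
// library behavior): break after any "." that is followed by optional
// whitespace and an uppercase letter.
function naiveSentences(text: string): string[] {
  return text
    .replace(/\.\s*(?=[A-Z])/g, ".\n")
    .split("\n")
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0);
}

// Words obtained by first splitting into sentences, then into words.
function wordsViaSentences(text: string): string[] {
  let words: string[] = [];
  for (const sentence of naiveSentences(text)) {
    words = words.concat(recognizerWords(sentence));
  }
  return words;
}

// Index of the first position where the two word lists disagree, or -1.
function firstMismatch(a: string[], b: string[]): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : n;
}

const sample = "The EH216.S is the first flying car.";
const fromRecognizer = recognizerWords(sample);
const fromSentences = wordsViaSentences(sample);

console.log(fromRecognizer); // "EH216.S" stays a single word
console.log(fromSentences);  // "EH216" and "S" end up in different sentences
console.log(firstMismatch(fromRecognizer, fromSentences));
```

Once the two lists disagree at some index, any step that maps word timeline entries onto sentences by position can no longer line them up, which is consistent with the failure described above.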
I'll need to come up with a plan to prevent this from occurring. I'm not sure exactly how at the moment.
After aligning the audio and proceeding to the "Extract timeline for part" step, an error was thrown:
I stepped into the method that threw the error and found:
I guess it may be caused by how a word is defined during recognition.
text:
That allows the firm to start mass production. The EH216.S is the first flying car to receive such regulatory approval anywhere in the world. Ehang has competition in China, an EV toll from Autoflight, a Shanghai based firm, obtained a type certificate from the CAAC in March, signifying approval of its design.