Inconsistency between direct source execution and command line execution for Chinese subtitle generation

sumanit commented 6 months ago

When executing the subtitle generation logic directly from the source code, it correctly segments sentences based on punctuation, and there are no spaces between the characters, which is the expected behavior for Chinese text. However, when running the same logic via the command line interface, the sentence segmentation appears to be inaccurate, and there are unexpected spaces between Chinese characters.

Input: 在忙碌和挑战中，我们的内心有时会感到疲惫。尤其是当我们发现自己脱发时，

Source code output:

00:00:00,100 --> 00:00:01,538 在忙碌和挑战中

00:00:01,625 --> 00:00:04,237 我们的内心有时会感到疲惫

00:00:04,787 --> 00:00:07,225 尤其是当我们发现自己脱发时

Command line output:

00:00:00.100 --> 00:00:03.400 在忙碌和挑战中我们的内心有时会

00:00:03.413 --> 00:00:07.875 感到疲惫尤其是当我们发现自己脱发时心里
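
For reference, the behavior I expect can be sketched roughly as below. This is only an illustration and not edge-tts code: the helper names `split_on_punctuation` and `join_words` and the `PUNCTUATION` set are hypothetical, chosen just to show cues breaking at punctuation and word tokens being joined without spaces.

```python
import re

# Hypothetical set of punctuation marks treated as cue boundaries (an assumption).
PUNCTUATION = "，。！？；：、,.!?;:"


def split_on_punctuation(text: str) -> list[str]:
    """Split text into segments at punctuation marks, dropping the marks."""
    parts = re.split(f"[{re.escape(PUNCTUATION)}]", text)
    return [part.strip() for part in parts if part.strip()]


def join_words(words: list[str]) -> str:
    """Join word tokens with no separator, as expected for Chinese text."""
    return "".join(words)


if __name__ == "__main__":
    text = "在忙碌和挑战中，我们的内心有时会感到疲惫。尤其是当我们发现自己脱发时，"
    for segment in split_on_punctuation(text):
        print(segment)
```

With the input above, each printed segment corresponds to one cue in the source code output.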

rany2 commented 2 months ago

Does this still happen?

rany2 commented 2 months ago

Never mind, I see what you mean now. It's the same issue as https://github.com/rany2/edge-tts/issues/156

rany2 / edge-tts

Inconsistency between direct source execution and command line execution for Chinese subtitle generation #167