m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

AssertionError: non-negative timestamp expected #85

Closed kanjieater closed 1 year ago

kanjieater commented 1 year ago

I'm not sure how a negative timestamp could have been generated, but I seem to have done it 😆.

When I run whisperx "/mnt/d/Editing/Audiobooks/a7/a7.wav" --language Japanese --output_dir "/mnt/d/Editing/Audiobooks/a7/" --model large-v2 --vad_filter --align_model WAV2VEC2_ASR_LARGE_LV60K_960H --hf_token some_token

I get


~~ Transcribing VAD chunk: (06:11:47.932 --> 06:12:14.290) ~~
[00:00.000 --> 00:07.000] ใƒ˜ใƒ“ใ‚ชใƒผใ‚ถใƒปใƒใƒผใ‚ฏใ‚ฝใƒณใ‚’่ฒถใ‚ใŸ ไบบ้–“ใฉใ‚‚ไปŠใฏๅ‹ใก่ช‡ใ‚‹ใŒใ‚ˆใ„ใ‚
[00:07.000 --> 00:16.960] ไธ‰ๅนดไธ‰ๅนด็ตŒใฆใฐๆ™‚ใŒๆบ€ใกใ‚‹ ใใฎๆ™‚ใ“ใๅฅดใ‚‰ใฏๅ–œใณใฎ้ ‚ใ‹ใ‚‰
[00:16.960 --> 00:26.480] ็ตถๆœ›ใฎ่ฐทๅบ•ใซ่ฝใกใ‚‹ใงใ‚ใ‚ใ†ใ‚ˆ ้ ‚ใŒ้ซ˜ใ„ใปใฉใซ่ฐทใฏๆทฑใใชใ‚Š
~~ Transcribing VAD chunk: (06:12:17.902 --> 06:12:44.345) ~~
[00:00.000 --> 00:06.880] ๅฐๅฃฐใŒ่ตทใฃใŸใใฎๅฐๅฃฐใฏๅœฐไธ‹ๆทฑใ ใ‹ใ‚‰ๆฒธใ่ตทใ‚ŠๅœฐไธŠใซๅˆฐ้”ใ™ใ‚‹ๅ‰
[00:06.880 --> 00:13.840] ใซๆถˆๆป…ใ—ใฆไบบ้–“ใŸใกใฎ่€ณใซๅฑŠใ ใ“ใจใฏใชใ‹ใฃใŸใฎใงใ‚ใ‚‹
[00:13.840 --> 00:21.400] ใƒ‘ใƒซใ‚นๆญดไธ‰็™พไบŒๅไธ€ๅนดไนๆœˆไบŒๆ—ฅ ใฎใ“ใจใงใ‚ใฃใŸ
[00:21.400 --> 00:25.440] ใŠๆฅฝใ—ใฟใ„ใŸใ ใ‘ใพใ—ใŸใงใ—ใ‚‡ใ†ใ‹ ใ“ใฎใƒ—ใƒญใ‚ฐใƒฉใƒ ใฏใ‚ชใƒผใƒ‡ใ‚ฃใƒ–ใƒซ
[00:25.440 --> 00:28.280] ใŠๅฑŠใ‘ใ—ใพใ—ใŸ
Performing alignment...
Failed to align segment ("ใ‚ถใƒปใƒใƒผใ‚ฏๆง˜ใฎ้œœในใŸใ‚‹่บซใซไธŽใˆใ‚‰ใ‚Œใ—่ก“ใฎไธ€ใคใ ใ€‚ใ‚ฐใƒผใƒซใ‚คใƒฉใƒ ใƒ„ใจใ„ใ†ใ€็ฉบๆฐ—ใŒ่›‡ใจใชใฃใฆไบบใซๅทปใใคใใ€ใ—ใ‚ๆฎบใ™ใฎใ‚ˆใ€‚ใฉใ†ใ ใ€‚ใŠๆœ›ใฟใชใ‚‰ใ€ๆฑใฎๅ…จ่บซใฎ้ชจใ‚’็ •ใใ€็”ŸใใชใŒใ‚‰ๆ—ฅๅธธใฎใ‚ฏใƒฉใ‚ฒใจใ—ใฆใใ‚Œใ‚ˆใ†ใ‹ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใƒ’ใƒซใƒกใ‚นใฏใŠใฉใ—ใซใ‹ใ‹ใฃใŸใŒใ€ใ„ใ˜ใ‚ƒใใ ใจใ‹ใงใฒใ‚“ใฟใ‚“ใฎใ‚ˆใ†ใซใ‚„ใ›ใ“ใ‘ใŸใ•ใ„ใ—ใ‚‡ใ†ใฏใ€ใตใ‚‹ใˆใ‚ใŒใ‚‹ใ‚ˆใ†ใชใ“ใจใฏใชใ‹ใฃใŸใ€‚ใ€Œใ„ใˆใ€ใฏใˆใฆ็”ณใ—ไธŠใ’ใพใ™ใŒใ€ใ‚†ใ‚ใ”ใŸใ‚“ใใ‚’ใŠใ“ใ•ใ‚Œใพใ™ใŒใ€ใ‚ใŸใใ—ใ‚ใŒใ“ใ†ใ‚„ใฃใฆใงใ‚“ใ‹ใฎใ”ใœใ‚“ใซใ—ใ“ใ†ใ„ใŸใ—ใพใ—ใŸใฎใฏใ€ใงใ‚“ใ‹ใฎใŠใ‚„ใใซใŸใกใŸใ„ใ‹ใ‚‰ใงใ”ใ–ใ„ใพใ™ใ€‚ใ€ใŠๆบœใ‚ๅพก่ฒธใ—ใ‚’"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ™ใใ‚‹ใจใ—ใ€ๆฎฟไธ‹ใฎ็ˆถใ‚’ใŸใ‚‹ใŠๆ–นใฎ่บซใซไฝ•ไบ‹ใŒ็”Ÿใ˜ใŸใ‹ใ€ใ‚ใŸใใ—ใ‚ใฏใ‚ˆใๅญ˜ใ˜ไธŠใ’ใฆใŠใ‚Šใพใ™ใ€‚ใชใ‹ใชใ‹ใซไธ–้–“ใฎๅ™‚ใชใฉใ€ใ‚ใŸใใ—ใ‚ใฎ็Ÿฅใ‚‹ใจใ“ใ‚ใซๅŠใถใ‚‚ใฎใงใฏใ”ใ–ใ„ใพใ›ใ‚“ใ€‚ใ‚ใ–ใจใ‚‰ใ—ใใƒ•ใ‚นใƒฉใƒ–ใŒๅฃใ‚’้–‰ใ–ใ—ใŸใจใใ€ใƒ’ใƒซใƒกใ‚นใฎ่กจๆƒ…ใฏๅฎŒๅ…จใซๅค‰ใฃใฆใ„ใŸใ€‚็„กๆ„่ญ˜ใฎใ†ใกใซๅฝผใฏ่ถณใ‚’็ต„ใ‚€ใฎใ‚’ใ‚„ใ‚ใ€็Ž‰ๅบงใ‹ใ‚‰ๅŠ่บซใ‚’ไน—ใ‚Šๅ‡บใ—ใฆใ„ใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ไฝ•ใ‚’ใ™ใ‚‹!ใ‚ตใƒผใƒ !ๆฎฟไธ‹ใ€ใ“ใ‚„ใคใฏๆœ€ๅˆใƒ•ใ‚นใƒฉใƒ–ๅฟใชใฉใงใฏใ”ใ–ใ„ใพใ›ใ‚“ไฝ•?ใƒ’ใƒซใƒกใ‚นใฎ่ฆ–็ทšใ‚’ๅ—ใ‘ใฆๆœ€ๅˆใƒ•ใ‚นใƒฉใƒ–ใฏ้ฉšใ„ใŸๅฆใ€้ฉšใใตใ‚Šใ‚’่ฃ…ใฃใฆใƒžใƒซใ‚บใƒใƒผใƒณใซๅ‘ผใณใ‹ใ‘ใŸใ“ใ‚Œใฏใ—ใŸใ‚Šใ‚ตใƒผใƒ–ๅฐ†่ป!ใŠใฌใ—ใจใฏใ€็ชฎๅœฐใฎไธญใงใ‚ใ‚‹ใฎใซใชใœใ€ใ“ใฎใ‚ˆใ†ใชไป•ๆ‰“ใกใ‚’ใชใ•ใ‚‹ใฎใ˜ใ‚ƒใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใƒŸใƒซใƒ•ใ‚งใ‚น็Ž‹ๅญใ‚ˆใ€ใ“ใ‚„ใคใฎๅฟ ็พฉ้ขใซใ ใพใ•ใ‚ŒใฎใŒใ‚ˆใ„ใžใ€‚ใ“ใ‚„ใคใฏใ€ใ•ใ‚ใ€ใƒ ใƒผใ‚ขใ€ใ‚ขใƒณใƒ‰ใƒฉใ‚ดใƒฉใ‚น็›ฎใซ้™คไปปใ•ใ‚Œใฆใ€ใƒžใƒซใ‚บใƒใƒผใƒณใฎ่‹ฑ่ทใซๅฐฝใใชใŒใ‚‰ใ€ใ„ใพใงใฏใŠไธปใซไป•ใˆใฆๆ–ฐไปปใ•ใ‚ŒใฆใŠใ‚‹ใ€‚ๅ่ชฌ่€…ใ˜ใ‚ƒใ€‚ๆฌกใฏใŠไธปใ‚’ๆจใฆใฆใ€ใ‚ขใƒณใƒ‰ใƒฉใ‚ดใƒญใ‚นใ‚ใฎใ‚‚ใจใซๅธฐ็”ฃใ™ใ‚‹ใ‹ใ‚‚ใ—ใ‚Œใฌใžใ€‚ไฟกใ˜ใฆใ‚ˆใ„ใฎใ‹ใช?"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ™ในใฆใฏ็Ž‹ใจใซๅ…ฅๅ ดใ—ใฆใ‹ใ‚‰ใ ใ€‚ใ‚ฟใƒใƒŸใƒผใƒใ‚ˆใ€ใ“ใฎใ“ใจใซ้–ขใ‚ใฃใŸใ™ในใฆใฎไบบ้–“ใŒใ€ๅ‚ทๅฃใซๅกฉๆฐดใ‚’ๆตดใณใ›ใ‚‰ใ‚Œใ‚‹ๆ™‚ใŒ่ฟ‘ใฅใ„ใฆใŠใ‚‹ใ€‚ๅŒๅฅ‘็ด„ใฎใƒซใ‚ทใ‚ฟใƒ‹ใ‚ข่ปใŒ้€€ๅ ดใ—ใฆใ‚‚ใ€ใชใ‹ใชใ‹ใซๅ–œๅŠ‡ใฎๅน•ใฏไธ‹ใ‚Šใฌใ‚ใ€‚็งใซใจใฃใฆใฏๅ–œๅŠ‡ใงใฏใ”ใ–ใ„ใพใ›ใ‚“ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใชใ‚“ใ ใ€ไปŠใ•ใ‚‰ใใ‚“ใชใ“ใจใ‚’่จ€ใฃใฆใŠใ‚‹ใฎใ‹ใ€‚ๅคงๅคงไฝฟๆฎฟไธ‹ใฎๅพก่ณ‡่ณชใชใฉใ€ใจใ†ใซไฟบใฏ็Ÿฅใฃใฆใ„ใŸใžใ€‚็Ÿฅใ‚‹ใ“ใจใจไฟกใ˜ใ‚‹ใ“ใจใฏใ€ๅˆฅใฎใ‚‚ใฎใ ใจๆ€ใ†ใชใ€‚ใ‚€ใ‚ใ‚“ใใ†ใ ใจใ‚‚ใ€‚ใŸใจใˆใฐใ€ใŠไธปใฎใ‚ใ‚‹็จฎใฎๆ‰่ƒฝใซๅฏพใ—ใฆใ€ไฟบใŒ็Ÿฅใฃใฆใ„ใ‚‹ใ“ใจใจใ€ใŠไธปใŒไฟกใ˜ใฆใ„ใ‚‹ใ“ใจใฏใ€ใˆใ‚‰ใๅทฎใŒใ‚ใ‚‹ใ‹ใ‚‰ใชใ€‚่จ€ใ„ใŸใ„ใ“ใจใŒใ‚ใ‚‹ใชใ‚‰ใ€ใฏใฃใใ‚Š่จ€ใฃใŸใ‚‰ใฉใ†ใ ใ€ใƒ€ใƒผใƒชใƒฅใƒผใƒณใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ“ใฎๆ™‚ใ€ใ‚ขใƒณใƒ‰ใƒฉใ‚ดใƒฉใ‚นใฎๅฃฐใฏใ‚ใพใ‚ŠใซไฝŽใใ€ใปใจใ‚“ใฉใ•ใ•ใ‚„ใใ‚ˆใ†ใงใ‚ใฃใŸใ€‚็ˆถๆฎบใ—ใงใฏใ‚ใ‚‹ใ€‚ใ ใŒ่จ€ใฃใฆใŠใใžใ€‚็ƒˆ็ฅžใ ใฃใŸใฎใฏใ€ใ‚ˆใ‚ˆใ‚Šๅ…„ใ€ใ‚ชใ‚นใƒญใ‚จใ‚นใฎๆ–นใ ใฃใŸใ€‚ใใ‚Œใ‚‚ๅฝ“็„ถใฎใ“ใจใ€ๅ…„ใฏ่‡ชๅˆ†ใฎๆฐ—ๅ…ˆใ‚’็ˆถ็Ž‹ใซๅฅชใ‚ใ‚ŒใŸใฎใฐใ‹ใ‚Šใ ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใŠใƒผใ„!ใ‚จใ‚ฏใƒใ‚ฟใƒผใƒŠใฎ่ก†!้ฃŸใน็‰ฉใชใ‚‰ใ“ใ“ใซใ‚ใ‚‹ใž!็Ž‹ๅคชๅญใ‚ขใƒซใ‚นใƒฉใƒณๆฎฟไธ‹ใฎใ”ๅ‘ฝไปคใงใช!่ญฐ่ซ–ใ‹ใ‚‰้‹ใ‚“ใงใใŸใฎใ !ใ•ใ‚ใฟใ‚“ใช!ๆ€ใ„ๅˆ‡ใ‚Š้ฃŸใฃใฆ!ไธŠใ‚’่ฆ‹ๅ‡บใ›!"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅคงใ—ใŸใ‚‚ใฎใ ใ€‚ใŠๅคชๅญๆฎฟไธ‹ใฏไธ€ๅคœใซใ—ใฆใ‚จใ‚ฏใƒใ‚ฟใƒผใƒŠใ‚’ๆŽŒๆกใชใ•ใฃใŸใ€‚ใ‚‚ใฏใ‚„ไฝ•่€…ใ‚‚ๆฎฟไธ‹ใฎๆจฉๅ‹ขใ‚’ๆบใ‚‹ใŒใ›ใ‚‹ใ“ใจใฏใงใใพใ„ใ€‚ใพใฃใŸใ่ฆ‹ไบ‹ใชไน—ใฃๅ–ใ‚Šใ ใฃใŸใชใ€‚ใƒŠใƒฉใ‚ตใ‚นๅฟใฏใƒใ‚ทใƒซใ‚ตใƒณใ‚’ๅ‡บใฆ10ใƒถๆœˆใงๅคฉไธ‹ใ‚’ไน—ใฃๅ–ใฃใฆใ—ใพใฃใŸใ€‚ใ‚ฏใƒใƒผใƒ‰ใŒ็‰‡็›ฎใ‚’็ดฐใ‚ใฆ็ฌ‘ใฃใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Traceback (most recent call last):
  File "/home/ke/.pyenv/versions/subgen/bin/whisperx", line 8, in <module>
    sys.exit(cli())
  File "/home/ke/.pyenv/versions/3.9.9/envs/subgen/lib/python3.9/site-packages/whisperx/transcribe.py", line 723, in cli
    write_vtt(result_aligned["segments"], file=vtt)
  File "/home/ke/.pyenv/versions/3.9.9/envs/subgen/lib/python3.9/site-packages/whisperx/utils.py", line 59, in write_vtt
    f"{format_timestamp(segment['start'])} --> {format_timestamp(segment['end'])}\n"
  File "/home/ke/.pyenv/versions/3.9.9/envs/subgen/lib/python3.9/site-packages/whisperx/utils.py", line 34, in format_timestamp
    assert seconds >= 0, "non-negative timestamp expected"
AssertionError: non-negative timestamp expected

I'm happy to share the 6-hour file for testing purposes on request. I tried breaking the audio into a small clip and tested on https://github.com/m-bain/whisperX/issues/84, but unfortunately ended up with other errors.
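The traceback above ends in the `assert seconds >= 0` check inside `format_timestamp`, so at least one aligned segment carries a negative `start` or `end`. A minimal sketch of how one might locate the offending segments before the VTT is written, assuming segments are dicts with `start` and `end` in seconds (as `segment['start']` in the traceback suggests). `find_bad_segments` is a hypothetical helper, not part of whisperX:

```python
from typing import Dict, List, Tuple


def find_bad_segments(segments: List[Dict]) -> List[Tuple[int, Dict]]:
    """Return (index, segment) pairs whose timing cannot be formatted."""
    bad = []
    for i, seg in enumerate(segments):
        start, end = seg.get("start"), seg.get("end")
        # Negative values trip the assertion; missing values would fail too.
        if start is None or end is None or start < 0 or end < 0:
            bad.append((i, seg))
    return bad


# Made-up data for illustration only
segments = [
    {"start": 0.0, "end": 7.0, "text": "ok"},
    {"start": -1.5, "end": 2.0, "text": "negative start"},
]
bad = find_bad_segments(segments)
for index, seg in bad:
    print(f"segment {index}: start={seg['start']} end={seg['end']}")
```

Running something like this on `result_aligned["segments"]` would show whether the negative values come from one segment or many, which may help narrow down the alignment step that produced them.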

kanjieater commented 1 year ago

1.zip Managed to reproduce easily with a smaller test file:

~~ Transcribing VAD chunk: (19:18.106 --> 19:41.140) ~~
[00:00.000 --> 00:09.660] ใ‚นใ‚ฏใƒผใƒซใง้ฃŸในใ‚‹ใ‚ˆใ†ใซไฝœใฃใŸใŠๅผๅฝ“ใฏใ˜ใ‚ƒใ‚ๅฎถใง้ฃŸในใ‚‹ใฎใญ ใใ“็ฝฎใ„ใฆใŠใใ‹ใ‚‰้ฃŸในใ‚‰ใ‚Œใใ†ใชใ‚‰้ฃŸในใฆ
[00:09.660 --> 00:15.680] ๅฟƒใฎ็›ฎใ‚’่ฆ‹ใš ่‡ชๅˆ†ใฎๆœใฎๆ”ฏๅบฆใ‚’ๅง‹ใ‚ใ‚‹
[00:15.680 --> 00:22.200] ใŠ็ˆถใ•ใ‚“ใŒใ„ใฆใใ‚ŒใŸใ‚‰ ๅฐ‘ใ—ใฏใ‹ใฐใฃใฆใใ‚ŒใŸใ‹ใ‚‚ใ—ใ‚Œใชใ„ใฎใซ
[00:22.200 --> 00:24.200] ่‹ฆใ—ใใชใฃใŸใ€‚
~~ Transcribing VAD chunk: (19:43.385 --> 20:12.747) ~~
[00:00.000 --> 00:15.840] ๅ‹ๅƒใใฎไธก่ฆชใฎใ†ใกใ€ใŠ็ˆถใ•ใ‚“ใฎไผš็คพใฎๆ–นใŒ้€šๅ‹คใ™ใ‚‹ใฎใซ้ ใ„ใ‹ใ‚‰ใ€ใใฎๅˆ†ๆœใŒๆ—ฉใ„ใ€ๅฟƒใŒ่ตทใใ‚‹้ ƒใซใฏใ€ใ‚‚ใ†ใ„ใชใ„ใ“ใจใŒใปใจใ‚“ใฉใ ใ€‚
[00:16.960 --> 00:23.840] ใใฎใพใพใงใ„ใ‚‹ใจๆ€’ใ‚‰ใ‚Œใ‚‹ใ‹ใ‚‚ใ—ใ‚Œใชใ„ใ‹ใ‚‰ใ€้ป™ใฃใŸใพใพ้šŽๆฎตใ‚’ไธŠใ‚‹ใ€‚
[00:23.840 --> 00:29.840] ่ƒŒๅพŒใ‹ใ‚‰่ฟฝใ„่จŽใกใฎใ‚ˆใ†ใซใŸใ‚ๆฏใŒ่žใ“ใˆใŸใ€‚
Performing alignment...
Failed to align segment ("ใฟใ‚“ใชใฎ็Ÿฅใ‚‰ใชใ„ใจใ“ใ‚ใงใ€็งใŸใกใฏใ€ใ‚‚ใ†ใ€ๅ‹้”ใ€‚็งใซใ€็‰นๅˆฅใชใ“ใจใŒไฝ•ใซใ‚‚ใชใใฆใ‚‚ใ€็งใŒใ€้‹ๅ‹•็ฅž็ตŒใŒ็‰นๅˆฅ่‰ฏใใชใใฆใ‚‚ใ€้ ญใŒ่‰ฏใใชใใฆใ‚‚ใ€็งใซใ€ใฟใ‚“ใชใŒ็พจใพใ—ใŒใ‚‹ใ‚ˆใ†ใช้•ทๆ‰€ใŒใ€ๆœฌๅฝ“ใซไฝ•ใซใ‚‚ใชใใฆใ‚‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅ›ฝ้“ๆฒฟใ„ใฎใ‚นใƒผใƒ‘ใƒผใพใงใฏ่ท้›ขใŒใ‚ใฃใฆใ€่ปŠใŒใชใ‘ใ‚Œใฐใชใ‹ใชใ‹่กŒใ‘ใชใ„ใ›ใ„ใ‹ใ€ๅฟƒใฎๅฐใ•ใ„้ ƒใ‹ใ‚‰ใ€้€ฑใซไธ€ๅบฆใ€ใ†ใกใฎ่ฃใซใ‚ใ‚‹ๅ…ฌๅœ’ใซไธ‰ๆฒณ่ฃฝ่“ใฎ่ปŠใŒใ‚„ใฃใฆใใ‚‹ใ€‚่ฟ‘ๆ‰€ใซไฝใ‚€ใŠๅนดๅฏ„ใ‚Šใ‚„ใ€ๅฐใ•ใชๅญไพ›ใ‚’้€ฃใ‚ŒใŸใŠๆฏใ•ใ‚“ใŒใ€ใ“ใฎๆ›ฒใ‚’่žใ„ใฆ่ฒทใ„็‰ฉใซใ‚„ใฃใฆใใ‚‹ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅคงใใช้Ÿณๆฅฝใ‚’้Ÿฟใ‹ใ›ใ‚‹ใ‚นใƒ”ใƒผใ‚ซใƒผใŒใ†ใ‚‹ใ•ใ„ใจ่‹ฆๆƒ…ใ‚’่จ€ใ†ไบบใ‚‚ใ„ใฆใ€้จ’้Ÿณๅ•้กŒใซใชใฃใฆใ„ใ‚‹ใ€ใจใ‚‚ใ€‚้จ’้Ÿณโ€ฆใจใพใงใฏๆ€ใ‚ใชใ„ใ‘ใฉใ€ๅฟƒใ‚‚ใ“ใฎ้Ÿณใ‚’่žใใจใ€ๅฑ…ๅˆใชใใ€ไปŠใŒๅนณๆ—ฅใฎๆ˜ผ้–“ใ ใจใ„ใ†ใ“ใจใ‚’ๆ„่ญ˜ใ™ใ‚‹ใ€‚ๆ„่ญ˜ใ•ใ›ใ‚‰ใ‚Œใฆใ—ใพใ†ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅญไพ›ใŒ็ฌ‘ใ†ๅฃฐใŒ่žใ“ใˆใŸใ€‚ๅนณๆ—ฅๅˆๅ‰ไธญใฎๅไธ€ๆ™‚ใจใ„ใ†ใฎใŒใ€ใ“ใ†ใ„ใ†ๆ™‚้–“ใชใ‚“ใ ใจใ„ใ†ใ“ใจใ‚’ๅฟƒใฏใ€ๅญฆๆ กใ‚’ไผ‘ใ‚€ใ‚ˆใ†ใซใชใฃใฆๅˆใ‚ใฆ็ŸฅใฃใŸใ€‚ไธ‰ๆฒณๆ˜Ÿๅฎถใฎ่ปŠใฏๅฟƒใซใจใฃใฆๅฐๅญฆๆ กใฎ้ ƒใ‹ใ‚‰ๅคไผ‘ใฟใ‚„ๅ†ฌไผ‘ใฟใซ่ฆ‹ใ‹ใ‘ใ‚‹ใ‚‚ใฎใ ใฃใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ“ใ‚“ใช้ขจใซใ‚ซใƒผใƒ†ใƒณใ‚’ๆ•ทใ„ใฆใ€้ƒจๅฑ‹ใง่บซใ‚’ๅ›บใใ—ใฆใ„ใ‚‹ๅนณๆ—ฅใซ่ฆ‹ใ‚‹ใ‚‚ใฎใงใฏใชใ‹ใฃใŸใ€ๅŽปๅนดใพใงใฏใ€‚ๅฟƒใฏๆฏใ‚’ๆฎบใ—ใฆใ€้Ÿณใ‚’็ตžใฃใŸใƒ†ใƒฌใƒ“ใ‚’่ฆ‹ใชใŒใ‚‰ใ€ใใฎๆ˜Žใ‹ใ‚ŠใŒๅค–ใซๆผใ‚Œใฆใ„ใชใ‘ใ‚Œใฐใ„ใ„ใชใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ไธ‰ๆฒณ่–็ซใŒๆฅใชใใฆใ‚‚ใ€ๅฟƒใฎ้ƒจๅฑ‹ใฎๅ‘ใ“ใ†ใซ่ฆ‹ใˆใ‚‹ๅ…ฌๅœ’ใซใฏใ€ใ„ใคใ‚‚่ฟ‘ๆ‰€ใฎ่‹ฅใ„ใŠๆฏใ•ใ‚“ใŸใกใŒๅญไพ›ใ‚’้Šใฐใ›ใซๆฅใฆใ„ใ‚‹ใ€‚่‰ฒใจใ‚Šใฉใ‚Šใฎใƒใƒƒใ‚ฐใ‚’ใƒใƒณใƒ‰ใƒซใฎใจใ“ใ‚ใซใ‹ใ‘ใŸใƒ™ใƒ“ใƒผใ‚ซใƒผใŒใƒ™ใƒณใƒใฎใใฐใซไธฆใ‚“ใงใ„ใ‚‹ใฎใ‚’่ฆ‹ใ‚‹ใจใ€ใ‚ใ€ๅˆๅ‰ไธญใ‚‚ใ‚ใจใกใ‚‡ใฃใจใ ใ€ใจๆ€ใ†ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅๆ™‚ใ‹ใ‚‰ๅไธ€ๆ™‚ใใ‚‰ใ„ใซใ‹ใ‘ใฆ้›†ใพใ‚Šๅง‹ใ‚ใŸ่ฆชๅญใŸใกใฏใ€ๅไบŒๆ™‚ใซใฏใŠๆ˜ผใ”้ฃฏใฎใŸใ‚ใซใฟใ‚“ใชไธ€ๆ—ฆใใ“ใ‹ใ‚‰ใ„ใชใใชใ‚‹ใ€‚ใใ†ใ—ใŸใ‚‰ใ€ๅฐ‘ใ—ใ‚ซใƒผใƒ†ใƒณใŒ้–‹ใ‘ใ‚‰ใ‚Œใ‚‹ใ€‚ใ‚ซใƒผใƒ†ใƒณใฎๅธƒๅœฐใฎๆทกใ„ใ‚ชใƒฌใƒณใ‚ธ่‰ฒใ‚’้€šใ—ใ€ๆ˜ผใงใ‚‚ใใ™ใ‚“ใ ใ‚ˆใ†ใซใชใฃใŸ้ƒจๅฑ‹ใฏใšใฃใจ้Žใ”ใ—ใฆใ„ใ‚‹ใจใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅ…จ้ƒจใ€ใใ†ใ—ใŸใปใ†ใŒใ„ใ„็†็”ฑใŒใใกใ‚“ใจใ‚ใ‚‹ใ€‚ๆœใฏใ‚ซใƒผใƒ†ใƒณใ‚’้–‹ใ‘ใชใ•ใ„ใ€ใ ใจใ‹ใ€ๅญฆๆ กใซใฏๅญไพ›ใฏใฟใ‚“ใช่กŒใ‹ใชใ‘ใ‚Œใฐใชใ‚‰ใชใ„ใ€ใ ใจใ‹ใ€‚ใŠใจใจใ„ใ€ใŠๆฏใ•ใ‚“ใจ่ฆ‹ๅญฆใซ่กŒใฃใŸใ‚นใ‚ฏใƒผใƒซใซใ€ไปŠๆ—ฅใ‹ใ‚‰ๆœฌๅฝ“ใซ่กŒใ‘ใ‚‹ๆฐ—ใŒใ—ใฆใ„ใŸใ€‚ใ ใ‘ใฉโ€ฆ"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๆœ่ตทใใŸใ‚‰ใƒ€ใƒกใ ใฃใŸใ€‚ใ„ใคใ‚‚ใฎใ‚ˆใ†ใซใŠ่…นใŒ็—›ใ„ใ€‚ใ‘ใณใ‚‡ใ†ใ˜ใ‚ƒใชใ„ใ€ๆœฌๅฝ“ใซ็—›ใ„ใ€‚ใฉใ†ใ—ใฆใ‹ใ‚ใ‹ใ‚‰ใชใ‹ใฃใŸใ€‚ๆœใ€ๅญฆๆ กใซ่กŒใๆ™‚้–“ใซใชใ‚‹ใจใ€ใ‘ใณใ‚‡ใ†ใ˜ใ‚ƒใชใ„ใฎใซใ€ๆœฌๅฝ“ใซใŠ่…นใ‚„ๆ™‚ใซใฏ้ ญใ‚‚็—›ใใชใ‚‹ใฎใ ใ€‚็„ก็†ใ—ใชใใฆใ„ใ„ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใƒ›ใƒƒใƒˆใƒŸใƒซใ‚ฏใจใƒˆใƒผใ‚นใƒˆใ‚’็”จๆ„ใ—ใฆใ„ใŸใŠๆฏใ•ใ‚“ใŒใ€ๅฟƒใฎๅฃฐใ‚’่žใ„ใฆ้œฒ้ชจใซ่กจๆƒ…ใ‚’ใชใใ—ใŸใ€‚้ป™ใฃใŸใ€‚ๅฟƒใ‚’่ฆ‹ใชใ„ใ€‚ใพใ‚‹ใงๅฟƒใฎๅฃฐใŒ่žใ“ใˆใชใ‹ใฃใŸใ‚ˆใ†ใซไฟฏใ„ใฆใ€ๆนฏๆฐ—ใ‚’็ซ‹ใฆใ‚‹ใƒžใ‚ฐใ‚ซใƒƒใƒ—ใ‚’้ฃŸๅ“ใซ้‹ใถใ€‚ใใฎใพใพใ€ใ†ใ‚“ใ–ใ‚Šใ—ใŸใ‚ˆใ†ใชๅฃฐใŒโ€ฆ"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ‚นใ‚ฏใƒผใƒซใฏๅญฆๆ กใ˜ใ‚ƒใชใ„ใฎใ‚ˆใ€‚ๆฏŽๆ—ฅใ˜ใ‚ƒใชใ„ใ—ใ€ๆฅใฆใ‚‹ไบบๆ•ฐใ‚‚ๅญฆๆ กใ‚ˆใ‚Šๅฐ‘ใชใ„ใ—ใ€‚ๅ…ˆ็”Ÿใ‚‚่‰ฏใ„ไบบใใ†ใ ใฃใŸใงใ—ใ‚‡ใ†ใ€‚่กŒใใฃใฆๅฟƒใŒ่จ€ใฃใŸใ‚“ใงใ—ใ‚‡ใ†ใ€‚ใฉใ†ใ™ใ‚‹ใฎ?่กŒใ‹ใชใ„ใฎ?ใ€ใƒคใƒ„ใ‚ฎใƒใƒคใซ่ฒฌใ‚ใ‚‰ใ‚Œใ‚‹ใ‚ˆใ†ใซ่จ€ใ‚ใ‚Œใ‚‹ใจใ€ใ‚ใ‚ใ€ใŠๆฏใ•ใ‚“ใฏ่จ€ใฃใฆๆฌฒใ—ใ„ใ‚“ใ ใ€ใจใ‚ใ‹ใ‚‹ใ€‚ใ ใ‘ใฉใ€้•ใ†ใ€‚่กŒใใŸใใชใ„ใ‚“ใ˜ใ‚ƒใชใ„ใฎใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ‘ใณใ‚‡ใ†ใ˜ใ‚ƒใชใ„ใ€‚ใปใ‚“ใจใ†ใซใŠใชใ‹ใŒใ„ใŸใ„ใ€‚ใ“ใ“ใ‚ใŒใ“ใŸใˆใชใ„ใงใ„ใ‚‹ใจใ€ใŠใ‹ใ‚ใ•ใ‚“ใŒใ„ใ‚‰ใ„ใ‚‰ใ—ใŸใ‚ˆใ†ใซใ€ใใ‚…ใ†ใซใจใ‘ใ‚’ใใซใ—ใ ใ™ใ€‚ใ€Œใฏใ‚ใ€ใ‚‚ใ†ใ“ใ‚“ใชใ˜ใ‹ใ‚“ใ€‚ใ€ใจใ€ใ—ใŸใ†ใกใ‚’ใ™ใ‚‹ใ€‚ใ€Œใฉใ†ใ™ใ‚‹ใฎ?ใ€่ถณใŒๅ›บใพใฃใŸใ‚ˆใ†ใซใชใฃใฆๅ‹•ใ‘ใชใ„"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ไปŠๆ—ฅใฏ่กŒใ‘ใชใ„ใ‘ใฉใ€ๆฌกใซใ‚นใ‚ฏใƒผใƒซใŒใ‚ใ‚‹ๆ—ฅใซใพใŸใŠ่…นใŒ็—›ใใชใ‚‹ใ‹ใฉใ†ใ‹ใชใ‚“ใฆใ‚ใ‹ใ‚‰ใชใ„ใ€‚ใ‘ใณใ‚‡ใ†ใ˜ใ‚ƒใชใใฆใ€ๆœฌๅฝ“ใซ็—›ใ„ใ‹ใ‚‰ใŸใ ่กŒใ‘ใชใ„ใ ใ‘ใชใฎใซใ€ใ“ใ‚“ใช็†ไธๅฐฝใชใ“ใจใ‚’่žใ‹ใ‚Œใ‚‹ใชใ‚“ใฆใ€ใจๆ‚ฒใ—ใใชใฃใฆใใ‚‹ใ€‚็ญ”ใˆใชใ„ใพใพใŠๆฏใ•ใ‚“ใ‚’่ฆ‹ใฆใ„ใ‚‹ใจใ€ใŠๆฏใ•ใ‚“ใŒ"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅฐๆ‰€ใซใƒŸใƒซใ‚ฏใฎๆนฏๆฐ—ใŒใตใ‚ใฃใจๅคงใใไธŠใŒใฃใฆใ€ใ™ใใซๆฐด้Ÿณใจใจใ‚‚ใซๆถˆใˆใŸใ€‚ๆœฌๅฝ“ใฏๅพŒใง้ฃŸในใ‚ˆใ†ใจๆ€ใฃใฆใ„ใŸใ‘ใฉใ€็ญ”ใˆใ‚‹ๆš‡ใ‚‚ใชใ‹ใฃใŸใ€‚ใƒ‰ใ‚ขใฎๅ‰ใงใƒ‘ใ‚ธใƒฃใƒžๅงฟใฎใพใพๅ‹•ใ‘ใชใ„ๅฟƒใ‚’็„ก่ฆ–ใ™ใ‚‹ใ‚ˆใ†ใซใ€Œใกใ‚‡ใฃใจใฉใ„ใฆใ€‚ใ€ใจ้€šใ‚ŠๆŠœใ‘ใŸใŠๆฏใ•ใ‚“ใŒๅฅฅใฎใƒชใƒ“ใƒณใ‚ฐใซๆถˆใˆใ‚‹ใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใ™ใใซใฉใ“ใ‹ใซ้›ป่ฉฑใ™ใ‚‹ๅฃฐใŒ่žใ“ใˆใฆใใŸใ€‚ใ‚ใ€ใ™ใ„ใพใ›ใ‚“ใ€‚ๅฎ‰่ฅฟใงใ™ใ‘ใ‚Œใฉใ‚‚ใ€‚ใจใ€ใใ‚ŒใพใงใฎไธๆฉŸๅซŒใ‚’ๅฏใ“ใใŽๆ‹ญใฃใŸใ‚ˆใ†ใชใ€ใ‚ˆใใ‚†ใใฎๅฃฐใŒ่žใ“ใˆใฆใใ‚‹ใ€‚ใˆใˆใใ†ใชใ‚“ใงใ™ใ‚ˆใŠ่…นใŒ็—›ใ„ใจ่จ€ใ„ๅ‡บใ—ใฆ"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("็”ณใ—่จณใ‚ใ‚Šใพใ›ใ‚“ใ€‚่ฆ‹ๅญฆใฎๆ™‚ใซใฏใ€ใ‚ใฎๅญใฎๆ–นใŒ่กŒใใŸใ„ใฃใฆไน—ใ‚Šๆฐ—ใ ใฃใŸใ‚“ใงใ™ใ‘ใฉใ€‚ใฏใ„ใ€ใฏใ„ใ€ๆœฌๅฝ“ใซใ”่ฟทๆƒ‘ใŠใ‹ใ‘ใ—ใฆใ€‚ใŠๆฏใ•ใ‚“ใŒใ‚ณใ‚ณใƒญใ‚’้€ฃใ‚Œใฆ่กŒใฃใฆใใ‚ŒใŸใ‚นใ‚ฏใƒผใƒซใฏใ€ใ‚ณใ‚ณใƒญใฎๆ•™ๅฎคใจใ„ใ†ใจใ“ใ‚ใ ใฃใŸใ€‚ๅ…ฅใ‚ŠๅฃใซๆŽ›ใ‹ใฃใŸ็œ‹ๆฟใฎไธŠใซใ€ๅญไพ›่‚ฒๆˆๆ”ฏๆดใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅฟƒใฎ็€ฌใ‚’ใƒใƒณใจๆŠผใ™ใ€‚ใ“ใ“ใฎๅๅ‰ใŒใ€ๅฟƒใฎๆ•™ๅฎคใชใฎใŒใ€ใชใ‚“ใ ใ‹็”ณใ—่จณใชใ‹ใฃใŸใ€‚ๅฟƒใจๅŒใ˜ๅๅ‰ใ€‚ใŠๆฏใ•ใ‚“ใ ใฃใฆๆฐ—ใฅใ„ใฆใ„ใ‚‹ใ ใ‚ใ†ใ€‚ใŠๆฏใ•ใ‚“ใฏใ€ใ“ใ“ใซ่‡ชๅˆ†ใ‚’้€ฃใ‚Œใฆใใ‚‹ใŸใ‚ใซใ€ๅจ˜ใซใ“ใฎๅๅ‰ใ‚’ใคใ‘ใŸใ‚ใ‘ใ˜ใ‚ƒใชใ„ใฎใซใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("่ƒธใŒใ‚ฎใƒฅใƒƒใจ็—›ใ‚“ใ ใ€‚ไธ็™ปๆ กใจๅ‘ผใฐใ‚Œใ‚‹ๅญไพ›ใŒใ€ๅญฆๆ กใฎไป–ใซ้€šใ†ๅ ดๆ‰€ใŒใ‚ใ‚‹ใจใ„ใ†ใ“ใจใ‚’ใ€ใ‚ณใ‚ณใƒญใฏ่‡ชๅˆ†ใŒใ“ใ†ใชใฃใฆๅˆใ‚ใฆ็ŸฅใฃใŸใ€‚ๅฐๅญฆๆ กใฎ้ ƒใ€ใ‚ณใ‚ณใƒญใŸใกใฎใ‚ฏใƒฉใ‚นใงๅญฆๆ กใซๆฅใชใ„ๅญใฏไธ€ไบบใ‚‚ใ„ใชใ‹ใฃใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใฟใ‚“ใชใ€ๅคšๅฐ‘ใฎใ‚บใƒซไผ‘ใฟใฏ1ๆ—ฅใ‹2ๆ—ฅใ—ใฆใ„ใŸใ‹ใ‚‚ใ—ใ‚Œใชใ„ใ‘ใฉใ€ใจใซใ‹ใใ€ใ“ใ“ใซๆฅใ‚‹ใ‚ˆใ†ใชๅญใฏไธ€ไบบใ‚‚ใ„ใชใ‹ใฃใŸใ€‚ใ‚นใ‚ฏใƒผใƒซใง่ฟŽใˆใฆใใ‚ŒใŸๅ…ˆ็”ŸใŸใกใ‚‚ใ€ใฟใ‚“ใช่‡ชๅˆ†ใŸใกใฎๅฟƒใฎๆ•™ๅฎคใ‚’ใ‚นใ‚ฏใƒผใƒซใจๅ‘ผใ‚“ใงใ„ใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅญไพ›็•ช็ต„ใฎๆญŒใฎใŠๅง‰ใ•ใ‚“ใฎใ‚ˆใ†ใช้›ฐๅ›ฒๆฐ—ใฎไบบใ ใฃใŸใ€‚่ƒธใซใคใ„ใŸใฒใพใ‚ใ‚Šๅž‹ใฎๅๆœญใซใ€่ชฐใ‹ๅญไพ›ใŒๆ›ธใ„ใŸใ‚‰ใ—ใ„ๅฝผๅฅณใฎไผผ้ก”็ตตใจใ€ŒๅŒ—ๅณถใ€ใจใ„ใ†ๅๅ‰ใŒๆ›ธใ„ใฆใ‚ใ‚‹ใ€‚ใ€Œใฏใ„ใ€ใจ็ญ”ใˆใ‚‹ๅฃฐใŒใ€ๆˆ‘ใชใŒใ‚‰ๅฐใ•ใไธๆ˜Ž็žญใ ใฃใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ใใ‚Œใซใ€็›ฎใŒใจใฆใ‚‚ๅ„ชใ—ใ„ใ€‚ๅฅฝๆ„Ÿใ‚’ๆŒใฃใŸใ‘ใฉใ€ใ“ใฎไบบใŒไปŠใฏใ‚‚ใ†ๅ’ๆฅญใ—ใฆใ€ใ‚ใฎๅญฆๆ กใฎไธญๅญฆ็”Ÿใงใชใ„ใ“ใจใŒ้–“้•ใ„ใซ็พจใพใ—ใ‹ใฃใŸใ€‚ๅฟƒใฏใ€้›ชๅ“้†้†ไธญใซ้€šใฃใฆใ‚‹ใชใ‚“ใฆใจใฆใ‚‚่จ€ใˆใชใ„ใ€‚ใพใ ใ€ๅ…ฅๅญฆใ—ใŸใฐใ‹ใ‚Šใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Failed to align segment ("ๅ‹ๅƒใใฎไธก่ฆชใฎใ†ใกใ€ใŠ็ˆถใ•ใ‚“ใฎไผš็คพใฎๆ–นใŒ้€šๅ‹คใ™ใ‚‹ใฎใซ้ ใ„ใ‹ใ‚‰ใ€ใใฎๅˆ†ๆœใŒๆ—ฉใ„ใ€ๅฟƒใŒ่ตทใใ‚‹้ ƒใซใฏใ€ใ‚‚ใ†ใ„ใชใ„ใ“ใจใŒใปใจใ‚“ใฉใ ใ€‚ใใฎใพใพใงใ„ใ‚‹ใจๆ€’ใ‚‰ใ‚Œใ‚‹ใ‹ใ‚‚ใ—ใ‚Œใชใ„ใ‹ใ‚‰ใ€้ป™ใฃใŸใพใพ้šŽๆฎตใ‚’ไธŠใ‚‹ใ€‚่ƒŒๅพŒใ‹ใ‚‰่ฟฝใ„่จŽใกใฎใ‚ˆใ†ใซใŸใ‚ๆฏใŒ่žใ“ใˆใŸใ€‚"): no characters in this segment found in model dictionary, resorting to original...
Traceback (most recent call last):
  File "/home/ke/.pyenv/versions/subgen/bin/whisperx", line 8, in <module>
    sys.exit(cli())
  File "/home/ke/.pyenv/versions/3.9.9/envs/subgen/lib/python3.9/site-packages/whisperx/transcribe.py", line 723, in cli
    write_vtt(result_aligned["segments"], file=vtt)
  File "/home/ke/.pyenv/versions/3.9.9/envs/subgen/lib/python3.9/site-packages/whisperx/utils.py", line 59, in write_vtt
    f"{format_timestamp(segment['start'])} --> {format_timestamp(segment['end'])}\n"
  File "/home/ke/.pyenv/versions/3.9.9/envs/subgen/lib/python3.9/site-packages/whisperx/utils.py", line 34, in format_timestamp
    assert seconds >= 0, "non-negative timestamp expected"
AssertionError: non-negative timestamp expected

And it still outputs these vtt & txt files which seem broken (too short, misaligned, missing the whisper outputs, etc): ใ‹ใŒใฟใฎๅญคๅŸŽ.zip
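The repeated "no characters in this segment found in model dictionary" message suggests the align model's vocabulary shares no characters with the Japanese text. As far as I know, WAV2VEC2_ASR_LARGE_LV60K_960H is an English model, which would explain it. A rough sketch of that kind of check, where the vocabulary below is a hypothetical stand-in and not the model's real label set:

```python
def alignable_chars(text: str, dictionary: set) -> list:
    """Return the characters of `text` that exist in the align model's dictionary."""
    return [ch for ch in text.lower() if ch in dictionary]


# Hypothetical vocabulary of an English-only align model
english_dictionary = set("abcdefghijklmnopqrstuvwxyz'")

print(alignable_chars("hello world", english_dictionary))  # ten letters survive
print(alignable_chars("こんにちは", english_dictionary))      # nothing survives
```

When the second call returns an empty list, there is nothing for the aligner to match against the audio, so falling back to the original segment is the only option left.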

Pikauba commented 1 year ago

I am pretty sure this still happens because, instead of building on the official Whisper repo, this project holds outdated copies of code from the official Whisper project.

Seems like this problem has been solved as described here: #810 and here: #914

You can take a look at the fix here: Fix infinite loop caused by incorrect timestamp tokens prediction

This fix is not present in WhisperX's code -> whisperx/decoding.py

WhisperX's code should be refactored to follow the original code base, to avoid this type of problem.

kanjieater commented 1 year ago

I managed to move past this issue for this file using the suggestion here: https://github.com/m-bain/whisperX/issues/84. Use language codes instead of full words, like Whisper does: --language ja, not --language Japanese.

I will verify that the original file can complete after I see where #84 lands regarding the "Failed to align segment" messages and the broken vtt it produces.
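The workaround above boils down to passing a language code rather than a language name. A small sketch of normalizing user input before it reaches the CLI, assuming a hand-written lookup table. `LANGUAGE_CODES` and `to_language_code` are hypothetical names for illustration and cover only a few languages:

```python
# Hypothetical subset of a name-to-code table
LANGUAGE_CODES = {"japanese": "ja", "korean": "ko", "english": "en"}


def to_language_code(language: str) -> str:
    """Map 'Japanese' or 'ja' to 'ja'; raise on anything unknown."""
    key = language.strip().lower()
    if key in LANGUAGE_CODES.values():
        return key  # already a code
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    raise ValueError(f"unknown language: {language!r}")


print(to_language_code("Japanese"))
print(to_language_code("ko"))
```

Raising on unknown input rather than passing it through keeps a mistyped language from silently selecting the wrong align model.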

Infinitay commented 1 year ago

Getting the same assertion error. Windows 10, latest whisperx installed via pipx but I merged the changes from the existing PR #91 on top. Python 3.9.9, Torch for CUDA 11.7

Command: whisperx --vad_filter --parallel_bs 2 --model medium --language Korean --align_model wav2vec2-xls-r-300m-korean --output_dir "update_test/" korean-convo-lingo.mp3
(Note: I am using VAD here.)

Align Model: https://huggingface.co/w11wo/wav2vec2-xls-r-300m-korean

Audio Clip: https://www.youtube.com/watch?v=PcysuLjtTeo
yt-dlp https://www.youtube.com/watch?v=PcysuLjtTeo --extract-audio --audio-format mp3 -o "korean-convo-lingo.mp3"

[14:27.295 --> 14:29.795]  ์ด ํ”ผ์ž ๋ง›์žˆ์Šต๋‹ˆ๋‹ค.
[14:29.795 --> 14:31.795]  ์™€์šฐ, ์ •๋ง ๋†€๋ž์Šต๋‹ˆ๋‹ค.
[14:31.795 --> 14:34.955]  ์ €๋Š” ์ด ํ”ผ์ž ์ œ์ผ ๋ง›์žˆ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค.
Performing alignment...
Traceback (most recent call last):
  File "C:\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "c:\users\infinitay\.local\bin\whisperx.exe\__main__.py", line 7, in <module>
  File "C:\Users\infinitay\.local\pipx\venvs\whisperx\lib\site-packages\whisperx\transcribe.py", line 739, in cli
    write_vtt(result_aligned["segments"], file=vtt)
  File "C:\Users\infinitay\.local\pipx\venvs\whisperx\lib\site-packages\whisperx\utils.py", line 59, in write_vtt
    f"{format_timestamp(segment['start'])} --> {format_timestamp(segment['end'])}\n"
  File "C:\Users\infinitay\.local\pipx\venvs\whisperx\lib\site-packages\whisperx\utils.py", line 34, in format_timestamp
    assert seconds >= 0, "non-negative timestamp expected"
AssertionError: non-negative timestamp expected

EDIT: When I was testing other audio clips, I realized that the behavior differs depending on the align model I am using. With the same audio clip and command linked above, I changed the align model to wav2vec2-large-xls-r-1b-korean-sample5 instead and it ran without any errors.


The same behavior, with one model working but another failing, occurs again for another audio clip of a song.
Audio Clip: https://www.youtube.com/watch?v=lxPndeAzfwI
Model that failed: wav2vec2-large-xls-r-1b-korean-sample5 (model was linked above)
Model that passed: wav2vec2-xls-r-300m-korean (model was linked above)


Found another example of one model working and another resulting in "non-negative timestamp expected".
Audio Clip: https://www.youtube.com/watch?v=Z_NaYKUR3sM
Model that failed: wav2vec2-large-xls-r-1b-korean-sample5 (model was linked above)
Model that passed: wav2vec2-xls-r-300m-korean (model was linked above)
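Since the failure depends on which align model is used, one possible stopgap is to keep the unaligned Whisper timing whenever the aligned timing is invalid. A minimal sketch, assuming the aligned and original segment lists line up one-to-one (which may not hold in whisperX). `repair_segments` is a hypothetical helper, not an existing whisperX function:

```python
from typing import Dict, List


def repair_segments(aligned: List[Dict], original: List[Dict]) -> List[Dict]:
    """Replace aligned segments with invalid timing by the unaligned originals."""
    repaired = []
    for a, o in zip(aligned, original):
        # Valid means non-negative and not ending before it starts.
        valid = a["start"] >= 0 and a["end"] >= a["start"]
        repaired.append(a if valid else o)
    return repaired


# Made-up data for illustration only
original = [{"start": 0.0, "end": 2.5}, {"start": 2.5, "end": 4.5}]
aligned = [{"start": 0.1, "end": 2.4}, {"start": -3.0, "end": 4.4}]
repaired = repair_segments(aligned, original)
print(repaired)
```

This only hides the symptom so the subtitle file can be written. It does not explain why a given align model produces negative offsets for a given clip.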

m-bain commented 1 year ago

The Whisper fix and VAD filtering mean this can't happen any more, in theory :')