SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License
12.78k stars, 1.07k forks

Crash at the end of long video #687

Open Tuanjiang opened 9 months ago

Tuanjiang commented 9 months ago

I'm trying to transcribe some videos with large-v3 on a Windows laptop with an RTX 2070. If a video is longer than 15 minutes, the process crashes around the last thirty seconds of the video with exit code -1073740791 (0xC0000409). The problem only happens with Japanese videos. I tried English and Chinese videos and they don't crash.
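The two numbers in the exit code are the same 32-bit value: -1073740791 is the signed reading and 0xC0000409 is the unsigned one. A minimal Python sketch of the conversion, where the helper name `to_ntstatus_hex` is made up for illustration:

```python
def to_ntstatus_hex(exit_code: int) -> str:
    """Render a process exit code as an unsigned 32-bit hex string."""
    # Masking with 0xFFFFFFFF maps a negative signed value to its
    # two's-complement unsigned equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus_hex(-1073740791))  # 0xC0000409
```

In the Windows headers this value is named STATUS_STACK_BUFFER_OVERRUN. It is also used for fail-fast aborts in native code, so the code alone probably does not pin down the cause.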

trungkienbkhn commented 9 months ago

Hello. Can you please share the full error log and the code you used?

Tuanjiang commented 9 months ago

There is no error log before the crash. There are only some debug messages like these:

DEBUG:faster_whisper:Compression ratio threshold is not met with temperature 0.4 (8.297297 > 2.400000)
DEBUG:faster_whisper:Processing segment at 22:41.040
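For context, the first debug line says the decoded text failed the compression-ratio check at temperature 0.4. As a rough sketch of what that number measures, assuming the check follows the usual Whisper approach of comparing the UTF-8 length of the text against its zlib-compressed length (the function below is an illustration, not a copy of the library code):

```python
import zlib


def compression_ratio(text: str) -> float:
    """Ratio of raw UTF-8 size to zlib-compressed size; repetitive text scores high."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


# Illustrative inputs: one heavily repeated phrase, one ordinary sentence.
repetitive = "頑張ろうか" * 50
varied = "私たちの成長をフェスライブを通しても見守っていただきたいなと"

print(compression_ratio(repetitive) > 2.4)
print(compression_ratio(varied) < 2.4)
```

A ratio as high as 8.3 suggests the model produced heavily repeated output for that window, which would explain why decoding was retried at another temperature.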

The code I used is as follows:

from faster_whisper import WhisperModel

# file_path is the path to the input video (defined elsewhere).
model_name = "large-v3"
model = WhisperModel(model_name, device="cuda", compute_type="int8")
segments, info = model.transcribe(file_path, vad_filter=True, condition_on_previous_text=False)

By the way, if the precision is changed from float16 to int8, it no longer crashes on videos that are just over 15 minutes long, but it still crashes on much longer videos, such as those lasting 20 minutes.
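Note that `model.transcribe` returns a lazy generator, so nothing is decoded until `segments` is iterated. Below is a minimal sketch of a loop that prints each segment as soon as it is produced, so the last line before a crash shows how far decoding got. The helper names `format_segment` and `transcribe_with_progress` are hypothetical, and the model settings simply mirror the ones reported above:

```python
import logging


def format_segment(start: float, end: float, text: str) -> str:
    """Format one segment as '[start -> end] text' with two decimals."""
    return f"[{start:.2f}s -> {end:.2f}s] {text}"


def transcribe_with_progress(file_path: str) -> None:
    # Imported lazily so the formatting helper works without the library.
    from faster_whisper import WhisperModel

    # Emit the library's debug messages (e.g. "Processing segment at ...").
    logging.basicConfig()
    logging.getLogger("faster_whisper").setLevel(logging.DEBUG)

    model = WhisperModel("large-v3", device="cuda", compute_type="int8")
    segments, info = model.transcribe(
        file_path, vad_filter=True, condition_on_previous_text=False
    )
    # `segments` is a generator: decoding happens while iterating here.
    for segment in segments:
        # flush=True so the line is written out before any native crash.
        print(format_segment(segment.start, segment.end, segment.text), flush=True)
```

Calling `transcribe_with_progress("video.mp4")` (the file name is a placeholder) should produce output in the same `[start -> end] text` shape as the logs in this thread.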

trungkienbkhn commented 9 months ago

Can you try again with faster-whisper 1.0.1? BTW, can you attach your Japanese video for debugging?

Tuanjiang commented 9 months ago

After upgrading to version 1.0.1, the problem still happens. One of the audio files that crashes is attached: 45.zip. The other audio files were downloaded from these video playlists:
https://www.youtube.com/playlist?list=PLu7E7HFun3xC0k9Jm6Z0UUf_JNRNGEfk2
https://www.youtube.com/playlist?list=PLu7E7HFun3xDUAAKbMFe6PWW_wvmm8bLy
Some videos in these playlists don't crash.

trungkienbkhn commented 9 months ago

I tested your audio 45.zip on a Linux system but didn't get any error. Below is the log for the last thirty seconds of this audio:

...
Processing segment at 23:30.780
[1410.78s -> 1414.90s] 私たちからしたら皆さんがこれ同じことやってくださってるって
[1414.90s -> 1418.50s] ゆうのとかが見れて嬉しいよな本当に
[1418.50s -> 1423.62s] だからね2024年のフェスライブも頑張っていかないとね
[1424.50s -> 1425.44s] 頑張ろうか
[1425.44s -> 1426.28s] 頑張ろうか
[1426.28s -> 1433.28s] もっとねもっともっと楽しくってね素敵な時間をお届けできるようにね
[1433.28s -> 1435.94s] これからも頑張っていきます
[1435.94s -> 1440.76s] 私たちの成長をフェスライブを通しても見守っていただきたいなと
Processing segment at 24:00.760
[1440.76s -> 1442.46s] 頑張っていただけたら嬉しいなと思います
[1443.52s -> 1447.42s] フェスはねこすえちゃんでかほでしょ
[1448.30s -> 1452.18s] だからなんていうかその2人の魅力を伝えられる場所だから
[1452.76s -> 1453.08s] そうだね
[1453.08s -> 1454.42s] もっと伝えていきたい
[1455.10s -> 1455.84s] 頑張っていこう
[1455.84s -> 1456.90s] 頑張っていきたい
[1456.90s -> 1462.22s] はいということでねフェスライブこれからも頑張っていきたいなと思いますので
[1462.22s -> 1464.34s] 皆さんぜひぜひ見てくださいね
[1464.34s -> 1464.74s] 見てください
[1466.26s -> 1468.78s] ということで今回の企画は以上となります
Processing segment at 24:28.780
[1468.78s -> 1475.48s] このセイハス自体もね楽しい番組にしていけるようにまだまだ頑張っていきますので応援よろしくお願いします
[1475.48s -> 1476.16s] よろしくお願いします
[1477.36s -> 1480.38s] そしてチャンネル登録と高評価もよろしくお願いします
[1480.90s -> 1482.48s] 以上スリーズブーケでした
[1483.16s -> 1483.80s] バイバイ
[1486.46s -> 1491.26s] その皆さんの姿を私たち見ながらしてるんだよねパフォーマンス
[1491.26s -> 1493.80s] そうモニターにね映してもらっててね
[1493.80s -> 1494.52s] これさ
[1496.22s -> 1498.76s] すごい運命というか奇跡が起こったよね
Processing segment at 24:58.760
[1498.76s -> 1499.96s] このライブ
[1499.96s -> 1500.56s] そうだね
Total executing time:  330.2313096523285

Tuanjiang commented 9 months ago

Could it be caused by a difference in memory size? My computer only has 16 GB of memory, and maybe you tested with more. From what I found, exit code -1073740791 (0xC0000409) is usually related to memory or the stack. So maybe the error doesn't occur for you because you have more memory?
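One way to test the memory hypothesis is to watch GPU memory while the transcription runs. The sketch below assumes the `nvidia-smi` tool from the NVIDIA driver is on PATH; the function names are hypothetical:

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' MiB pairs, one GPU per line."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def query_gpu_memory() -> list[tuple[int, int]]:
    # Assumption: nvidia-smi is installed and reachable on PATH.
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_gpu_memory(out)
```

Calling `query_gpu_memory()` before the run and again near the crash point would show whether GPU memory is close to its limit. System RAM would need a separate check.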

trungkienbkhn commented 9 months ago

That could be it. My GPU is an RTX 3090 with 24 GB of memory. Can you try again with a smaller Whisper model, such as medium?

Tuanjiang commented 9 months ago

I tried the medium model, and it finishes fine on somewhat longer videos. But it still exits with code -1073740791 (0xC0000409) on much longer videos.