jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License

Wrong first word timestamp #190

Closed ngbien83 closed 1 year ago

ngbien83 commented 1 year ago

If there is background music at the beginning of the file, it returns the wrong timestamp for the first word:

0
00:00:03,240 --> 00:00:21,020
<font color="#00ff00">SNS의</font> 위험성을 지적하는 2010년 미국 케이스 웨스턴 리저부대 의과 대학의 연구도 있다.

The first word actually begins at 00:00:20.

Audio URL: https://drive.google.com/file/d/14w06qqI6Zy45RB8qYkT2Q9SzEwQ6xvaa/view?usp=sharing

jianfch commented 1 year ago

Hello, clamp_max() was recently added to the default regrouping method, so updating to the newest version should prevent this. You can also call clamp_max() directly. Using demucs=True should also work.
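For reference, a minimal sketch of how the two suggestions above might be combined. The helper name transcribe_with_clamp, its parameters, and the "base" model size are hypothetical and chosen only for illustration. It assumes a stable-ts version where the result object exposes clamp_max() and transcribe() accepts demucs=True, as described in this thread; newer releases may name the denoising option differently, and demucs needs to be installed separately.

```python
def transcribe_with_clamp(audio_path, srt_path, model_name="base", use_demucs=False):
    """Transcribe audio_path and write word-level subtitles to srt_path.

    Hypothetical helper for illustration; not part of stable-ts itself.
    """
    # Imported lazily so this file can be loaded without stable-ts installed.
    import stable_whisper

    model = stable_whisper.load_model(model_name)

    # demucs=True is the option named in this thread. It is passed only on
    # request, since it requires the separate demucs package.
    extra_args = {"demucs": True} if use_demucs else {}
    result = model.transcribe(audio_path, **extra_args)

    # Explicit call to clamp_max(). On versions where the default regrouping
    # already includes it, this extra call should be redundant.
    result.clamp_max()

    # Write subtitles with word-level timestamps (highlighted word per cue).
    result.to_srt_vtt(srt_path, word_level=True)
    return result


# Example call (assumes "audio.mp3" exists locally):
# transcribe_with_clamp("audio.mp3", "audio.srt", use_demucs=True)
```

If the first cue still starts too early after this, upgrading stable-ts first is probably the simpler check, since the default regrouping is reported above to include clamp_max() already.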

ngbien83 commented 1 year ago

Thank you, clamp_max() works like a charm. The problem has been solved.