jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License

Real world test #288

Closed ls-milkyway closed 10 months ago

ls-milkyway commented 10 months ago

I have tested the video https://www.youtube.com/watch?v=1NfFIpZocWs with the following models and commands:

1) Large-v3

2) Medium model with the following command, since the dialogue and music are somewhat mixed together (a rough Python equivalent is sketched after this list): stable-ts test.mp4 --model medium --model_dir D:***els\ --output testmedium.srt --language Japanese --vad True --vad_threshold 0.35 --demucs True --refine

3) Medium model using just: stable-ts test.mp4 --model medium --model_dir D:***els\ --output testmedium.srt --language Japanese
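
For reference, command no. 2 maps roughly onto the Python API like this (a minimal sketch assuming a current stable-ts release; the file names are placeholders, and the redacted model directory is left out):

```python
# Rough Python equivalent of command no. 2 (a sketch, not the exact CLI internals).
import stable_whisper

model = stable_whisper.load_model('medium')

# vad/vad_threshold use Silero VAD to suppress timestamps in non-speech regions;
# demucs separates vocals from the background music before transcription.
result = model.transcribe(
    'test.mp4',
    language='ja',
    vad=True,
    vad_threshold=0.35,
    demucs=True,
)

# --refine corresponds to a second pass that refines the word timestamps.
result = model.refine('test.mp4', result)

result.to_srt_vtt('testmedium.srt')
```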

The results: A) Large-v3 wins (not only in quality, it also has the best sync). B) Second place, strangely, goes to command no. 3 (plain Medium gives the better output). C) Last is option 2, even though that command takes the most time.

The transcriptions were done in the native language, i.e. Japanese, and then translated to English for comparison using Google API Version 1.
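
The translation step was comparable to the following (an illustrative sketch only: the post says "Google API Version 1", so the exact client is an assumption; this uses the google-cloud-translate v2 client and simplifies the SRT handling):

```python
# Hypothetical sketch of the translate-for-comparison step; the actual API
# version used in the test ("Google API Version 1") is not specified, so the
# google-cloud-translate v2 client here is an assumption.
from google.cloud import translate_v2 as translate

client = translate.Client()  # requires GOOGLE_APPLICATION_CREDENTIALS to be set

def translate_subtitle_lines(lines):
    """Translate a list of Japanese subtitle lines to English."""
    results = client.translate(lines, source_language='ja', target_language='en')
    return [r['translatedText'] for r in results]

print(translate_subtitle_lines(['こんにちは、世界']))
```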