McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License

Thanks for watching! OpenAI issue #21

Closed: ellisonpatterson closed this issue 10 months ago

ellisonpatterson commented 10 months ago

At the beginning of some media, the generated subtitles repeat "Thanks for watching!" over and over. Apparently this is a common Whisper issue (the hallucination problem), and I'm wondering how it could be remedied.

More information here: https://github.com/openai/whisper/discussions/679

Example of what I'm seeing:

0
00:00:00,000 --> 00:01:13,380
Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching!

1
00:01:14,820 --> 00:01:15,780
Thanks for watching!

2
00:01:16,220 --> 00:01:23,820
Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching!

3
00:01:25,840 --> 00:01:28,120
Thanks for watching!

4
00:01:28,560 --> 00:01:29,560
Thanks for watching!

5
00:01:41,260 --> 00:01:44,260
Thanks for watching!

6
00:01:57,100 --> 00:02:04,220
Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching!

7
00:02:05,060 --> 00:02:06,560
Thanks for watching!

8
00:02:28,740 --> 00:02:29,420
Thanks for watching! Thanks for watching! Thanks for watching!

9
00:02:35,960 --> 00:02:50,260
Thanks for watching! Thanks for watching!

10
00:02:54,500 --> 00:02:59,400
Thanks for watching!

11
00:03:04,220 --> 00:03:12,780
Thanks for watching! Thanks for watching!

12
00:03:25,980 --> 00:03:29,380
Thanks for watching!

13
00:03:38,600 --> 00:03:39,900
Thanks for watching!

14
00:03:40,340 --> 00:03:40,620
Thanks for watching!

15
00:03:42,180 --> 00:03:43,080
Thanks for watching!

16
00:03:46,180 --> 00:03:46,960
Thanks for watching!

17
00:03:48,280 --> 00:03:53,220
Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching! Thanks for watching!

18
00:03:53,820 --> 00:03:54,440
Thanks for watching!

19
00:03:57,420 --> 00:03:59,240
Thanks for watching!

McCloudS commented 10 months ago

There's nothing I can do here: the transcription is handled by Whisper and stable-ts, so this isn't a subgen implementation problem. I am considering moving to WhisperX, which is supposed to be better at detecting hallucinations.
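
As a stopgap outside of subgen, one option would be to post-process the generated SRT and drop cues that consist only of the repeated phrase. The following is a minimal sketch of that idea, not something subgen, Whisper, or stable-ts provides: the function names and the phrase list are hypothetical, and it assumes well-formed cues (index line, timing line, then text) separated by blank lines.

```python
import re

# Hypothetical list of phrases to treat as hallucinations; extend as needed.
HALLUCINATED_PHRASES = ("Thanks for watching!",)


def is_hallucinated(text, phrases=HALLUCINATED_PHRASES):
    """Return True if the cue text is nothing but repeats of known phrases."""
    remainder = text
    for phrase in phrases:
        remainder = remainder.replace(phrase, "")
    # Non-empty cue whose text vanishes once the phrases are removed.
    return text.strip() != "" and remainder.strip() == ""


def strip_hallucinated_cues(srt_text, phrases=HALLUCINATED_PHRASES):
    """Drop cues made up only of hallucinated phrases and renumber the rest."""
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    kept = []
    for block in blocks:
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip blocks that are not index / timing / text
        timing, text_lines = lines[1], lines[2:]
        if is_hallucinated(" ".join(text_lines), phrases):
            continue
        kept.append((timing, text_lines))

    # Renumber from 1 (an assumption; the sample above starts at 0).
    out = [
        "\n".join([str(index), timing, *text_lines])
        for index, (timing, text_lines) in enumerate(kept, start=1)
    ]
    return "\n\n".join(out) + ("\n" if out else "")
```

Usage would look like `cleaned = strip_hallucinated_cues(open("movie.srt", encoding="utf-8").read())`, where `movie.srt` is a placeholder path. Note that this also removes a genuine "Thanks for watching!" line if the media really contains one, so it only treats the symptom and does not address the underlying hallucination.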