kingslay / KSPlayer

A video player for iOS, macOS, tvOS, and visionOS, based on AVPlayer and FFmpeg. It supports horizontal and vertical screen orientations, slide gestures for adjusting volume, brightness, and seek position, and subtitles.
GNU General Public License v3.0

Whisper CPP #744

Open Mr-7mdan opened 3 months ago

Mr-7mdan commented 3 months ago

Is your feature request related to a problem? Please describe.
As you know, finding a suitable subtitle file is always a hassle, especially finding one for the right release that is properly synced to the current video file.

Describe the solution you'd like
I noticed that whisper.cpp has been released and runs on tvOS, iOS, and macOS. It looks like a perfect fit for generating live captions and maybe live translation. https://github.com/ggerganov/whisper.cpp/tree/master

Describe alternatives you've considered
Please check the AirCaptions app, which has already implemented this. The results are amazing. https://github.com/ggerganov/whisper.cpp/discussions/1065

It would be amazing to add this to your player.

kingslay commented 3 months ago

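I have used Apple's SFSpeechRecognizer for real-time text translation, but the results are poor. I hope that in iOS 18 Apple will use its AI capabilities to improve them. I'll try the whisper.cpp you mentioned and see whether it is easy to integrate.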

Mr-7mdan commented 3 months ago

Whisper is fairly accurate, especially with English content. It is a great addition when you cannot find a subtitle. Moreover, instead of extracting the audio track from the video, converting it to WAV, and splitting it into smaller segments, I think a better implementation would be to listen to the audio output and capture the audio live, as if you were captioning audio coming from the microphone.

I really hope you can integrate this. Thanks in advance.
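
To make the live-capture idea above concrete, here is a minimal sketch of the buffering step only, written in Python for brevity rather than in the player's own language. It assumes the captured audio has already been resampled to 16 kHz mono float samples, which is the input format whisper.cpp expects. The class name `RollingWindow` and the 5-second window with 1-second overlap are illustrative assumptions, not anything taken from KSPlayer or whisper.cpp.

```python
from typing import List

SAMPLE_RATE = 16_000  # whisper.cpp expects 16 kHz mono float PCM


class RollingWindow:
    """Collects small PCM buffers and returns fixed-length, overlapping windows.

    Hypothetical helper: the name and default sizes are assumptions.
    """

    def __init__(self, window_s: float = 5.0, overlap_s: float = 1.0) -> None:
        self.window = int(window_s * SAMPLE_RATE)
        self.step = self.window - int(overlap_s * SAMPLE_RATE)
        if self.step <= 0:
            raise ValueError("overlap must be shorter than the window")
        self.samples: List[float] = []

    def push(self, buffer: List[float]) -> List[List[float]]:
        """Append one captured buffer; return every window that is now complete."""
        self.samples.extend(buffer)
        ready: List[List[float]] = []
        while len(self.samples) >= self.window:
            ready.append(self.samples[: self.window])
            # Drop only `step` samples so the tail of this window is
            # repeated at the start of the next one.
            del self.samples[: self.step]
        return ready
```

Each returned window could then be handed to the speech model. The overlap is there so that a word cut at a window boundary appears whole in the next window; how to merge the duplicated text afterwards is left open here.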

kingslay commented 3 months ago

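I did a preliminary test. Real-time incremental speech-to-subtitle conversion doesn't work: no text is output. I then switched to accumulating the audio myself, and that does produce text. But only the first segment is output, and the video ends up with no sound. So the implementation of this feature is currently stuck.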
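
The thread does not say why only the first segment is transcribed or why playback goes silent, so the following is not a diagnosis. It is a minimal Python sketch of one arrangement that avoids both symptoms by construction: the playback path hands a copy of each buffer to a background worker and returns the original untouched, and the worker keeps looping until it is told to stop. `CaptionTap` and the `transcribe` callback are hypothetical names standing in for the player's audio hook and the speech model.

```python
import queue
import threading
from typing import Callable, List


class CaptionTap:
    """Copies audio buffers to a background worker without touching playback."""

    def __init__(self, transcribe: Callable[[List[float]], str]) -> None:
        self.transcribe = transcribe  # hypothetical speech-to-text call
        self.pending: "queue.Queue" = queue.Queue()
        self.captions: List[str] = []
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def on_audio(self, buffer: List[float]) -> List[float]:
        """Called on the playback path: enqueue a copy, return the original."""
        self.pending.put(list(buffer))
        return buffer

    def _run(self) -> None:
        # Keep looping until the None sentinel arrives, so every queued
        # buffer is transcribed, not only the first one.
        while True:
            chunk = self.pending.get()
            if chunk is None:
                break
            self.captions.append(self.transcribe(chunk))

    def close(self) -> None:
        """Flush the queue and stop the worker."""
        self.pending.put(None)
        self.worker.join()
```

Because transcription runs on its own thread, a slow model delays the captions but not the audio output.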