Open TakeoIschiFan opened 5 months ago
Thanks for letting me know about `dump_cache`.
I'm currently looking into implementing this myself, but it's not as straightforward as I was hoping.
I'll let you know if I make good progress.
Here's a first attempt that gets the general idea implemented: whispersubs.lua.txt
The new logic is mainly in `runCache()`, with some setup under `start()`, enabled by the global `CACHE_MODE`.
This is just temporary, as I plan on eventually replacing all of the streaming code with this; the original code was just a hack to get around the streaming-to-a-file limitation.
At the moment, the default behaviour of this script is to subtitle a stream from the beginning using a separate yt-dl download, which means that, as far as I know, it can't be used for live translation or on non-rewindable / in-progress livestreams.
Now, mpv has a built-in caching feature, which can be enabled with the `cache=yes` flag. As long as the read-ahead cache is always at least as big as the whisper chunk size, it should theoretically be possible to save a part of the cache, process an audio chunk and display it to the user at the same time the user sees the video, essentially doing semi-live translation.

Below is a proof-of-concept code snippet that uses the `dump_cache` command to flush the cache to disk, after which we can process it just like we would any other audio snippet.

This piece of code works on local files and livestreams with some caveats:
- `dump_cache` accepts `start` and `end` parameters, but sometimes there are a few seconds of padding on each side of the dumped cache for some reason, which messes up sub timing and causes overlap. I have no idea why this happens; it might be a bug in mpv.
- `demuxer-readahead-secs` and `demuxer-cache-wait` can be used to force the cache to be at least a certain length, but streams using the yt-dl hook ignore these options, so you have to manually seek back to get the cache to extend the correct amount of time into the future.

If you think this functionality is a nice-to-have, I can submit a pull request with a more fleshed-out rendition of this idea; otherwise I'm keeping it for myself : )
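For anyone reading without the attachment, the cache-dumping idea described above can be sketched roughly as follows. This is not the code from whispersubs.lua, just a minimal illustration under some assumptions: it uses mpv's `dump-cache` command (the dash spelling from the mpv manual) together with the `time-pos` and `demuxer-cache-time` properties, while `CHUNK_SECS`, `DUMP_PATH`, `chunk_window` and `dump_next_chunk` are hypothetical names, and the whisper step itself is left out.

```lua
-- Sketch only, not the whispersubs implementation.
-- CHUNK_SECS, DUMP_PATH and both function names are hypothetical.
local CHUNK_SECS = 15                       -- assumed whisper chunk length, in seconds
local DUMP_PATH = "/tmp/whisper_chunk.mkv"  -- assumed scratch file

-- Return the start/end of the next chunk, or nil if the cache does not
-- yet reach far enough ahead of `start` to cover a whole chunk.
local function chunk_window(start, cache_end, chunk_secs)
    if start == nil or cache_end == nil then return nil end
    local stop = start + chunk_secs
    if cache_end < stop then return nil end
    return start, stop
end

-- Dump the chunk beginning at `start` to DUMP_PATH; returns true on success.
local function dump_next_chunk(start)
    -- demuxer-cache-time: timestamp up to which the demuxer has buffered data
    local cache_end = mp.get_property_number("demuxer-cache-time")
    local from, to = chunk_window(start, cache_end, CHUNK_SECS)
    if from == nil then return false end
    -- dump-cache <start> <end> <filename>
    local ok = mp.commandv("dump-cache", tostring(from), tostring(to), DUMP_PATH)
    return ok == true
end

-- Only register the timer when running inside mpv, where `mp` is a global.
if mp then
    local next_start = nil
    mp.add_periodic_timer(1, function()
        next_start = next_start or mp.get_property_number("time-pos")
        if next_start and dump_next_chunk(next_start) then
            -- hand DUMP_PATH to whisper here, then move on to the next chunk
            next_start = next_start + CHUNK_SECS
        end
    end)
end
```

The sketch only dumps a chunk once the cache reaches a full chunk length past the chunk start, which is the "cache at least as big as the whisper chunk size" condition from the post. It does nothing about the padding caveat, so subtitle timestamps would still need correcting against whatever range actually ends up in the file. Writing the dump may also block playback for a moment, so a fuller version might issue the command through `mp.command_native_async` instead.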
Thanks