cr08 / TwitchVault

Simplified tool to automatically download/archive VODs, clips, highlights, including associated chat logs for specified Twitch channels
GNU General Public License v3.0

Adding optional standalone Vosk transcription tool #6

Open cr08 opened 1 year ago

cr08 commented 1 year ago

Reverting an old deletion and bringing back the 0_main_vtt_generation.py file. The intent here is to keep an optional standalone tool for transcribing audio to SRT from arbitrary video files, not necessarily ones downloaded by the main tools here.

At the time of this writing, the old file has been added back in unchanged, so it is unlikely to work as-is. More work to come on that front...

cr08 commented 1 year ago

Current plans for this:

The script is run manually. It will read every command line argument supplied, and multiple arguments are allowed. Each argument can be either a single file or a directory, and the two can be mixed and matched, e.g.:

python3 opt_transcode_srt.py ~/Videos/recording1.mp4 ~/Videos/recording_dir/ /home/user/test.mp4

We will iterate through these in the order received. Directories will be walked to locate videos by common file extensions (mkv, mp4, avi, mov, flv). We'll probably follow OBS' cue and watch for the file extensions it is able to export.

Once all files are found in the user provided locations, we'll iterate through each one with the transcription code as we'd expect.
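A rough sketch of the discovery step described above, not the actual script. The function name `collect_videos` and the `VIDEO_EXTENSIONS` set are placeholders made up for illustration; the extension list is just the five examples mentioned here and would likely change once the OBS export formats are checked.

```python
import os
import sys

# Assumed extension list, taken from the examples in this comment.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi", ".mov", ".flv"}


def collect_videos(paths):
    """Return video files from the given paths, keeping argument order."""
    found = []
    for path in paths:
        path = os.path.expanduser(path)
        if os.path.isdir(path):
            # Walk the directory tree and keep files with a known extension.
            for root, dirs, files in os.walk(path):
                dirs.sort()  # visit subdirectories in a stable order
                for name in sorted(files):
                    ext = os.path.splitext(name)[1].lower()
                    if ext in VIDEO_EXTENSIONS:
                        found.append(os.path.join(root, name))
        elif os.path.isfile(path):
            # A file named explicitly is accepted without an extension check.
            found.append(path)
        else:
            print(f"Skipping missing path: {path}", file=sys.stderr)
    return found


if __name__ == "__main__":
    for video in collect_videos(sys.argv[1:]):
        print(video)
```

Files passed explicitly are taken as-is on the assumption that the user knows what they are pointing at; only directory walks are filtered by extension.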

Much like the primary scripts in this repo, if we locate an SRT of the same filename as the video in the same directory, we'll skip transcribing that video.
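The skip check could look something like the sketch below. This assumes "same filename" means the video's name with its extension swapped for `.srt` (so `recording1.mp4` pairs with `recording1.srt`); the helper names `needs_transcription` and `pending` are hypothetical and not taken from the existing scripts.

```python
from pathlib import Path


def needs_transcription(video_path):
    """Return True when no same-named .srt sits next to the video."""
    # Assumption: the subtitle shares the video's stem, e.g. clip.mp4 -> clip.srt
    return not Path(video_path).with_suffix(".srt").exists()


def pending(videos):
    """Filter a list of video paths down to those still lacking an SRT."""
    return [v for v in videos if needs_transcription(v)]
```

Running the check per file just before transcription, rather than once up front, would also let an interrupted run be restarted without redoing finished videos.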