
buttercat1791 / Git-Crunchin

For Crunch Podcast fans, created to help Ethan with his problems but who knows what else the future holds

GNU General Public License v3.0


Generate Transcripts #2

Open buttercat1791 opened 1 year ago

buttercat1791 commented 1 year ago

The audio files need to be fed to Whisper, and the resulting transcripts written to new text files.

To generate the transcripts, the program must be able to do the following:

Create a new text file in a location specified by a runtime parameter.
Read audio data into Whisper.
Output text data from Whisper into a text file.
One audio file read in should result in one text file of output.
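A minimal sketch of the file-handling side of these requirements, leaving Whisper out entirely. The function names, the `--output-dir` flag, and the choice of plain UTF-8 `.txt` output are all assumptions for illustration, not decisions made in this issue.

```python
import argparse
from pathlib import Path


def transcript_path_for(audio_path, output_dir):
    """Map one audio file to one text file inside output_dir."""
    return Path(output_dir) / (Path(audio_path).stem + ".txt")


def write_transcript(text, audio_path, output_dir):
    """Create the output directory if needed, then write the transcript."""
    out_path = transcript_path_for(audio_path, output_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(text, encoding="utf-8")
    return out_path


def parse_args(argv=None):
    """Read the audio files and the output location from the command line."""
    parser = argparse.ArgumentParser(description="Transcribe audio files.")
    parser.add_argument("audio", nargs="+", help="audio files to transcribe")
    # Hypothetical flag name; the issue only asks for "a runtime parameter".
    parser.add_argument("--output-dir", default="transcripts",
                        help="directory for the generated text files")
    return parser.parse_args(argv)
```

With this mapping, `episodes/ep01.mp3` and an output directory of `transcripts` would produce `transcripts/ep01.txt`.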

Questions:

In what format does Whisper accept audio?
What file format should the text files use?

Wedge29 commented 1 year ago

I didn't see any documentation about which specific file formats it takes, but judging from the examples in their README, it definitely takes MP3 files.

rhit-mattindc commented 1 year ago

Do you want the file reader to hand you a filepath or the file contents? I'm not sure whether passing the contents is slow; Python may or may not be smart about it.

buttercat1791 commented 1 year ago

It looks like Whisper just wants a filepath.

I'll create a class that handles transcription. At a basic level, we can instantiate the class in main and call transcribe() with a filepath, and the class will return the transcription.
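A rough sketch of what such a class could look like. It uses `whisper.load_model()` and `model.transcribe()`, which return a dict with a `"text"` key, as shown in the Whisper README. The class layout, the `model` injection parameter, and the default model name `"base"` are assumptions made here for illustration.

```python
from pathlib import Path


class Transcriber:
    """Transcribes one audio file per call and returns the text."""

    def __init__(self, model=None, model_name="base"):
        # `model` is a hypothetical hook: any object with a transcribe(path)
        # method returning a dict with a "text" key. It makes testing
        # possible without loading Whisper.
        self._model = model
        self._model_name = model_name

    def _get_model(self):
        # Load Whisper only on first use, since loading a model is slow.
        if self._model is None:
            import whisper  # the openai-whisper package
            self._model = whisper.load_model(self._model_name)
        return self._model

    def transcribe(self, filepath):
        """Return the transcription of the audio file at filepath."""
        path = Path(filepath)
        if not path.is_file():
            raise FileNotFoundError(f"No such audio file: {path}")
        result = self._get_model().transcribe(str(path))
        return result["text"].strip()
```

In main this would be used as `Transcriber().transcribe("episode.mp3")`, where the file name is a placeholder.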

To get fancier, we can see if we can add a method to the class that returns the progress status of the transcription.
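One possible shape for a status check, sketched under the assumption that the transcription runs in a background thread. It only reports a coarse state (idle, running, done, failed), not a percentage; whether Whisper can report finer-grained progress is left open here. `Status` and `TranscriptionJob` are hypothetical names, and `transcribe` is any callable that takes a filepath and returns text.

```python
import threading
from enum import Enum


class Status(Enum):
    IDLE = "idle"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


class TranscriptionJob:
    """Runs one transcription in a background thread and exposes its status."""

    def __init__(self, transcribe, filepath):
        self._transcribe = transcribe
        self._filepath = filepath
        self._lock = threading.Lock()
        self._status = Status.IDLE
        self._thread = None
        self.text = None    # set when the job finishes successfully
        self.error = None   # set when the job raises

    def status(self):
        with self._lock:
            return self._status

    def start(self):
        with self._lock:
            if self._status is not Status.IDLE:
                raise RuntimeError("job already started")
            # Mark as running before the thread starts so callers never
            # observe IDLE after start() returns.
            self._status = Status.RUNNING
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        try:
            text = self._transcribe(self._filepath)
        except Exception as exc:
            with self._lock:
                self.error = exc
                self._status = Status.FAILED
        else:
            with self._lock:
                self.text = text
                self._status = Status.DONE

    def wait(self, timeout=None):
        """Block until the job finishes (or timeout), then return the status."""
        if self._thread is not None:
            self._thread.join(timeout)
        return self.status()
```

The caller can poll `job.status()` from the main thread while the transcription runs, then read `job.text` once the status is `Status.DONE`.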

I think the transcriber class should just handle one file at a time, and the caller can call the instance multiple times to process batches of files.
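The caller-side loop could be as simple as the sketch below. `transcribe_batch` is a hypothetical helper name, and `transcribe` stands for any callable that takes a filepath and returns text, such as the transcribe() method of the class described above. Writing UTF-8 `.txt` files named after the audio file is an assumption.

```python
from pathlib import Path


def transcribe_batch(transcribe, audio_paths, output_dir):
    """Call transcribe(path) once per audio file and write one .txt per input."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for audio_path in audio_paths:
        # One file at a time: the transcriber never sees the whole batch.
        text = transcribe(str(audio_path))
        out_path = out_dir / (Path(audio_path).stem + ".txt")
        out_path.write_text(text, encoding="utf-8")
        written.append(out_path)
    return written
```

Note that two inputs with the same base name (for example `a/ep01.mp3` and `b/ep01.wav`) would overwrite each other's output in this sketch, so a real version may need a different naming scheme.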