It’s working, so far. Basically it looks at each podcast segment’s audio file and divides the audio along silence gaps, using millisecond offsets. (I should still verify that the silence gaps are being converted to milliseconds correctly.) It divides the file, transcribes each chunk, and then sets the transcript on the segment. Then, in theory, our system could show all the transcripts from all the segments in the order they’re positioned and allow users to update them. We’d need to invalidate (null out and resubmit) the segment transcript whenever the segment’s managed audio file is updated (written).
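Roughly what I mean by invalidation, as a minimal sketch. Every type name below (`ManagedFileUpdatedEvent`, `SegmentTranscripts`, etc.) is a made-up stand-in, not the real model in this repo:

```java
import java.util.Optional;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

// hypothetical event published whenever a managed file is (re)written
record ManagedFileUpdatedEvent(Long managedFileId) {}

// hypothetical facade over the segment store
interface SegmentTranscripts {
	Optional<Long> segmentForManagedFile(Long managedFileId);
	void clearTranscript(Long segmentId); // null out the stale transcript
	void resubmit(Long segmentId);        // queue it for re-transcription
}

@Component
class TranscriptInvalidationListener {

	private final SegmentTranscripts segments;

	TranscriptInvalidationListener(SegmentTranscripts segments) {
		this.segments = segments;
	}

	@EventListener
	void onManagedFileWritten(ManagedFileUpdatedEvent event) {
		// if the updated managed file backs a segment's audio,
		// its transcript is stale: null it out and resubmit
		this.segments.segmentForManagedFile(event.managedFileId())
				.ifPresent(segmentId -> {
					this.segments.clearTranscript(segmentId);
					this.segments.resubmit(segmentId);
				});
	}
}
```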
Could we have a text editor pop open when a user clicks on a segment’s "transcript" icon?
Behind all of this is generic transcription infrastructure that I could use for all other content types.
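One shape that generic layer could take, sketched with illustrative names (`Transcribable`, `TranscriptionService`, and `TranscriptionPipeline` are not actual types in this codebase):

```java
import org.springframework.core.io.Resource;

// anything with audio and a place to put the resulting text
interface Transcribable {
	Resource audio();                    // the audio to transcribe
	void transcript(String transcript);  // where the result lands
}

// the silence-split + transcribe + join work, behind one method
interface TranscriptionService {
	String transcribe(Resource audio);
}

// any content type (segment, produced podcast, etc.) that exposes
// audio can then flow through the same pipeline
class TranscriptionPipeline {

	private final TranscriptionService transcriptionService;

	TranscriptionPipeline(TranscriptionService transcriptionService) {
		this.transcriptionService = transcriptionService;
	}

	void run(Transcribable content) {
		content.transcript(this.transcriptionService.transcribe(content.audio()));
	}
}
```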
Does Podbean support sending podcast transcripts?
Does Whisper support speaker detection (diarization)?
How would I rework this transcription infrastructure to be remote? That is, let’s say I wanted to handle all this on another node. I can’t serialize Spring Resource types.
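One possible answer: don’t ship the `Resource` itself, ship something serializable that the remote node can turn back into one. A sketch, assuming a made-up `TranscriptionRequest` payload:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;

// hypothetical wire payload standing in for the Resource; the remote
// node rebuilds a ByteArrayResource before invoking transcription
record TranscriptionRequest(String segmentId, byte[] audioBytes) implements Serializable {

	static TranscriptionRequest from(String segmentId, Resource audio) throws IOException {
		try (InputStream in = audio.getInputStream()) {
			return new TranscriptionRequest(segmentId, in.readAllBytes());
		}
	}

	Resource toResource() {
		return new ByteArrayResource(this.audioBytes);
	}
}
```

For large audio files it’d probably be better to write the bytes to shared storage (S3 or similar) and pass the remote node a URI it can open, rather than pushing megabytes through the serializer.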
This is basically done. It’ll take larger audio segments, divide them along their silence gaps, and then send all the files to the OpenAI transcription service. Maybe one day I’ll rework it to use a Whisper service I deploy and manage myself, but not today.
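For reference, the per-chunk call into OpenAI via Spring AI looks roughly like this (package names here match recent Spring AI releases and may differ by version; the silence-gap splitting happens upstream):

```java
import java.util.List;
import java.util.stream.Collectors;
import org.springframework.ai.audio.transcription.AudioTranscriptionPrompt;
import org.springframework.ai.openai.OpenAiAudioTranscriptionModel;
import org.springframework.core.io.Resource;

// sketch only: transcribe each silence-gap chunk in order and join the text
class ChunkTranscriber {

	private final OpenAiAudioTranscriptionModel transcriptionModel;

	ChunkTranscriber(OpenAiAudioTranscriptionModel transcriptionModel) {
		this.transcriptionModel = transcriptionModel;
	}

	String transcribe(List<Resource> chunksInOrder) {
		return chunksInOrder.stream()
				.map(chunk -> this.transcriptionModel
						.call(new AudioTranscriptionPrompt(chunk))
						.getResult()
						.getOutput())
				.collect(Collectors.joining(" "));
	}
}
```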
I should be able to get the audio from a produced podcast and then send that to the transcription service as well.