bootiful-media-mogul / mogul-service


transcription of content #6

Closed joshlong closed 2 months ago

joshlong commented 4 months ago

I should be able to get the audio from a produced podcast and then send that to the transcription service as well.

joshlong commented 2 months ago

It’s working so far. It looks at the audio file for each podcast segment and divides the audio along silence gaps (pretty sure), with the split points expressed in milliseconds. Do I need to prove that silence gaps are converted to milliseconds correctly? Anyway, it divides the files, transcribes all of the pieces, and then sets the transcript on the segment. Then, in theory, our system could show the transcripts from all the segments in the order they’re positioned and allow users to update them. We’d need to invalidate (null out and resubmit) the segment transcript if the audio managed file is updated (written).
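On the milliseconds question, here is a minimal sketch of what such a conversion could look like, not the actual mogul-service code. It assumes the silence gaps arrive sorted, as fractional seconds, and that each cut is made at the midpoint of a gap; `SilenceSplitter`, `Gap`, and `Chunk` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SilenceSplitter {

    /** A detected silence gap, in fractional seconds (assumed input shape). */
    public record Gap(double startSeconds, double endSeconds) {}

    /** A piece of audio to transcribe, in whole milliseconds; the end is exclusive. */
    public record Chunk(long startMillis, long endMillis) {}

    /** Rounds fractional seconds to the nearest millisecond. */
    public static long toMillis(double seconds) {
        return Math.round(seconds * 1000.0);
    }

    /**
     * Cuts the audio at the midpoint of every silence gap.
     * Gaps are assumed to be sorted and non-overlapping.
     */
    public static List<Chunk> split(List<Gap> gaps, double totalSeconds) {
        List<Chunk> chunks = new ArrayList<>();
        long cursor = 0;
        for (Gap gap : gaps) {
            long cut = toMillis((gap.startSeconds() + gap.endSeconds()) / 2.0);
            if (cut > cursor) { // skip cuts that would create an empty chunk
                chunks.add(new Chunk(cursor, cut));
                cursor = cut;
            }
        }
        long end = toMillis(totalSeconds);
        if (end > cursor) { // whatever remains after the last gap
            chunks.add(new Chunk(cursor, end));
        }
        return chunks;
    }
}
```

Keeping the conversion in a pure function like this means it can be unit tested with hand-picked gaps, without touching any audio files.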

Could we have a text editor pop open when a user clicks on a segment’s "transcript" icon?

Behind all of this is generic transcription infrastructure that I could use for all other content types.

joshlong commented 2 months ago

Does Podbean support sending podcast transcripts?

joshlong commented 2 months ago

Does whisper support speaker detection?

joshlong commented 2 months ago

How would I rework this transcription infrastructure to be remote? That is, let's say I wanted to handle all this on another node. I can't serialize Spring `Resource` types.
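One possible direction, sketched under assumptions: instead of passing the `Resource` itself, pass a small serializable message that carries a URI the other node can resolve on its own. `RemoteTranscription`, `TranscriptionRequest`, and its fields are hypothetical names, and the sketch assumes the audio is reachable from the remote node at that URI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.URI;

public class RemoteTranscription {

    /** Hypothetical message: a reference to the audio, not the audio itself. */
    public record TranscriptionRequest(String segmentId, URI audio) implements Serializable {}

    /** Turns the request into bytes that could be put on a queue or sent over the wire. */
    public static byte[] serialize(TranscriptionRequest request) {
        var bytes = new ByteArrayOutputStream();
        try (var out = new ObjectOutputStream(bytes)) {
            out.writeObject(request);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the request on the receiving node. */
    public static TranscriptionRequest deserialize(byte[] data) {
        try (var in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (TranscriptionRequest) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On the receiving side, the URI could be turned back into a `Resource` (for an HTTP location, Spring's `UrlResource` would be one option), so the existing transcription code would still work against `Resource` locally. Plain Java serialization is used here only to keep the sketch self-contained; JSON or any other wire format would serve the same purpose.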

joshlong commented 2 months ago

This is basically done. It'll take larger audio segments, divide them along their silence gaps, and then send all the files to the OpenAI transcription service. Maybe one day I'll rework it to use a Whisper service I deploy and manage myself. But not today.
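A rough sketch of how the last step could stay independent of the transcription backend, assuming the chunk files are already in playback order. The transcriber is passed in as a plain function, so a call to a hosted service and a call to a self-managed one would plug in the same way; `ChunkedTranscriber` and its parameter names are hypothetical.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ChunkedTranscriber {

    /**
     * Transcribes each chunk in order and joins the results into one transcript.
     * Chunks that come back blank (for example, pure silence) are dropped.
     */
    public static String transcribe(List<String> chunkFiles, Function<String, String> transcriber) {
        return chunkFiles.stream()
                .map(transcriber)            // one backend call per chunk
                .map(String::strip)          // trim stray whitespace around each piece
                .filter(text -> !text.isEmpty())
                .collect(Collectors.joining(" "));
    }
}
```

Swapping backends then only means passing a different function; the splitting and joining logic does not change.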