AlexChesser / lgtm-shipit

The project organization repository for the LGTM: ShipIt! podcast.
MIT License

Accessibility: consider a transcription service #7

Open AlexChesser opened 3 years ago

AlexChesser commented 3 years ago

theshrike79 1 day ago: Please provide a text version of the content. Podcasts aren't searchable, they aren't skimmable, and their content isn't indexed in search engines. Text is.

AlexChesser commented 3 years ago

+1

Beyond all those very pragmatic reasons, there are also people like me whose personality type (introversion) makes us find listening to audio an emotionally draining task as it simulates interacting with people. I don't get energized by it--it is work. And why should I give you my working energy when I could just as easily have skim-read a transcript or blog post with the same information and gotten energized by the self-study approach?

THAT SAID, you said that you want to do interviews... and that is something that really calls out for an audio medium. Sure you can transcribe it, and people like me appreciate the transcription, but I recognize that the audio is the original source material and transcription is an additional expense. I'd rather have the full interview available than some highly edited transcript... but I would hope that there is at least a short summary of the content in the description of each segment.

AlexChesser commented 3 years ago

I use Descript for my podcast; £22 a month gets me ~30 hours of transcription. This probably sounds like an ad but really, it’s one of the best software products I’ve used in a very long time.

This is the most recent episode I’ve published. The giant wall of text took a bit of editing, but it wouldn’t have been possible without Descript. https://www.mql.fm/005-working-in-martech-dan-graap

AlexChesser commented 3 years ago

prewett 12 hours ago: IBM Watson, Google, Azure, and AWS all have speech-to-text APIs. IBM's claims to distinguish between different voices, although when I used it a couple of years ago for Japanese it was a little lackluster. It's pretty inexpensive: IBM gives you 500 minutes free per month and is a few cents / minute thereafter; Google gives you 60 minutes free, and it's 4c/min thereafter. These are APIs rather than services, so you'd have to write a client to use them. IBM's API (and maybe the others) allows you to request time stamps on the output, so you could let people click on your transcription and seek directly to the part of the video they are interested in.

I suppose you could always do it on your own machine with Sphinx, too, although I don't know how it compares to the others in accuracy.
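
To make the click-to-seek idea above concrete, here is a rough sketch in Python. It assumes the transcript has already been reduced to segments with a start time in seconds. The `Segment` shape, the function names, and the example URL are all made up for illustration and do not match any particular vendor's response format. The links use the `#t=<seconds>` media fragment, which browsers generally honor when opening an audio or video file directly; a custom player would need its own seek handling.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Segment:
    """One chunk of transcript (hypothetical shape, not a vendor format)."""
    start: float   # start time in seconds from the beginning of the episode
    speaker: str   # speaker label, e.g. from a diarization feature
    text: str      # what was said


def format_timestamp(seconds: float) -> str:
    """Render a start time as HH:MM:SS, dropping fractional seconds."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments, audio_url: str) -> str:
    """Return one HTML paragraph per segment, each linking to its start time."""
    lines = []
    for seg in segments:
        # Assumption: the target honors the '#t=' media fragment for seeking.
        href = f"{audio_url}#t={int(seg.start)}"
        lines.append(
            f'<p><a href="{escape(href, quote=True)}">'
            f"[{format_timestamp(seg.start)}]</a> "
            f"<strong>{escape(seg.speaker)}:</strong> {escape(seg.text)}</p>"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        Segment(0.0, "Host", "Welcome to the show."),
        Segment(75.4, "Guest", "Thanks for having me."),
    ]
    # The URL below is a placeholder, not a real episode.
    print(render_transcript(demo, "https://example.com/episode.mp3"))
```

The resulting HTML is plain text, so it also answers the original request: it is searchable, skimmable, and indexable, while each timestamp still points back to the audio as the source material.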