micahstubbs / voices-of-vr-data

a collection of scripts, data, and metadata related to the Voices of VR podcast

Create full transcripts #10

Open sirkitree opened 5 years ago

sirkitree commented 5 years ago

The task at hand is to get full transcripts of podcasts, by whatever means we can.

Once we have transcripts, ideally produced by an automated process, we can begin automatic tagging, categorization, and other calculations on that data.
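
As a rough illustration of the kind of automatic tagging a transcript would enable, here is a minimal Python sketch that picks the most frequent non-trivial words as candidate tags. The function name `top_terms`, the tiny stopword list, and the sample sentence are all assumptions made for illustration; nothing like this exists in the repo yet.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "this"}


def top_terms(transcript: str, n: int = 5) -> list[str]:
    """Return the n most frequent non-stopword terms in a transcript."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    # most_common keeps first-seen order among ties.
    return [term for term, _ in counts.most_common(n)]


if __name__ == "__main__":
    sample = "VR headset and VR tracking and the headset"
    print(top_terms(sample, 2))
```

Raw word counts are only a starting point; they say nothing about topics or speakers, but they are enough to check whether a transcript is usable for tagging at all.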

sirkitree commented 5 years ago

Here's an idea we can try using Google Docs: https://qz.com/work/1087765/how-to-transcribe-audio-fast-and-for-free-using-google-docs-voice-typing/

sirkitree commented 5 years ago

Here's a recommended transcription service that charges $1/minute. Even that might be expensive given the number of minutes accumulated, but I wanted to write it down at least: https://www.rev.com/
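
For a back-of-the-envelope sense of what $1/minute adds up to, a small Python sketch could sum episode lengths and multiply by the rate. The function name `estimate_cost` and the episode durations below are placeholders, not real figures for the podcast; only the $1/minute rate comes from the comment above.

```python
def estimate_cost(episode_minutes: list[float], rate_per_minute: float = 1.00) -> float:
    """Estimate the total transcription cost in dollars for a list of episode lengths."""
    total_minutes = sum(episode_minutes)
    return total_minutes * rate_per_minute


if __name__ == "__main__":
    # Placeholder durations in minutes; swap in the real episode lengths.
    durations = [30, 45.5, 60]
    print(f"Estimated cost: ${estimate_cost(durations):.2f}")
```

Feeding in the real episode durations from the podcast metadata would give the actual total to weigh against the free options.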

inspired12 commented 5 years ago

Google recently launched Live Transcribe: https://play.google.com/store/apps/details?id=com.google.audio.hearing.visualization.accessibility.scribe&hl=en_US. It might be possible to get a transcription while listening to the podcast in the car.

micahstubbs commented 5 years ago

oooh this looks promising, thanks for the tip @inspired12!

micahstubbs commented 4 years ago

this might be useful as well: https://ai.facebook.com/blog/online-speech-recognition-with-wav2letteranywhere/ (h/t @sirkitree). From the post:

The process of transcribing speech in real time from an input audio stream is known as online speech recognition. Most automatic speech recognition (ASR) research focuses on improving accuracy without the constraint of performing the task in real time. For applications like live video captioning or on-device transcriptions, however, it is important to reduce the latency between the audio and the corresponding transcription. In these cases, online speech recognition with limited time delay is needed to provide a good user experience. To solve for this need, we have developed and open-sourced wav2letter@anywhere, an inference framework that can be used to perform online speech recognition. Wav2letter@anywhere builds upon Facebook AI’s previous releases of wav2letter and wav2letter++.