tvolk131 / fdr-finder


Add support for speech-to-text searching #16

Open tvolk131 opened 3 years ago

tvolk131 commented 3 years ago

If we run some or all podcasts through a speech-to-text service such as Google Cloud Speech-to-Text, it will open up new possibilities for search. For example, Google's converter can return a timestamp for each word, so we could let users search for a word and jump to the specific points in a podcast where that word is used.

tvolk131 commented 3 years ago

I'm currently writing a design document for this, but I believe the high-level functionality will look something like this...

Users can search for words or phrases and find all podcasts where that word or phrase is found somewhere in the podcast transcription.

Within a single podcast, we can search for individual instances of words or phrases and allow users to jump to the specific spots where they are mentioned.
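To make the two search modes above concrete, here is a minimal in-memory sketch. It assumes each transcript is already available as a list of (word, start time in seconds) pairs; the function names and the data shape are hypothetical and are not tied to any particular speech-to-text response format or storage backend.

```python
from collections import defaultdict


def build_index(transcripts):
    """Build a per-word index: word -> podcast id -> list of start times.

    `transcripts` is assumed to look like
    {podcast_id: [(word, start_seconds), ...]} (hypothetical shape).
    """
    index = defaultdict(lambda: defaultdict(list))
    for podcast_id, words in transcripts.items():
        for word, start in words:
            # Lowercase so that lookups are case-insensitive.
            index[word.lower()][podcast_id].append(start)
    return index


def podcasts_containing(index, word):
    """Return the ids of all podcasts whose transcript contains `word`."""
    return sorted(index.get(word.lower(), {}))


def jump_points(index, word, podcast_id):
    """Return the sorted timestamps where `word` is spoken in one podcast."""
    return sorted(index.get(word.lower(), {}).get(podcast_id, []))
```

The first lookup answers "which podcasts mention this word?", and the second gives the positions a player could seek to within one podcast.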

As for how both would be implemented: for individual words, it would make sense to use a per-word inverted index (reverse index) stored as a MongoDB collection. Phrases could be trickier. We might still be able to use the same index by splitting the phrase into individual words, looking up each word, and treating any case where all the words have similar timestamps as a probable hit for that phrase.
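The phrase heuristic could be sketched roughly as follows. This is only an illustration under assumptions: the per-word timestamps for a single podcast are given as a plain dict rather than a MongoDB collection, "similar timestamps" is interpreted as each word starting within `max_gap` seconds after the previous one, and the words must appear in order. The function name and the `max_gap` default are hypothetical.

```python
def find_phrase(postings, phrase, max_gap=1.5):
    """Return start times of probable occurrences of `phrase`.

    `postings` maps each word to an ascending list of start times (seconds)
    within a single podcast. A hit is a chain of timestamps, one per word of
    the phrase, where each word starts after the previous one and no more
    than `max_gap` seconds later (an assumed tolerance).
    """
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    for start in postings.get(words[0], []):
        prev = start
        for word in words[1:]:
            # Earliest occurrence of the next word shortly after the previous one.
            nxt = next(
                (t for t in postings.get(word, []) if prev < t <= prev + max_gap),
                None,
            )
            if nxt is None:
                break  # Chain is broken; this start is not a hit.
            prev = nxt
        else:
            # Every word in the phrase was found in order.
            hits.append(start)
    return hits
```

Because the match relies only on timing, it can produce false positives when other words fall between the phrase's words inside the allowed gap. Storing each word's position in the transcript alongside its timestamp would allow an exact adjacency check instead.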

Both of the cases above will probably use an Elasticsearch phrase suggester to first correct any typos, or possibly to provide Google-esque 'did you mean?' functionality.
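As a rough idea of what the suggester step might involve, the sketch below builds a phrase suggester request body and reads the top correction out of a response. The field name `transcript.trigram` and the suggestion name `did_you_mean` are assumptions for illustration; a real setup would need an index mapping with a matching shingle-analyzed subfield, and sending the request is left out.

```python
def build_phrase_suggest_body(query_text, field="transcript.trigram"):
    """Build a search request body that uses the phrase suggester.

    `field` is a hypothetical trigram (shingle) subfield of the transcript.
    """
    return {
        "suggest": {
            "text": query_text,
            "did_you_mean": {  # arbitrary name chosen for this suggestion
                "phrase": {
                    "field": field,
                    "size": 1,
                    "gram_size": 3,
                    "direct_generator": [
                        {"field": field, "suggest_mode": "always"}
                    ],
                }
            },
        }
    }


def best_suggestion(response, name="did_you_mean"):
    """Return the top suggested text from a response, or None if there is none."""
    for entry in response.get("suggest", {}).get(name, []):
        options = entry.get("options", [])
        if options:
            return options[0]["text"]
    return None
```

If `best_suggestion` returns a string that differs from the user's query, the UI could either search with the corrected text directly or offer it as a 'did you mean?' link.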