umcu / clinlp

A Python library for performing NLP on clinical text written in Dutch
GNU General Public License v3.0

Basic speech-to-text #102

Open bramiozo opened 6 days ago

bramiozo commented 6 days ago

Describe the feature

A basic speech-to-text function that ingests a .wav file and outputs a dictionary:

document:
  id: XX
  text_estimate: blaat whooop ...
  words:
    - id: 0
      text_verbose: bladiebla
      text_estimate: blaat
      start_dt: 00:00:13
      end_dt: 00:12:00
    - id: 1
      text_verbose: whoopwhoop
      text_estimate: whooop
      start_dt: 00:15:13
      end_dt: 00:20:21
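
To make the proposed output concrete, here is a minimal Python sketch of how such a dictionary could be assembled once per-word results are available. It is not part of clinlp and does no speech recognition itself. The names `Word` and `build_document` are hypothetical, the per-word entries are collected in a list under an assumed `words` key (a dictionary cannot repeat the same key), timestamps are assumed to be offsets in seconds, and the example timing values are made up for illustration. The reading of `text_verbose` as the literal transcription and `text_estimate` as the best-guess word is also an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Word:
    """One recognized word (hypothetical structure, mirroring the fields above)."""

    id: int
    text_verbose: str   # assumption: literal transcription of what was heard
    text_estimate: str  # assumption: best guess at the intended word
    start_dt: float     # assumption: offset from the start of the audio, in seconds
    end_dt: float       # assumption: offset from the start of the audio, in seconds


def build_document(doc_id: str, words: List[Word]) -> Dict:
    """Assemble the nested dictionary from a list of recognized words."""
    # Order words by start time so the document-level text reads chronologically.
    ordered = sorted(words, key=lambda w: w.start_dt)
    return {
        "document": {
            "id": doc_id,
            # Document-level estimate is the per-word estimates joined by spaces.
            "text_estimate": " ".join(w.text_estimate for w in ordered),
            "words": [asdict(w) for w in ordered],
        }
    }


# Illustrative input only; the timing values are invented for this example.
example = build_document(
    "XX",
    [
        Word(0, "bladiebla", "blaat", 0.5, 1.2),
        Word(1, "whoopwhoop", "whooop", 1.5, 2.1),
    ],
)
```

Keeping the words in a list preserves their order and lets each entry carry its own `id`; the actual transcription step (reading the .wav and producing the words) would sit in front of `build_document` and is deliberately left out here.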

A use case for the feature

Fast, verbose speech-to-text with timestamps.

Would you like to be involved in development?

Yo :D

Additional context

There is a concrete use case for this regarding pediatrics and language development.