kpister / oratio

Open Source Video Localization Pipeline
BSD 3-Clause "New" or "Revised" License

Tag individual sentences as speakers #8

Open kpister opened 4 years ago

kpister commented 4 years ago

We need to start considering multispeaker transitions. One step here would be tagging each sentence with a speaker identity that tracks which synthesis model is used for it (e.g. en-US-wavenet-A, en-US-wavenet-B or AWS Dave). This would allow for manual implementation of multispeaker videos before we start working with people.
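
A rough sketch of what per-sentence tagging could look like, not a description of how oratio currently represents sentences. The `Sentence` class, the `speaker` field, the speaker labels, the `VOICE_FOR_SPEAKER` mapping, and both helper functions are hypothetical names made up for illustration; only the voice names come from the examples above.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Sentence:
    """Hypothetical sentence record: the text plus a speaker tag."""
    text: str
    speaker: str = "default"  # assumed label; untagged sentences use the default voice


# Assumed mapping from speaker tag to synthesis voice (illustrative only).
VOICE_FOR_SPEAKER = {
    "default": "en-US-wavenet-A",
    "guest": "en-US-wavenet-B",
}


def voice_for(sentence, voices=VOICE_FOR_SPEAKER):
    """Return the voice for a sentence, falling back to the default voice."""
    return voices.get(sentence.speaker, voices["default"])


def group_by_voice(sentences, voices=VOICE_FOR_SPEAKER):
    """Batch consecutive sentences that share a voice, preserving order."""
    return [
        (voice, [s.text for s in run])
        for voice, run in groupby(sentences, key=lambda s: voice_for(s, voices))
    ]


if __name__ == "__main__":
    script = [
        Sentence("Welcome to the show."),
        Sentence("Thanks for having me.", speaker="guest"),
        Sentence("Glad to be here.", speaker="guest"),
        Sentence("Let's get started."),
    ]
    for voice, texts in group_by_voice(script):
        print(voice, texts)
```

Tags would be assigned by hand at first, which matches the manual workflow described above. Grouping consecutive sentences by voice is one possible design choice: each speaker transition becomes a batch boundary, so each batch can go to a single synthesis model.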