WWW
As a researcher who has completed a recent literature review (summarized here), I would like to see whether atypical speech can be synthesized as an audio completion task using audio tokens extracted from neural codecs. Two models, VALL-E and VALL-E X, were recently released, with unofficial open-source implementations (not by the original authors): https://github.com/Plachtaa/VALL-E-X/tree/master
AC
The goal of this ticket is to explore whether audio prompt completion is a suitable task for our use case. A few unknowns here are:
How can we keep the audio completion grounded so that it does not hallucinate?
How can we get accurate speech recognition transcriptions and/or condition the generation so that it produces what is needed?
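One possible way to probe the grounding question is to transcribe each generated completion with an ASR system and compare the transcript against the intended text using word error rate (WER). The sketch below is only an illustration of that idea, not part of VALL-E X: the function names and the `max_wer` threshold are hypothetical, and it assumes the ASR transcript is already available as a string. Note that ASR is itself likely to be unreliable on atypical speech, so a high WER may reflect recognition errors rather than hallucination.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between two texts, divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        # No reference words: perfect only if the hypothesis is also empty.
        return 0.0 if not hyp else 1.0

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def is_grounded(target_text: str, asr_transcript: str, max_wer: float = 0.2) -> bool:
    """Hypothetical check: accept a completion if its transcript stays close to the target.

    max_wer is an arbitrary placeholder threshold, not a recommended value.
    """
    return word_error_rate(target_text, asr_transcript) <= max_wer
```

A check like this could be used to filter or rank several sampled completions for the same prompt, keeping only those whose transcripts match the intended text.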