JordieB / lippy

MIT License

[Feature Request] Bark TTS: Proof of concept #15

Closed JerrickB closed 11 months ago

JerrickB commented 11 months ago

Bark, by Suno AI, is a text-to-audio model. It can generate pauses, inflection, novel speech, and nonverbal sounds (throat clearing, laughter, sighs, etc.).

JerrickB commented 11 months ago

Model input: Memory is often our only connection to who we used to be. Memories are fossils, the bones left by dead versions of ourselves. More potently, our minds are a hungry audience, craving only the peaks and valleys of experience. The bland erodes, leaving behind distinctive bits to be remembered again and again. Painful or passionate, surreal or sublime, we cherish those little rocks of peak experience, polishing them with the ever-smoothing touch of recycled proxy living. In doing, like pagans praying to a sculpted mud figure, we make our memories the gods which judge our current lives.

Output: wit_mem_passion.wav

JerrickB commented 11 months ago

Fantastic results. Not perfect, but pretty solid. Hallucinations can be corrected by running speech-to-text (STT) on the output and re-rendering the sections of audio whose transcript does not match the input text.
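
A rough sketch of what that correction loop might look like, assuming the input is rendered one sentence at a time. Nothing here is from the lippy codebase: `synthesize` and `transcribe` are hypothetical callables standing in for a Bark call and whatever STT engine is used, and the `threshold` and `max_retries` values are illustrative guesses rather than tuned numbers.

```python
import difflib
import re
from typing import Any, Callable, List


def normalize(text: str) -> List[str]:
    """Lowercase and drop punctuation so only the words are compared."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(expected: str, heard: str) -> float:
    """Word-level similarity in [0, 1] between the prompt and the STT transcript."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(heard))
    return matcher.ratio()


def render_with_correction(
    sentences: List[str],
    synthesize: Callable[[str], Any],   # hypothetical: text -> audio clip
    transcribe: Callable[[Any], str],   # hypothetical: audio clip -> text
    threshold: float = 0.9,             # assumed cutoff, not a tuned value
    max_retries: int = 3,               # assumed retry budget
) -> List[Any]:
    """Render each sentence, re-rendering any clip whose transcript drifts from the text."""
    clips = []
    for sentence in sentences:
        best_clip, best_score = None, -1.0
        for _ in range(1 + max_retries):
            clip = synthesize(sentence)
            score = similarity(sentence, transcribe(clip))
            if score > best_score:
                best_clip, best_score = clip, score
            if best_score >= threshold:
                break  # transcript is close enough to the prompt
        # Keep the closest attempt even if no attempt reached the threshold.
        clips.append(best_clip)
    return clips
```

Working per sentence keeps each re-render small, so a single hallucinated phrase does not force regenerating the whole passage. The accepted clips would still need to be concatenated into one output file afterwards.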