RageAgainstThePixel / com.rest.elevenlabs

A non-official Eleven Labs voice synthesis client for Unity (UPM)
https://elevenlabs.io/?from=partnerbrown9849
MIT License

Variable Gap between playing queued streams #44

Closed: yosun closed this 9 months ago

yosun commented 9 months ago

Feature Request

Variable Gap between playing queued streams

Is your feature request related to a problem? Please describe.

The current queued playback seems to distort the voices.

Describe the solution you'd like

Adding a gap of 0.2 seconds, or a configurable delay, between consecutive clips might help.

StephenHodgson commented 9 months ago

You should be able to handle the queue playback in your implementation.

The API will send you clips as soon as they're loaded, which could be faster than the playback speed.

StephenHodgson commented 9 months ago

See the example implementation in the samples scene:

StephenHodgson commented 9 months ago

https://github.com/RageAgainstThePixel/com.rest.elevenlabs/blob/3ee6fdd5395d4de1a31a0bb0d7b88288c7a1337d/ElevenLabs/Packages/com.rest.elevenlabs/Samples~/TextToSpeech/TextToSpeechDemo.cs#L68-L76

yosun commented 9 months ago

// Fragment of a MonoBehaviour. client, defaultVoiceId, APIKEY_11,
// lifetimeCancellationTokenSource and SetVoice are declared elsewhere in the class.
// Needs: using System.Collections; using System.Collections.Generic;
//        using System.Threading; using UnityEngine;
private AudioSource audioSource;
private readonly Queue<AudioClip> streamClipQueue = new Queue<AudioClip>();

// Pause inserted after each clip, in seconds. Adjust for a smoother transition.
private float gapSeconds = 0.1f;

private void Start()
{
    client = new ElevenLabsClient(APIKEY_11);
    audioSource = GetComponent<AudioSource>(); // Assign the AudioSource here

    if (!string.IsNullOrEmpty(defaultVoiceId))
    {
        SetVoice(defaultVoiceId);
    }

    lifetimeCancellationTokenSource = new CancellationTokenSource();

    StartCoroutine(PlayStreamedClips());
}

// Plays queued clips one at a time, in the order they were enqueued.
private IEnumerator PlayStreamedClips()
{
    while (true)
    {
        if (streamClipQueue.Count > 0)
        {
            var nextClip = streamClipQueue.Dequeue();
            Debug.Log($"Playing {nextClip.name}");

            audioSource.clip = nextClip;
            audioSource.Play();

            // Wait for the current clip to finish before starting the next one.
            yield return new WaitUntil(() => !audioSource.isPlaying);

            // Leave a short gap between consecutive clips.
            yield return new WaitForSeconds(gapSeconds);
        }
        else
        {
            // Nothing queued yet; check again next frame.
            yield return null;
        }
    }
}

yosun commented 9 months ago

Hey, actually this is a bug: it seems that your dequeue system swaps the order of clips for short words. Your Eleven Labs call should check the text queue order before playing the next queued clip.

https://github.com/RageAgainstThePixel/com.rest.elevenlabs/issues/46
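
One way to guard against the reordering described above, sketched under assumptions: if the caller tags each text request with a sequence number when it is sent, a small reorder buffer can hold back clips that arrive early. `OrderedBuffer` and its members are hypothetical names used for illustration. They are not part of com.rest.elevenlabs, and this is not necessarily how issue #46 was resolved.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of com.rest.elevenlabs): releases items strictly
// in the order of the sequence numbers the caller assigned when sending requests.
public sealed class OrderedBuffer<T>
{
    private readonly Dictionary<int, T> pending = new Dictionary<int, T>();
    private int nextToRelease; // sequence numbers are assumed to start at 0

    // Stores the item that arrived for the given sequence number, then returns
    // every item that is now ready, in sequence order. The list is empty while
    // an earlier sequence number is still outstanding.
    public List<T> Add(int sequence, T item)
    {
        pending[sequence] = item;
        var ready = new List<T>();

        while (pending.TryGetValue(nextToRelease, out var next))
        {
            pending.Remove(nextToRelease);
            ready.Add(next);
            nextToRelease++;
        }

        return ready;
    }
}
```

With this in place, a short phrase that finishes loading before a longer one requested earlier stays in the buffer until the longer clip arrives. Only the clips returned by `Add` would then be enqueued for playback.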