lugia19 / elevenlabslib

Full python wrapper for the elevenlabs API.
MIT License

Yield ReusableInputStreamer as numpy array or something #26

Closed olegchomp closed 8 months ago

olegchomp commented 8 months ago

Hi! I'm interested in whether there is any way to get the converted data as chunks. In my case I need to restream the data to gRPC as bytes, a NumPy array, or an io.wav object. I tried the elevenlabs API, but the data arrives as samples and I can't handle it properly (stuttering and similar problems). Could elevenlabslib be a solution for this?

lugia19 commented 8 months ago

This is possible using voice.stream_audio_no_playback (which returns a queue containing the audio in chunks of numpy arrays) - I haven't added a version of ReusableInputStreamer which does this as I figured it wouldn't be very useful, but I could?

olegchomp commented 8 months ago

That would be great, because for something like real-time communication a lot of time is lost connecting to the API.

olegchomp commented 8 months ago
b = voice.stream_audio_no_playback(text)

for i in b:
    try:
        a = i.get()
        print(a)
    except Exception as e:
        print(e)

Sorry again, but if I try something like this, it returns the whole request in one chunk. I think it might depend on the chunk size, and with longer text there might be more chunks? Also, does this array represent the MP3 data as a NumPy array, or is it already converted to WAV in NumPy format?

lugia19 commented 8 months ago

Added ReusableInputStreamerNoPlayback, which does what it says. Here's a usage example:

input_streamer_no_playback = ReusableInputStreamerNoPlayback(voice)
#The parameter must be an iterator, not a single string, as this is meant for input streaming!
audio_future, transcript_future = input_streamer_no_playback.queue_audio(write())
audio_queue = audio_future.result()
transcript_queue = transcript_future.result()

audio = audio_queue.get()
while audio is not None:
    print(audio)
    audio = audio_queue.get()
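
For restreaming (for example to gRPC), the loop above can be wrapped in a generator. This is a minimal sketch that assumes, as in the loop above, that the queue delivers audio chunks followed by None as an end marker. iter_chunks is a hypothetical helper name, not part of elevenlabslib, and the stand-in queue is filled by hand rather than by the library.

```python
import queue
from typing import Any, Iterator


def iter_chunks(audio_queue: "queue.Queue[Any]") -> Iterator[Any]:
    """Yield items from the queue until the None end marker is seen."""
    while True:
        chunk = audio_queue.get()  # blocks until the next chunk is available
        if chunk is None:
            return
        yield chunk


# Stand-in queue for illustration; the real one would come from audio_future.result().
demo_queue: "queue.Queue[Any]" = queue.Queue()
for item in (b"chunk-1", b"chunk-2", None):
    demo_queue.put(item)

received = list(iter_chunks(demo_queue))
print(received)
```

A generator like this can be handed directly to code that expects an iterator of chunks, so each chunk is forwarded as soon as it arrives instead of after the whole request finishes.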

As far as the chunking goes, yes, it depends on how long the text is, and the chunk size.

And as for the format, numpy arrays are format agnostic - that is, they're their own format. You can turn them into a wav pretty easily, or any other format.
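
One possible way to do the WAV conversion, using NumPy and the standard-library wave module, is sketched below. It assumes the chunks are mono float samples in the range [-1.0, 1.0] and that the sample rate is known; both assumptions should be checked against the actual output. chunks_to_wav_bytes and the 44100 Hz rate are hypothetical choices for illustration, not part of elevenlabslib.

```python
import io
import wave

import numpy as np


def chunks_to_wav_bytes(chunks, sample_rate=44100):
    """Pack float chunks into an in-memory 16-bit mono PCM WAV file."""
    samples = np.concatenate(
        [np.asarray(c, dtype=np.float32).reshape(-1) for c in chunks]
    )
    # Clip to [-1, 1] and scale to 16-bit signed integers (assumes float input).
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # assumed mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)  # assumed rate, not taken from the library
        wav_file.writeframes(pcm.tobytes())
    return buffer.getvalue()


# Stand-in data: two chunks of a 440 Hz tone instead of real library output.
t = np.arange(2000) / 44100
tone = np.sin(2 * np.pi * 440 * t).astype(np.float32)
wav_bytes = chunks_to_wav_bytes([tone[:1000], tone[1000:]])
```

The resulting bytes object is a complete WAV file, so it can be written to disk or sent over the wire as-is. For raw byte streaming without a WAV header, calling tobytes() on each chunk would be enough, provided the receiver knows the sample format.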

olegchomp commented 8 months ago

Thank you!