gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0
29.49k stars 2.19k forks source link

Audio streaming can't play before all data sent #8177

Open fancyerii opened 2 weeks ago

fancyerii commented 2 weeks ago

I want to let audio output to play before the whole wav file been sent. For example, I have a online tts model that create output audio incrementally every 100ms.

I follow the document here(https://www.gradio.app/guides/reactive-interfaces). I found the play button of html audio player is not clickable until all wav data been sent. To simulate this, I decrease the chunk size and print time information. I am pretty sure that the player can't play before all wav data sent.

import gradio as gr
from pydub import AudioSegment
from time import sleep

with gr.Blocks() as demo:
    input_audio = gr.Audio(label="Input Audio", type="filepath", format="mp3")
    with gr.Row():
        with gr.Column():
            stream_as_file_btn = gr.Button("Stream as File")
            format = gr.Radio(["wav", "mp3"], value="wav", label="Format")
            stream_as_file_output = gr.Audio(streaming=True)

            def stream_file(audio_file, format):
                audio = AudioSegment.from_file(audio_file)
                i = 0
                chunk_size = 1000
                from datetime import datetime
                while chunk_size * i < len(audio):
                    chunk = audio[chunk_size * i : chunk_size * (i + 1)]
                    i += 1
                    if chunk:
                        file = f"/tmp/{i}.{format}"
                        chunk.export(file, format=format)
                        s = datetime.today().strftime('%Y-%m-%d %H:%M:%S')
                        print(s)
                        yield file
                        sleep(0.5)

            stream_as_file_btn.click(
                stream_file, [input_audio, format], stream_as_file_output
            )

            gr.Examples(
                [["audio/cantina.wav", "wav"], ["audio/cantina.wav", "mp3"]],
                [input_audio, format],
                fn=stream_file,
                outputs=stream_as_file_output,
            )

        with gr.Column():
            stream_as_bytes_btn = gr.Button("Stream as Bytes")
            stream_as_bytes_output = gr.Audio(format="bytes", streaming=True)

            def stream_bytes(audio_file):
                chunk_size = 2_000
                from datetime import datetime
                with open(audio_file, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if chunk:
                            s = datetime.today().strftime('%Y-%m-%d %H:%M:%S')
                            print(s)
                            yield chunk
                            sleep(1)
                        else:
                            break
            stream_as_bytes_btn.click(stream_bytes, input_audio, stream_as_bytes_output)

if __name__ == "__main__":
    demo.queue().launch()

So Is there any method to achieve my goal?

ittech17 commented 4 days ago

I am also facing same issue.