gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0
32.36k stars 2.42k forks

Feature request: Get reproduction status information from Audio component #9126

Open Shiul93 opened 1 month ago

Shiul93 commented 1 month ago

Is your feature request related to a problem? Please describe.

The problem I'm having is similar to the one described in #6088. I'm developing an app that depends on the timing of text-to-speech audio that is streamed to an Audio component as NumPy arrays. I currently have no way to get the timing right, because several pieces of text are spoken one after another. The only way I can predict the duration is from the byte length of the audio. However, because some non-deterministic processing happens in the background, the timing usually ends up misaligned. For example, if I want to execute an action 5 seconds before the stream ends, it might run 15 seconds before the end because of such delays.
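For reference, the byte-length estimate described above comes down to dividing the amount of audio by the sample rate. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the chunks are assumed to be NumPy arrays with time on axis 0, and the byte-based variant assumes uncompressed 16-bit PCM. It gives only the nominal duration and does not account for buffering or network delay, which is the misalignment this issue is about.

```python
import numpy as np


def chunk_duration_seconds(chunk: np.ndarray, sample_rate: int) -> float:
    """Nominal duration of one chunk: number of frames divided by the sample rate."""
    # Assumption: axis 0 is time, i.e. shape (n,) for mono or (n, channels) otherwise.
    return chunk.shape[0] / sample_rate


def duration_from_bytes(n_bytes: int, sample_rate: int,
                        channels: int = 1, bytes_per_sample: int = 2) -> float:
    """Nominal duration of raw PCM audio computed from its size in bytes."""
    # Assumption: uncompressed PCM, 16-bit samples by default.
    return n_bytes / (sample_rate * channels * bytes_per_sample)


# Illustrative values only: two silent chunks at 24 kHz.
sample_rate = 24_000
chunks = [np.zeros(12_000, dtype=np.int16), np.zeros(36_000, dtype=np.int16)]
total = sum(chunk_duration_seconds(c, sample_rate) for c in chunks)
print(total)  # 2.0
```

Because this figure is derived from the data alone, it says how long the audio should last, not when the browser actually plays it.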

Describe the solution you'd like

It would be helpful to have access to information about the stream from the component, such as the current playback time, the remaining time, and the size in bytes of the audio queue.

Events for detecting when the audio queue is empty would also be appreciated. In my case, when the audio generator is idle, playback has effectively stopped, but the component still reports a playing state even though the playback time is no longer advancing. As a result, none of the available callbacks are triggered in this situation.
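To make the requested quantities concrete, here is a rough server-side approximation of playback time, remaining time, and an empty queue. It is only a sketch: `PlaybackEstimator` is a hypothetical helper, not a Gradio API, and it assumes playback starts when the first chunk is queued and then runs without stalls. That assumption is exactly what breaks in practice, which is why values reported by the component itself would be preferable.

```python
import time
from typing import Callable, Optional


class PlaybackEstimator:
    """Wall-clock estimate of stream playback state (hypothetical, not a Gradio API)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock                        # injectable so it can be tested
        self._started_at: Optional[float] = None   # time the first chunk was queued
        self._queued_seconds = 0.0                 # total nominal audio queued so far

    def add_chunk(self, n_samples: int, sample_rate: int) -> None:
        """Record a chunk as queued; the first chunk starts the clock."""
        if self._started_at is None:
            self._started_at = self._clock()
        self._queued_seconds += n_samples / sample_rate

    def elapsed(self) -> float:
        """Estimated playback position, capped at the amount of audio queued."""
        if self._started_at is None:
            return 0.0
        return min(self._clock() - self._started_at, self._queued_seconds)

    def remaining(self) -> float:
        """Estimated seconds of queued audio not yet played."""
        return self._queued_seconds - self.elapsed()

    def is_idle(self) -> bool:
        """True when the estimate says the queue has run dry."""
        return self.remaining() <= 0.0


# Usage with a fake clock so the numbers are deterministic.
now = [0.0]
estimator = PlaybackEstimator(clock=lambda: now[0])
estimator.add_chunk(48_000, 24_000)   # 2 seconds of audio at 24 kHz
now[0] = 0.5
print(estimator.remaining())          # 1.5
```

An action meant to run 5 seconds before the end could poll `remaining()`, but it would inherit whatever drift exists between this estimate and the real player.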

Additional context

I've searched the existing issues for a solution but haven't found one so far. If this has already been resolved, please feel free to close the issue.

Thank you for your attention!

Best regards, Luis Llamas

freddyaboulton commented 1 month ago

Hi @Shiul93 - I think this would be a good use case for a custom component. You can template off the Audio component by running gradio cc create MyAudio --template Audio, and then modify the AudioPlayer.svelte file to, for example, get the current timestamp of the audio being played. Example code:

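<script>
    // Excerpt: value, waveform_settings, and audio_player are defined
    // elsewhere in the templated AudioPlayer.svelte.

    // Playback position in seconds; stays at -1 until the player reports a value.
    let time = -1;

    // Reactive statement: re-runs whenever `time` changes.
    $: console.log("time", time);
</script>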

<audio
    class="standard-player"
    class:hidden={!(value && value.is_stream)}
    controls
    autoplay={waveform_settings.autoplay}
    on:load
    bind:this={audio_player}
    bind:currentTime={time}
/>

Would this kind of approach work for you?

Shiul93 commented 1 month ago

Hello @freddyaboulton! I will try something like that, but I'm not fluent in JS/Svelte development, and I wouldn't know how to relay the data to the server running the backend. A direct approach from the Python backend would be very nice. I'll check out custom components anyway.

Thanks for your answer!

freddyaboulton commented 1 month ago

> direct approach from the Python backend would be very nice.

Not sure how that would work in the general case. But we can use your custom component and use case to guide that discussion!