gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0

Setting media track constraints. #5024

Open Sreerag-ibtl opened 1 year ago

Sreerag-ibtl commented 1 year ago

Is your feature request related to a problem? Please describe.
Some users of my demo are complaining about the quality of the audio recorded with the audio component. The problem is random and cannot be reproduced reliably. As a workaround, I built a simple audio recorder in plain JavaScript. After trying several combinations of media track constraints, I got better results. I would like to use the same settings in the Gradio demo as well. Is it possible to accept them as arguments in the Python API?

Describe the solution you'd like
Probably accept the media track constraints as arguments in the Python API.

Additional context
MediaTrackConstraints: https://developer.mozilla.org/en-US/docs/Web/API/MediaTrackConstraints

dawoodkhan82 commented 1 year ago

@Sreerag-ibtl Thanks for reporting this issue. I'm not sure we want to expose media track constraints as part of the Python API, though. What combination of constraints ended up helping audio quality? Maybe we can look into why the quality is low in some cases.

Sreerag-ibtl commented 1 year ago

@dawoodkhan82 Currently, I use the below config:

{ echoCancellation: false, noiseSuppression: false, autoGainControl: true }

The quality issue is random in general, but it was consistent when using earbuds across multiple operating systems and devices.
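For reference, here is a minimal sketch of how a plain-JavaScript recorder might apply the config above. It assumes the standard browser `navigator.mediaDevices.getUserMedia` API. The names `AudioTrackSettings`, `MediaDevicesLike`, `RECORDER_AUDIO` and `openMicrophone` are hypothetical, and the media-devices object is passed in as an argument only so the sketch can be exercised outside a browser.

```typescript
// Hypothetical minimal types so the sketch type-checks without DOM typings.
interface AudioTrackSettings {
  echoCancellation?: boolean;
  noiseSuppression?: boolean;
  autoGainControl?: boolean;
}

interface MediaDevicesLike<S> {
  getUserMedia(constraints: { audio: AudioTrackSettings }): Promise<S>;
}

// The combination reported above: echo cancellation and noise suppression
// switched off, automatic gain control left on.
const RECORDER_AUDIO: AudioTrackSettings = {
  echoCancellation: false,
  noiseSuppression: false,
  autoGainControl: true,
};

// Request a microphone stream with the given audio track constraints.
// In a browser, pass `navigator.mediaDevices` as `devices`.
async function openMicrophone<S>(
  devices: MediaDevicesLike<S>,
  audio: AudioTrackSettings = RECORDER_AUDIO
): Promise<S> {
  return devices.getUserMedia({ audio });
}
```

Browsers treat these values as preferences, not guarantees, so the settings actually applied may differ by device.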

abidlabs commented 1 year ago

@dawoodkhan82 I can see two potential solutions:

(1) We use these parameters by default. We should make sure that this doesn't break audio quality anywhere else.
(2) We leave this for a new community component, where people can set these configurations via parameters in Python.

WDYT?

abidlabs commented 11 months ago

Hey! We've now made it possible for Gradio users to create their own custom components -- meaning that you can write some Python and JavaScript (Svelte), and publish it as a Gradio component. You can use it in your own Gradio apps, or share it so that anyone can use it in their Gradio apps. Here are some examples of custom Gradio components:

You can see the source code for those components by clicking the "Files" icon and then clicking "src". The complete source code for the backend and frontend is visible. In particular, it's very fast to build off an existing component. We've put together a Guide: https://www.gradio.app/guides/five-minute-guide, and we're happy to help. Hopefully this will help address this issue.

abidlabs commented 10 months ago

cc @hannahblair do you have thoughts on whether we should include this in the Python API?

freddyaboulton commented 2 days ago

I ended up doing this for the webrtc component. I think it's reasonable to want to tweak some of these settings. For example, images captured from the webcam are huge.


export async function get_video_stream(
    include_audio: boolean,
    video_source: HTMLVideoElement,
    device_id?: string
): Promise<MediaStream> {
    const size = {
        width: { ideal: 1920 },
        height: { ideal: 1440 }
    };
    ....

I think we should just allow passing a Python dict through to the getUserMedia call.
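A rough sketch of what that could look like on the frontend, assuming the Python dict reaches the client as a JSON-serialisable object. The helper name `build_video_constraints`, the `TrackConstraints` type and the shallow-merge behaviour are assumptions for illustration, not the webrtc component's actual code. Only the default size values come from the snippet above.

```typescript
// Hypothetical loose type: any JSON-serialisable constraint set sent from Python.
type TrackConstraints = Record<string, unknown>;

// Default webcam size, taken from the snippet above.
const DEFAULT_SIZE: TrackConstraints = {
  width: { ideal: 1920 },
  height: { ideal: 1440 },
};

// Build the object that would be handed to getUserMedia.
// User-supplied keys override the defaults (shallow merge per top-level key).
function build_video_constraints(
  include_audio: boolean,
  user_constraints: TrackConstraints = {}
): { video: TrackConstraints; audio: boolean } {
  const video: TrackConstraints = { ...DEFAULT_SIZE, ...user_constraints };
  return { video, audio: include_audio };
}

// Example: a Python-side dict {"width": {"ideal": 640}, "height": {"ideal": 480}}
// would arrive here as a plain object and shrink the captured frames.
const small_webcam = build_video_constraints(false, {
  width: { ideal: 640 },
  height: { ideal: 480 },
});
```

In a browser, the result would then be passed as `navigator.mediaDevices.getUserMedia(small_webcam)`. The shallow merge keeps the existing defaults for any key the user leaves unset, so current behaviour would be unchanged when no dict is supplied.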