Don't require user gesture when capturing user media

beaufortfrancois commented 5 years ago

Video meeting web apps would benefit from automatic Picture-in-Picture behavior when the user switches back and forth between the web app and other applications/tabs. This is currently not possible because of the user gesture requirement in https://wicg.github.io/picture-in-picture/#request-pip (step 6).

I'm proposing not to enforce the user gesture requirement when requesting Picture-in-Picture if the document is capturing user media with getUserMedia().

What do you think @mounirlamouri?

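// Assumes a context where top-level await is allowed (e.g. a module script).
const video = document.createElement('video');
video.srcObject = await navigator.mediaDevices.getUserMedia({ video: true });

// Under this proposal, this request would be allowed without a user gesture.
window.onblur = () => { video.requestPictureInPicture(); };
// Exit only when something is in PiP; exitPictureInPicture() rejects otherwise.
window.onfocus = () => { if (document.pictureInPictureElement) document.exitPictureInPicture(); };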

mounirlamouri commented 5 years ago

My concern with this is that it is very specific to some use cases and wouldn't allow a whole class of use cases to work. Also, I do not know about iOS, but on Android I do not think this could be implemented, because the Android API expects Picture-in-Picture to happen synchronously when the activity gets hidden, and the blur event would happen too late.

beaufortfrancois commented 5 years ago

> My concern with this is that it is very specific to some use cases and wouldn't allow a whole class of use cases to work.

It is specific, you're right. However, I'd like users to be able to jump on a video call by simply going to a website that would provide the best of PiP: autoPiP.

Ephemeral and seamless are key there and I think getUserMedia() provides a strong signal to allow this kind of interaction. And it is backed up by a user permission prompt already.

Do you have something in mind that would allow this for a whole class of use cases?

> Also, I do not know about iOS, but on Android I do not think this could be implemented, because the Android API expects Picture-in-Picture to happen synchronously when the activity gets hidden, and the blur event would happen too late.

Sorry, I'm not sure I understand this argument. What prevents us from calling the Android code getActivity().enterPictureInPictureMode(); when the JavaScript visibilitychange event listener fires, for instance?
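
For illustration, the page side of a visibilitychange-driven variant might look like the sketch below. The helper name installAutoPiP is hypothetical, and `doc` and `video` are passed in as parameters only so the wiring can be exercised outside a browser. Under the current spec the request would still reject without a user gesture, so this only shows what the proposal would make possible.

```javascript
// Sketch only: auto-PiP driven by visibilitychange instead of blur/focus.
// installAutoPiP is a hypothetical helper, not part of any spec or library.
function installAutoPiP(doc, video) {
  const onVisibilityChange = () => {
    if (doc.visibilityState === 'hidden') {
      // Today this rejects without a user gesture; the proposal would
      // allow it while the document is capturing user media.
      if (!doc.pictureInPictureElement) {
        Promise.resolve(video.requestPictureInPicture()).catch(() => {});
      }
    } else if (doc.pictureInPictureElement) {
      // Back in the foreground: leave PiP if something is in it.
      Promise.resolve(doc.exitPictureInPicture()).catch(() => {});
    }
  };
  doc.addEventListener('visibilitychange', onVisibilityChange);
  // Return a cleanup function that removes the listener again.
  return () => doc.removeEventListener('visibilitychange', onVisibilityChange);
}
```

In a page this would be called as installAutoPiP(document, video) once getUserMedia() has resolved and the stream is attached to the video element.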

beaufortfrancois commented 5 years ago

@jernoble I'd love to hear your thoughts on this Auto Picture-in-Picture behavior we're trying to accomplish.

jernoble commented 5 years ago

I’m inclined to say that pip should always require a user gesture, and that loosening this requirement will lead to a slippery slope of sites with similar use cases demanding the restriction be lifted for them as well.

That said, iOS Safari does have an “auto-pip” behavior when an app is dismissed and video is being presented in full screen. This proposed behavior is pretty similar in effect.

Hypothetically speaking, if we did agree to enable this use case, I can see it working one of three ways:

1) Once gUM() is started, the page can call requestPiP() freely without restriction.
2) Once gUM() is started, the “visibilitychange” event will be considered a user gesture for the purposes of pip.
3) A site can opt into auto-pip on visibility change declaratively.

Option 1) seems ripe for abuse and even purely accidental bad behavior. Option 2) may, as Mounir says, be too late to look correct. Option 3) gives the UA enough information to correctly implement an “auto-pip on visibility change” feature without opening up a security or annoyance can of worms.
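
To make the difference between the three options concrete, here is a toy model of the check a user agent might apply under each one. All function and field names are hypothetical and come from neither the spec nor any browser. The thread also does not say whether option 3 would additionally require an active capture, so the model assumes the opt-in alone is enough.

```javascript
// Toy model only: every name here is made up for illustration.
// state = { hasUserGesture, isCapturingUserMedia,
//           inVisibilityChangeHandler, optedIntoAutoPiP }

// May page script call requestPictureInPicture() successfully?
function isScriptRequestAllowed(option, state) {
  if (state.hasUserGesture) return true; // today's rule still applies
  switch (option) {
    case 1: // any time after gUM() has started
      return state.isCapturingUserMedia;
    case 2: // only inside a visibilitychange handler after gUM()
      return state.isCapturingUserMedia && state.inVisibilityChangeHandler;
    default: // option 3 grants script no extra rights
      return false;
  }
}

// Does the UA itself enter PiP when the page is hidden?
function shouldUAEnterPiPOnHide(option, state) {
  // Assumption: the declarative opt-in alone is sufficient.
  return option === 3 && state.optedIntoAutoPiP;
}
```

In this model, only option 1 lets script open a PiP window at an arbitrary moment, and only option 3 leaves the timing of entering PiP entirely to the UA.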