WebKit / standards-positions

WebKit's positions on emerging web specifications
https://webkit.org/standards-positions/

Document Picture in Picture API #41

Open liberato-at-chromium opened 1 year ago

liberato-at-chromium commented 1 year ago

Request for position on an emerging web specification

Information about the spec

Design reviews and vendor positions

Anything else we need to know

We'd like to provide the ability for a site to open an always-on-top picture in picture window that supports arbitrary web content, rather than video only. The API proposed in the explainer is approximate, but well within the spirit of what we intend.

litherum commented 1 year ago

The explainer doesn’t indicate the implementation strategy. How would one implement this on iOS?

liberato-at-chromium commented 1 year ago

Unfortunately, I don't know much about Safari on iOS, or the level of integration between the two.

On Android, for example, Chrome builds strictly on Android primitives, like any other app. Document PiP might be hard to implement there without additional OS-level support, since Android PiP doesn't allow input beyond some curated buttons. I have no idea if there will ever be support for it, or if anyone would want to use Document PiP on the form factors that Android typically supports.

However, we believe that there's value in the Document PiP API even if it's only currently supported on desktop.

I guess that's a complicated way to say "I don't know." :)

annevk commented 1 year ago

Is there an actual specification for this feature? It's somewhat hard to infer a processing model from the explainer.

gsnedders commented 1 year ago

It's also not clear to me what the use-cases are of this, and why window.open doesn't fulfil those? The explainer says what but not why.

liberato-at-chromium commented 1 year ago

Thanks for the questions.

Is there an actual specification for this feature?

There's a draft spec at https://steimelchrome.github.io/draft_spec.html which is still under construction.

It's also not clear to me what the use-cases are of this, and why window.open doesn't fulfil those?

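We considered window.open as an entry point, but there are a few reasons why we didn't go that way:

  • We wanted an async API. For example, in Chrome we'd like to be able to prompt the user for permission to open an always-on-top window in some cases, and the sync window.open doesn't really support that.
  • We're adding methods to PiP windows that aren't applicable to Windows in general, and it felt like we were overloading Window. Please see DocumentPictureInPictureSession in the above spec for examples. We expect this list to grow over time.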

othermaciej commented 1 year ago

It's also not clear to me what the use-cases are of this, and why window.open doesn't fulfil those?

We considered window.open as an entry point, but there are a few reasons why we didn't go that way:

  • We wanted an async API. For example, in Chrome we'd like to be able to prompt the user for permission to open an always-on-top window in some cases, and the sync window.open doesn't really support that.
  • We're adding methods to PiP windows that aren't applicable to Windows in general, and it felt like we were overloading Window. Please see DocumentPictureInPictureSession in the above spec for examples. We expect this list to grow over time.
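
For readers following along, the async entry point described in the first bullet can be sketched roughly as below. This assumes the shape in the later WICG spec, where requestWindow() resolves to the picture-in-picture window itself. The helper name openPipWith, the 320x180 size, and the small structural interfaces are illustrative assumptions, not part of the proposal.

```typescript
// Minimal structural types (assumptions) so the sketch does not depend on DOM typings.
interface PipWindowLike {
  document: { body: { append(...nodes: unknown[]): void } };
}
interface DocumentPipLike {
  requestWindow(options?: { width?: number; height?: number }): Promise<PipWindowLike>;
}

// Hypothetical helper: opens a PiP window and moves `content` into it.
// Resolves to null when the API is not exposed or the request is rejected,
// for example because the user agent declined to open the window.
async function openPipWith(
  api: DocumentPipLike | undefined,
  content: unknown,
): Promise<PipWindowLike | null> {
  if (!api) return null; // feature detection: API not available
  try {
    // Because the call is async, the user agent has room to prompt first.
    const pipWindow = await api.requestWindow({ width: 320, height: 180 });
    pipWindow.document.body.append(content);
    return pipWindow;
  } catch {
    return null;
  }
}
```

In a page this would presumably be called from a click handler with the global documentPictureInPicture object, since the spec requires user activation for requestWindow().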

This still doesn't explain what the use cases are - it seems to assume them without stating them. (And use cases should really be in the Explainer).

othermaciej commented 1 year ago

concerns: venue is because this is in a personal repo, not even a CG, so there is no clear IPR policy or governance model.

concerns: use cases is because, as above comments state, use cases are not identified and aren't obvious.

concerns: portability was added because this might not be feasibly implementable on iOS, where the system PIP feature is video-specific (and there isn't arbitrary overlapping window support).

steimelchrome commented 1 year ago

concerns venue: We recently got moved into a WICG repo here: https://github.com/WICG/document-picture-in-picture

concerns use cases: Use cases were added to the explainer. Thanks for recommending that.

concerns portability: I'm not sure how to address this. Since the use cases make less sense in a smaller form factor, we weren't too focused on Android/iOS feasibility. Though it could make sense in tablet-size cases. Are there particular API changes you'd like to see or things we could change that would make it more iOS-feasible?

tomayac commented 1 year ago

concerns portability: I'm not sure how to address this. Since the use cases make less sense in a smaller form factor, we weren't too focused on Android/iOS feasibility. Though it could make sense in tablet-size cases. Are there particular API changes you'd like to see or things we could change that would make it more iOS-feasible?

Here's a proposal for how to address this: https://github.com/WICG/document-picture-in-picture/pull/1.

hober commented 1 year ago

concerns portability: I'm not sure how to address this. Since the use cases make less sense in a smaller form factor, we weren't too focused on Android/iOS feasibility. Though it could make sense in tablet-size cases. Are there particular API changes you'd like to see or things we could change that would make it more iOS-feasible?

Here's a proposal for how to address this: WICG/document-picture-in-picture#1.

How does that change address this?

beaufortfrancois commented 1 year ago

FYI, we have published web developer documentation at https://developer.chrome.com/docs/web-platform/document-picture-in-picture/ which may come in handy when reviewing the API shape.

beaufortfrancois commented 1 year ago

Re concerns: device independence, mobile operating systems such as iOS, iPadOS, and Android could technically implement support for the Document Picture-in-Picture API.

FWIW, we saw web developers using the Picture-in-Picture API for <video> on Android with Media Session actions to control slides in the Picture-in-Picture window.

eric-carlson commented 1 year ago

concerns portability: I'm not sure how to address this. Since the use cases make less sense in a smaller form factor, we weren't too focused on Android/iOS feasibility. Though it could make sense in tablet-size cases. Are there particular API changes you'd like to see or things we could change that would make it more iOS-feasible?

Here's a proposal for how to address this: WICG/document-picture-in-picture#1.

This app uses an AVSampleBufferDisplayLayer to display the timer in a picture-in-picture window. This API is designed to display frames of video, so the PiP window has a play/pause button, a timeline slider, and ±10 seconds buttons on top of the app content.

I don't believe there is a way to disable the media-specific controls, which are clearly not appropriate for arbitrary web content.

liberato-at-chromium commented 1 year ago

In addition to adding support at the OS level if there's enough demand, we could also include an option for "non-interactive Document PiP" that's a lot closer to the OS support we have. This would make it significantly easier for use cases where video-only PiP is being driven by a canvas, since one could use arbitrary web content instead.

It could also be done in a backwards-compatible way with the proposed API, e.g., via a new option in the requestWindow() dictionary to allow non-interactive PiP.

eighty4 commented 1 year ago

Does this API support positioning the PiP window from the host document? Looks like it can be moved manually once opened. I was wondering if it could be positioned on open and also repositioned programmatically.

tomayac commented 1 year ago

Does this API support positioning the PiP window from the host document? Looks like it can be moved manually once opened. I was wondering if it could be positioned on open and also repositioned programmatically.

Not currently, but see https://github.com/WICG/document-picture-in-picture/issues/34 for some thoughts on this feature request.

chrishtr commented 1 year ago

Specification here: https://wicg.github.io/document-picture-in-picture/

An intent-to-ship in Chromium is in progress.

eighty4 commented 1 year ago

Where in Chromium is the code for this feature?

steimelchrome commented 8 months ago

Where in Chromium is the code for this feature?

Renderer side: https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/document_picture_in_picture/

Browser side: https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/ui/views/frame/picture_in_picture_browser_frame_view.h, https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/picture_in_picture/

steimelchrome commented 8 months ago

For the initial launch of this API, Chrome disabled the ability for websites to use the resizeTo() and resizeBy() Window APIs to avoid spammy abuse given the always-on-top nature of a document picture-in-picture window. We've received feedback from multiple websites that having access to these APIs would be useful (e.g. clicking to expand a playlist). We propose a change to allow access to those APIs, but gated behind a user gesture (consumed) to limit potential abuse.

See the PR here: https://github.com/WICG/document-picture-in-picture/pull/104
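
As an illustration of the playlist case mentioned above, a gesture-gated resize might look like the following sketch. The helper name expandOnClick and the two structural interfaces are assumptions made for the example. Only resizeBy() itself is the existing Window method under discussion.

```typescript
// Structural stand-ins (assumptions) for the PiP window and a button inside it.
interface ResizableWindowLike {
  resizeBy(deltaX: number, deltaY: number): void;
}
interface ClickTargetLike {
  addEventListener(type: "click", listener: () => void): void;
}

// Hypothetical helper: grows the PiP window by `extraHeight` CSS pixels each
// time `button` is clicked. The resize call sits directly inside the click
// handler because the proposal gates resizing on (and consumes) a user gesture.
function expandOnClick(
  button: ClickTargetLike,
  pipWindow: ResizableWindowLike,
  extraHeight: number,
): void {
  button.addEventListener("click", () => {
    pipWindow.resizeBy(0, extraHeight);
  });
}
```

Under the proposed gating, a resizeBy() call made outside a user gesture (for example from a timer) would not be expected to take effect.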

steimelchrome commented 5 months ago

As a small addition, we also propose explicitly allowing Window's focus() API to focus the opener window from the picture-in-picture window, so that websites can programmatically return to the opener tab. This consumes a user gesture from the picture-in-picture window.

See the PR here: https://github.com/WICG/document-picture-in-picture/pull/109
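
A minimal sketch of that pattern, assuming the button is created from the opener's script and placed in the picture-in-picture document. The helper name addReturnToOpenerButton, the button label, and the structural interfaces are illustrative assumptions.

```typescript
// Structural stand-ins (assumptions) for the opener window and the PiP document.
interface FocusableWindowLike {
  focus(): void;
}
interface PipButtonLike {
  textContent: string | null;
  addEventListener(type: "click", listener: () => void): void;
}
interface PipDocumentLike {
  createElement(tagName: "button"): PipButtonLike;
  body: { append(...nodes: PipButtonLike[]): void };
}

// Hypothetical helper: adds a button to the PiP document that focuses the
// opener window when clicked. The click in the PiP window supplies the user
// gesture that the proposal says focus() consumes.
function addReturnToOpenerButton(
  opener: FocusableWindowLike,
  pipDocument: PipDocumentLike,
): PipButtonLike {
  const button = pipDocument.createElement("button");
  button.textContent = "Back to tab";
  button.addEventListener("click", () => opener.focus());
  pipDocument.body.append(button);
  return button;
}
```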

beaufortfrancois commented 4 months ago

As requested by developers, we proposed adding a picture-in-picture value for the display-mode media feature to CSS Media Queries Level 5.

@media all and (display-mode: picture-in-picture) {
  body {
    margin: 0;
  }
  h1 {
    font-size: 0.8em;
  }
}

See the PR here: https://github.com/w3c/csswg-drafts/pull/9920
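
The same query could presumably also be checked from script through matchMedia(). The sketch below assumes the user agent recognizes the proposed picture-in-picture value. The helper name isInPictureInPicture and the MediaQueryHostLike interface are illustrative.

```typescript
// Structural stand-in (assumption) for a window exposing matchMedia().
interface MediaQueryHostLike {
  matchMedia(query: string): { matches: boolean };
}

// Returns true when `win` reports the picture-in-picture display mode.
// A user agent that does not recognize the value should report no match.
function isInPictureInPicture(win: MediaQueryHostLike): boolean {
  return win.matchMedia("(display-mode: picture-in-picture)").matches;
}
```

A page might call this with the picture-in-picture window to decide whether to render a compact layout.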

steimelchrome commented 4 months ago

Another addition we're proposing is a new boolean parameter disallowReturnToOpener, which defaults to false. When set to true, it hints to the user agent that showing a button in the document picture-in-picture UI that allows the user to return to the opener does not make sense for their use case, so the user agent can hide the button.

Initial request: https://github.com/WICG/document-picture-in-picture/issues/113

PRs: https://github.com/WICG/document-picture-in-picture/pull/114, https://github.com/WICG/document-picture-in-picture/pull/116
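
A rough sketch of how a site might pass this hint, assuming disallowReturnToOpener sits in the requestWindow() options dictionary alongside width and height. The helper name requestPipWithoutReturnButton and the interfaces are illustrative assumptions.

```typescript
// Assumed shape of the options dictionary, limited to the members used here.
interface PipRequestOptions {
  width?: number;
  height?: number;
  disallowReturnToOpener?: boolean;
}
interface PipRequesterLike {
  requestWindow(options?: PipRequestOptions): Promise<unknown>;
}

// Hypothetical helper: requests a PiP window and hints that a
// "return to opener" button does not make sense for this use case.
// The user agent remains free to decide what UI it actually shows.
function requestPipWithoutReturnButton(
  api: PipRequesterLike,
  width: number,
  height: number,
): Promise<unknown> {
  return api.requestWindow({ width, height, disallowReturnToOpener: true });
}
```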

liberato-at-chromium commented 2 months ago

We're considering a new boolean parameter disallowPositionReuse to give the site control over whether the UA may try to reuse the previous picture-in-picture window's position and size (false, the default) or should place the new window according to its default positioning and sizing heuristics (true). The idea is that retaining the window's position and size can be confusing if the contents of the new PiP window are semantically unrelated to the previous one (e.g., a new video or a new meeting).

Initial request: https://github.com/WICG/document-picture-in-picture/issues/120

PR: https://github.com/WICG/document-picture-in-picture/pull/119

steimelchrome commented 2 months ago

We're also proposing allowing user gestures in the document picture-in-picture window to be usable in the opener window and vice versa. This makes it more ergonomic to use user-activation-gated APIs, since often event handlers in the document picture-in-picture window are actually run in the opener's context, so the opener's context needs access to the user gesture. This essentially makes the document picture-in-picture window act the same as a same-origin iframe inside the opener as far as user gesture propagation is concerned.

PR: https://github.com/WICG/document-picture-in-picture/pull/117

beaufortfrancois commented 1 month ago

For info, Spotify folks are using the Document Picture-in-Picture API for their Miniplayer. You can learn more about their journey and use cases at https://developer.chrome.com/blog/spotify-picture-in-picture