WebKit / standards-positions

WebKit's positions on emerging web specifications
https://webkit.org/standards-positions/

Additional Windowing Controls #96

Open ivansandrk opened 1 year ago

ivansandrk commented 1 year ago

Request for position on an emerging web specification

Information about the spec

Design reviews and vendor positions

Introduction

This proposal introduces additional ways for web applications to introspect and control their windows, to enable critical window management functionality on the web platform.

The proposed enhancement would allow web applications to maximize, minimize, and restore their windows and to introspect the window's display state. Further, it allows applications to be notified when the window is repositioned, and to control whether the window can be resized. The window placement permission will be required for these capabilities.

Problem and Use Cases

Virtual Desktop Infrastructure (VDI) web clients have limited abilities to integrate remote application windows with the local desktop environment, which creates suboptimal experiences for their users. Currently, they can only present full disjoint remote desktop environments (e.g. in a local fullscreen window), or present individual remote applications in separate local windows with titlebar window controls that are inoperative, redundant, and confusing for users.

Several VDI clients, including Citrix and VMware, have reported that their users strongly desire an integrated experience as their work is increasingly happening both on local client and remote host devices.

Unfortunately, the web platform offers no means for web applications to signal the user agent when users interact with remote application (or custom) window controls. Further, web applications cannot introspect the local window’s display state, and must poll for local window position changes. These platform gaps create disconnects between local and remote windows, and prevent web VDI clients from offering functionality expected by users.
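To illustrate the polling that the paragraph above describes, here is a minimal sketch, not taken from the proposal. It assumes only the existing `screenX`/`screenY` window attributes; the `PositionPoller` class, its `tick()` method, and the 250 ms interval in the closing comment are hypothetical names and values chosen for this example.

```typescript
// Minimal shape of the window fields this sketch reads.
interface WindowPosition {
  screenX: number;
  screenY: number;
}

type MoveListener = (x: number, y: number) => void;

// Hypothetical helper: compares the current window position with the last
// observed one and reports a change. It has no timer of its own; the caller
// is expected to invoke tick() periodically.
class PositionPoller {
  private win: WindowPosition;
  private onMove: MoveListener;
  private lastX: number;
  private lastY: number;

  constructor(win: WindowPosition, onMove: MoveListener) {
    this.win = win;
    this.onMove = onMove;
    this.lastX = win.screenX;
    this.lastY = win.screenY;
  }

  // Returns true when a move was detected on this tick.
  tick(): boolean {
    const { screenX, screenY } = this.win;
    if (screenX === this.lastX && screenY === this.lastY) {
      return false;
    }
    this.lastX = screenX;
    this.lastY = screenY;
    this.onMove(screenX, screenY);
    return true;
  }
}

// In a page (assumed usage, with a caller-supplied syncRemoteWindow callback):
//   const poller = new PositionPoller(window, syncRemoteWindow);
//   setInterval(() => poller.tick(), 250);
```

The cost of this approach is the trade-off the proposal wants to remove: a short interval wastes work while the window is idle, and a long one lets the local and remote windows drift apart.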

These missing capabilities also prevent a broader set of web applications from offering compelling window management experiences for their users.

Proposal

This proposal seeks to enable local web applications to convey a user’s intended window control interactions with remote (or custom) window controls. Summary of the API proposals, which are generally gated by the Window Management (“window-placement”) permission:

Use cases explored here also rely upon a parallel effort to standardize the existing CSS property -webkit-app-region/app-region.

marcoscaceres commented 1 year ago

At first glance, I think we would likely be opposed to these.

Even with a permission, things like minimize, maximize, and restore are generally in control of the user, and handing control of those to applications would potentially lead to abuse (recalling pop-under windows via abuse of window.focus()). That is, we definitely wouldn't want those methods to be generally exposed to the web at large. Is there no way for the UA (knowing that it is in a VDI environment) to handle those?

Similarly, exposing things like displayState might be a privacy concern if it exposes that the web page is running in a VDI environment.

With respect to window sizing, that also seems to have accessibility concerns: it would be quite annoying and confusing if windows started resizing themselves (or were able to override user preferences).

About moving the window and general positioning information, you are right about polling... but wouldn't it be better to add event handlers on Screen: it already tells you the window offsets, so instead of polling, it would probably be ok to fire an event after the window has been moved by the user.

marcoscaceres commented 1 year ago

Had a good chat with @ivansandrk and @michaelwasserman about their proposal. They will send a follow-up response to my comments above, and we should come up with a position from there.

michaelwasserman commented 1 year ago

Hey Marcos, thanks for considering this proposal and for taking the time to meet us.

We’d like to add a big disclaimer that these APIs are for a specific class of applications running in particular display modes, not for the drive-by-web in a tab. We’ll make that more explicit in the Explainer, which just briefly mentions that limitation:

window.minimize()/maximize()/restore() are limited to:

- Windows with ‘standalone’ or ‘minimal-ui’ display-mode, i.e. installed application windows, and origins running in their own window, whether installed or not, e.g. Chrome’s “Create shortcut…” with “Open as window” checked
- Popup windows created by script, i.e. window.open()
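As a rough illustration of the display-mode limitation above, a page can already test its display mode with the existing `display-mode` media feature. The sketch below is an assumption-laden example, not part of the proposal: `MediaQueryHost` and `runsInAppWindow` are hypothetical names, and the check does not cover the second case (popup windows created by script).

```typescript
// Minimal shape of what this sketch needs from a window-like object.
interface MediaQueryHost {
  matchMedia(query: string): { matches: boolean };
}

// Display modes named in the limitation above.
const WINDOWED_DISPLAY_MODES = ["standalone", "minimal-ui"] as const;

// Hypothetical helper: true when the page runs in one of the app-window
// display modes. Script-opened popups are NOT detected by this check.
function runsInAppWindow(host: MediaQueryHost): boolean {
  return WINDOWED_DISPLAY_MODES.some(
    (mode) => host.matchMedia(`(display-mode: ${mode})`).matches,
  );
}

// In a page (assumed usage): if (runsInAppWindow(window)) { /* show custom controls */ }
```

A page-side check like this only decides whether to show custom window controls; the limitation itself would be enforced by the user agent.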

We’d also like to address your points above:

Even with a permission, things like minimize, maximize, and restore are generally in control of the user, and handing control of those to applications would potentially lead to abuse (recalling pop-under windows via abuse of window.focus()). That is, we definitely wouldn't want those methods to be generally exposed to the web at large. Is there no way for the UA (knowing that it is in a VDI environment) to handle those?

For certain, these should not be exposed to tabbed pages on the web at large. This functionality only pertains to a specific class of applications in particular window display modes that require user opt-in and a degree of trust. We envision matching or enforcing stricter requirements than what’s already supported for Window.moveTo() and Window.resizeTo().

There’s no good way for the UA to handle those controls alone. There are all kinds of heavily modified title-bars that can’t be easily represented. We considered specifying button regions using CSS properties, but selecting a remote-app menu option like Window>Minimize would be tricky, and pressing a hotkey like CTRL+M to minimize the remote window would be impossible.

We’re open to suggestions for enhancing user protections beyond permission and display-mode prerequisites.

Similarly, exposing things like displayState might be a privacy concern if it exposes that the web page is running in a VDI environment.

The web page itself is the VDI client, presenting a remote host device’s application window in its content area. Exposing the window displayState to that VDI client page doesn’t reveal anything novel about the VDI configuration. displayState is added to explicitly denote whether the local window is minimized, maximized, or fullscreen, or not, to help keep local and remote window states in sync.

With respect to window sizing, that also seems to have accessibility concerns: it would be quite annoying and confusing if windows started resizing themselves (or were able to override user preferences).

Sites can already move and resize popup and standalone windows; scripts can even polyfill a non-resizable window by calling resizeTo() with clamped dimensions in response to window resize events. UAs already clamp those requests for min/max sizes and on-screen bounds. Additional accessibility enhancements for this space would be great!
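The scripted workaround mentioned above could look roughly like the following sketch. It relies only on the existing `resize` event, the `outerWidth`/`outerHeight` attributes, and `resizeTo()`; the `ResizableWindow` interface, the `Size` type, and the `constrainWindowSize` helper are hypothetical names introduced for this example.

```typescript
// Minimal shape of the window members this sketch touches.
interface ResizableWindow {
  outerWidth: number;
  outerHeight: number;
  resizeTo(width: number, height: number): void;
  addEventListener(type: "resize", listener: () => void): void;
}

interface Size {
  width: number;
  height: number;
}

function clamp(value: number, low: number, high: number): number {
  return Math.min(Math.max(value, low), high);
}

// Hypothetical helper: after every resize, snaps the window back into the
// [min, max] range. Passing min equal to max emulates a non-resizable window.
// Returns the handler so callers can invoke it directly.
function constrainWindowSize(win: ResizableWindow, min: Size, max: Size): () => void {
  const enforce = (): void => {
    const width = clamp(win.outerWidth, min.width, max.width);
    const height = clamp(win.outerHeight, min.height, max.height);
    if (width !== win.outerWidth || height !== win.outerHeight) {
      win.resizeTo(width, height);
    }
  };
  win.addEventListener("resize", enforce);
  enforce(); // apply the constraint to the current size as well
  return enforce;
}
```

Because the correction runs only after the resize has happened, the user may see the window change size and then snap back.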

The proposed resizable flag doesn't add much new here, other than avoiding the scripted resizeTo hack above, but accessibility review will be important here.

It might be possible to replace window.setResizable() with a flag specified during window creation (e.g. via window.open()’s windowFeature string or via manifest property), as mentioned in our chat. We’ll investigate that possibility and document our findings!

About moving the window and general positioning information, you are right about polling... but wouldn't it be better to add event handlers on Screen: it already tells you the window offsets, so instead of polling, it would probably be ok to fire an event after the window has been moved by the user.

The window object exposes its bounds via attributes and a resize event handler. Adding a window move (or bounds change) event handler on that same object seems most appropriate. The screen object exposes its bounds separately. Another API introduced a screen change event handler to obviate the need for polling screen metrics changes. This proposal keeps semantic alignment between those distinct window and screen concepts.

Our planned followups, from this thread and our chat:

We look forward to your next thoughts and will circle back as the Explainer evolves.

marcoscaceres commented 1 year ago

Given that this is designed for a specific class of “Virtual Desktop Infrastructure” application that relies on web technologies - and not the “drive-by web” - I believe WebKit should take a neutral position here. Thanks again for clarifying that above. The Explainer should definitely make that clearer in the Introduction, and thanks for agreeing to update that!

The concern remains that it could set a precedent for controlling windowing aspects, so we again ask that careful thought is given to the design and how these capabilities are exposed (that’s not to say that care hasn’t been taken - just want to reinforce the point is all). I'm adding the integration and annoyance concerns, but based on the discussions I know you are aware and that there might not be ways around them.

Unless anyone from the WebKit project objects, I'll set this to "neutral" in a week or so.

Personally looking forward to seeing updates and where you end up with the proposal. It will be interesting to see if there are feature enhancements you can bring back to the drive-by web from this.

sonkkeli commented 11 months ago

Hey, I'm jumping in here as I'm continuing on Ivan's work.

The proposed APIs have been updated, and the current proposals at least address the concerns by gating the feature behind a permission and by using CSS media queries for display-state and resizable. The (up-to-date) proposals for the APIs would be the following:

Additionally, the JS APIs are proposed to be gated behind the existing window-management permission.

The explainer has also been updated with the new API proposals.
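For illustration, reading the window state through the proposed display-state media query might look like the sketch below. Everything specific here is an assumption rather than confirmed API: the `(display-state: …)` query syntax, the value list (in particular `normal`), and the `MediaQuerySource` and `readDisplayState` names are hypothetical. Only the existing `matchMedia()` call is real.

```typescript
// Minimal shape of what this sketch needs from a window-like object.
interface MediaQuerySource {
  matchMedia(query: string): { matches: boolean };
}

// Assumed value set; "normal" in particular is a guess for illustration.
const DISPLAY_STATES = ["normal", "minimized", "maximized", "fullscreen"] as const;
type DisplayState = (typeof DISPLAY_STATES)[number];

// Hypothetical helper: returns the first matching state, or undefined when
// none matches (for example, where the media feature is not supported).
function readDisplayState(source: MediaQuerySource): DisplayState | undefined {
  return DISPLAY_STATES.find(
    (state) => source.matchMedia(`(display-state: ${state})`).matches,
  );
}
```

Returning `undefined` instead of a default keeps "not supported" distinct from an ordinary restored window, so a VDI client could fall back to its own bookkeeping.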