Add a portal to see currently open windows

johan-bjareholt commented 5 years ago

I am a maintainer of the ActivityWatch project and we are polling the currently focused window's name and title. This is not an issue under Xorg, but on Wayland it is a problem because there is no common API between compositors. I have discussed this briefly with both a wlroots and a Gnome developer, and they both seem to agree that exposing such data would be best solved by adding an xdg-desktop-portal API for this.

Wlroots and KWin already have APIs for this (gnome-shell too, but it's disabled by default), but they are all different, so an xdg-desktop-portal API would significantly simplify things.

Suggestions on properties for windows, methods and signals that would be good to have:

Window properties:

app_id (string)
title (string)
focused (bool)

Signals:

FocusedApplicationChanged
RunningApplicationsChanged

Methods:

GetRunningApplications (array of windows)
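
As a rough illustration of the shape proposed above, here is a minimal Python sketch. It models the three window properties, the GetRunningApplications method and the two signals with an in-memory registry instead of a real D-Bus portal; the names `Window`, `WindowRegistry`, `update` and the `connect_*` helpers are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Window:
    """One toplevel window, with the three proposed properties."""
    app_id: str
    title: str
    focused: bool


def focused_window(windows: List[Window]) -> Optional[Window]:
    """Return the first focused window, or None if nothing has focus."""
    return next((w for w in windows if w.focused), None)


class WindowRegistry:
    """Hypothetical stand-in for the portal; not a real D-Bus interface."""

    def __init__(self) -> None:
        self._windows: List[Window] = []
        self._on_focus: List[Callable[[Optional[Window]], None]] = []
        self._on_list: List[Callable[[List[Window]], None]] = []

    def connect_focused_application_changed(self, handler) -> None:
        self._on_focus.append(handler)

    def connect_running_applications_changed(self, handler) -> None:
        self._on_list.append(handler)

    def get_running_applications(self) -> List[Window]:
        """Proposed method: return a snapshot of all known windows."""
        return list(self._windows)

    def update(self, windows: List[Window]) -> None:
        """Called by the (imaginary) compositor side with the new state."""
        old = self._windows
        self._windows = list(windows)
        old_ids = [(w.app_id, w.title) for w in old]
        new_ids = [(w.app_id, w.title) for w in self._windows]
        # RunningApplicationsChanged: the set of windows or a title changed.
        if old_ids != new_ids:
            for handler in self._on_list:
                handler(self.get_running_applications())
        # FocusedApplicationChanged: a different window now has focus.
        if focused_window(old) != focused_window(self._windows):
            for handler in self._on_focus:
                handler(focused_window(self._windows))
```

A time tracker would connect to the focus signal and log each change, falling back to `get_running_applications()` once at startup to learn the initial state.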

Links to prior discussions with Gnome and wlroots developers:

ssokolow commented 2 years ago

I think the argument is that sandboxed or not, portals are the most appropriate place to handle things where the user should be prompted for permission and, if there's no portal, then it's going to be either proprietary to each desktop or a universally available hole in Wayland's envisioned security model.

hammerandtongs commented 2 years ago

What would a pull request look like for this issue?

Which existing portal looks most like what "most" people want?

https://github.com/flatpak/xdg-desktop-portal/tree/main/src

My sense is that the people who are arguing at length in this thread are the only ones who are motivated to do anything about this, but many seem to expect busy people who don't care about this issue to make the pull request...

johan-bjareholt commented 2 years ago

> but many seem to expect busy people that don't care about this issue to make the pull request...

No, rather I'm not interested in wasting my time if I know that the PR will be rejected...

rbreaves commented 2 years ago

> Which existing portal looks most like what "most" people want?
>
> https://github.com/flatpak/xdg-desktop-portal/tree/main/src

Good question, and I am not sure. If I am honest, I know very little about Wayland and how it is structured in contrast to X11. But I do agree with @johan-bjareholt: it is concerning, to say the least, that the main developers and contributors behind Wayland and xdg-desktop-portal claim that this level of interactivity is niche (@TingPing) or would leak too much info out of sandboxes (@matthiasclasen) without much, if any, thought about accessibility concerns for the handicapped and about use cases where dynamic key remaps when switching between apps may be critical to someone's workflow.

I am not a handicapped person, but I have the same need to simplify shortcuts and workflows, which might actually be quite similar to the needs of a handicapped person who uses various methods of keyboard input. Niche or not, what is being proposed here is very important and would be a serious quality-of-life improvement for far more people than they are making it out to be. Its absence will absolutely hurt the adoption rate among handicapped users and people who require uncommon keyboard input devices that may need additional awareness of what the actively focused application happens to be.

As long as this portal is lacking, I do not see any distro that runs on Wayland being a friendly or usable work environment for anyone who has to use additional, application-aware software to help them with key input. Many of those users will have to either stick with X11 for the foreseeable future or switch to Windows or macOS.

Going back to read any and all responses to my prior handicapped comment I do see that Matt responded with this previously.

> My opinion is that accessibility tooling is system functionality - it should live on the host side, not on the flatpak side.

So I am guessing he is saying we're barking up the wrong tree and need to get it committed up directly into Wayland first in some manner?

https://gitlab.freedesktop.org/wayland/wayland

I mean, if we can be pointed in the right direction, maybe one of us can start looking at adding whatever protocols need to be added. If I recall correctly, someone previously told me that some sort of spec or proposal needs to be laid out and written for consideration, but for the life of me I cannot recall who suggested that to me or where, whether on my kinto.sh github project or somewhere on reddit where I have discussed this issue before as well.

Created an issue ticket here and will see what the response will be like. I think it is better to word this as an Accessibility feature than anything else and Active Focus is equally as good as saying currently open window or windows. https://gitlab.freedesktop.org/wayland/wayland/-/issues/326

johan-bjareholt commented 2 years ago

> My opinion is that accessibility tooling is system functionality - it should live on the host side, not on the flatpak side.

@matthiasclasen Not all use cases for this feature are related to accessibility, so I don't think it's as clear-cut. In macOS, though, it's considered an accessibility API. Regardless, as specified in the top comment, I created an issue in the gnome-shell gitlab where they pointed me here. I will ask them again, since it was a few years ago after all.

ssokolow commented 2 years ago

> I will ask them again, since it was a few years ago after all.

Bear in mind that I have... strong feelings about GNOME's UI design decisions, so I'd be more likely to maintain a hacky, non-upstreamable patchset for KDE Plasma (once its Wayland support stabilizes enough for me to get the multi-month-long login sessions that I currently rely on kwin --replace and Xorg's stability for), despite my extreme distrust of my ability to write correct C++, than to use a solution that's specific to GNOME Shell, and software I develop/maintain will follow suit in any desktops it supports intentionally rather than by accident.

(As they slowly drift GTK away from being "the universal toolkit" that I'd use to develop "native-feeling apps for my KDE that don't impose KDE dependencies on everyone else" in the GTK+ 2.x era, my attitude toward supporting GNOME grows more in line with the "Weird architectures weren't supported to begin with" perspective on people complaining about how pyca/cryptography's shift to depending on Rust broke niche platforms that used to work.)

jadahl commented 2 years ago

In my opinion, the kind of API asked for here (practically spy on user activity in the background) should in one way or the other be supported via portals, but I don't think doing it in the traditional portal sense makes sense, given the severity and transparency of how it'd work. A typical "do you want to fix the app (yes / no)" is not enough here.

Perhaps a better alternative is to make the portal lack the ability to query the user to allow the feature, but instead require these kinds of things to be explicitly configured in e.g. Settings. The flow would be similar to how some Android permissions work these days:

Some portal provides a way to open e.g. an "accessibility settings" or "privacy" view. Let's call it the "system settings" portal. The portal backend would launch e.g. some view in gnome-control-center (in the GNOME case)
Another portal "user activity" that exposes e.g. the currently focused application, but without any way to ask to be enabled
Add a way to let applications advertise certain "accessibility" or "privacy sensitive" features e.g. in the .desktop file
Have said Settings application view list these applications in the above mentioned view, where one can explicitly toggle things like "Allow tracking user activity (Yes / No)"
On the first run, the application explains to the user that they will be presented with an "accessibility" / ... settings view, and that they need to find the app and turn on the tracking ability
The application asks the "system settings" portal to show the accessibility view
If the user decided to go through the trouble of finding the relevant thing to toggle, the "user activity" portal suddenly starts working, otherwise it'll remain non-functioning
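
The flow above could be sketched roughly like this in Python. Everything here is hypothetical: `PrivacySettings` stands in for the Settings view, `UserActivityPortal` for the proposed "user activity" portal, and raising `PermissionError` is just one way to model "remains non-functioning". The point of the sketch is that the portal has no method for prompting the user; only the Settings side can flip the toggle.

```python
from typing import Callable, List, Optional, Set


class PrivacySettings:
    """Stand-in for the Settings view that lists apps and their toggles."""

    def __init__(self) -> None:
        self._advertised: Set[str] = set()
        self._allowed: Set[str] = set()

    def advertise(self, app_id: str) -> None:
        # Models an app declaring a privacy-sensitive feature,
        # e.g. through its .desktop file.
        self._advertised.add(app_id)

    def listed_apps(self) -> List[str]:
        return sorted(self._advertised)

    def set_tracking_allowed(self, app_id: str, allowed: bool) -> None:
        # Only the user, acting through Settings, is meant to call this.
        if app_id not in self._advertised:
            raise KeyError(f"{app_id} did not advertise this feature")
        if allowed:
            self._allowed.add(app_id)
        else:
            self._allowed.discard(app_id)

    def is_tracking_allowed(self, app_id: str) -> bool:
        return app_id in self._allowed


class UserActivityPortal:
    """Exposes the focused app, with no way to ask to be enabled."""

    def __init__(self, settings: PrivacySettings,
                 get_focus: Callable[[], Optional[str]]) -> None:
        self._settings = settings
        self._get_focus = get_focus

    def get_focused_application(self, caller_app_id: str) -> Optional[str]:
        if not self._settings.is_tracking_allowed(caller_app_id):
            raise PermissionError("tracking is not enabled in Settings")
        return self._get_focus()
```

Until the user enables the toggle for the calling app, every call fails; afterwards the same call returns the focused application without any dialog.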

Silve2611 commented 2 years ago

> In my opinion, the kind of API asked for here (practically spy on user activity in the background) should in one way or the other be supported via portals, but I don't think doing it in the traditional portal sense makes sense, given the severity and transparency of how it'd work. A typical "do you want to fix the app (yes / no)" is not enough here.

> Perhaps a better alternative is to make the portal lack the ability to query the user to allow the feature, but instead require these kinds of things to be explicitly configured in e.g. Settings. The flow would be similar to how some Android permissions work these days:

> Some portal provides a way to open e.g. an "accessibility settings" or "privacy" view. Let's call it the "system settings" portal. The portal backend would launch e.g. some view in gnome-control-center (in the GNOME case)

> Another portal "user activity" that exposes e.g. the currently focused application, but without any way to ask to be enabled

> Add a way to let applications advertise certain "accessibility" or "privacy sensitive" features e.g. in the .desktop file

> Have said Settings application view list these applications in the above mentioned view, where one can explicitly toggle things like "Allow tracking user activity (Yes / No)"

> On the first run, the application explains to the user that they will be presented with an "accessibility" / ... settings view, and that they need to find the app and turn on the tracking ability

> The application asks the "system settings" portal to show the accessibility view

> If the user decided to go through the trouble of finding the relevant thing to toggle, the "user activity" portal suddenly starts working, otherwise it'll remain non-functioning

I agree with your approach. Although people dislike Apple, this is what macOS does and I think this solution should be looked at. Windows, on the other hand, has no restriction at all, which makes it easy for a developer but complicated in terms of data security.

In general, Linux should not prohibit apps from appearing because they dislike them. They are used for a reason. In the EU it is now mandatory to track working times and the law has already passed in many countries and will be standard in a few years. Linux will only be able to offer a time tracking solution if people use X11, which is bollocks…

I get the sandboxed idea, but we are protecting Linux with cushions. I personally feel as if I am at day care again and someone is telling me what I need to code and what I am not allowed to.

That being said, they should keep in mind that there are people with disabilities who rely on screen readers etc. Ignoring their needs now will mean the end of Linux in any school, university, or public organization for good.

Placing security over accessibility is highly risky and morally questionable.

PAStheLoD commented 2 years ago

> Placing security over accessibility is highly risky and morally questionable.

+1

Not to mention that security as a practice, theory, process or goal to be attained is based on ever finer-grained separation of wanted and unwanted activity. (And that's why naive approaches just pop up the scary box of "this app wants to tinker with the system, do you allow it? if yes, type your password" - which just trains users to type their password into every and all input boxes until whatever they desired happens or they give up.)

Allowing the privilege of getting the app_id & title of the currently active window from the compositor is fine-grained. The user (or, in the case of the majority of users, a distribution that offers this as a feature) can control it better, because they can understand it better. (Do I want the "time tracker" to know which window is active? Yes, sounds reasonable. Do I want the time tracker to access all files? No, no thank you. ... contrast this with the old paradigm: Do I want the time tracker to "do whatever root can do"? Huh, the root of what? But sure, if it gets the thing working!)

Without these controls users are faced with a much bigger decision of whether to use something wayland based or not. (Which keeps X more and more alive, which fragments development effort.)

offtopic :/

> In the EU it is now mandatory to track working times and the law has already passed in many countries and will be standard in a few years. [...] Linux will only be able to offer a time tracking solution if people use X11, which is bollocks…

Determining time worked is not the same as spying on employees' desktop activity. (For implementation details of the directive see https://ec.europa.eu/social/main.jsp?catId=706&intPageId=5115&langId=en , for example the 14 May 2019 ruling that deals with measurement.)

There are endless ways to implement systems, more and less privacy intrusive, that can offer "time tracking". From a simple webpage with a start-stop button, to arbitrarily complex systems that try to automatically label slices of time as work or non-work (let's say based on network activity, or by looking at keyboard and pointer input patterns, blablabla).

ssokolow commented 2 years ago

...and, as someone with both ADHD and an autism spectrum disorder, and the executive dysfunction that results from that, having the computer keep track of how much time I'm spending on various things with sufficient granularity to be able to distinguish different websites within the same browser based on window titles (eg. YouTube vs. Google Docs) is critical so I can maintain perspective.

It's very much an "assistive technology" for me and one of the big reasons I refuse to use Wayland until this is resolved. It may be a cognitive disability rather than a physical disability, but it's still a disability that I need my computer to adapt to.

I'm just glad that Firefox's MPRIS integration should be sufficient for detecting when YouTube videos are playing in a non-focused window/tab, and logging that effect on my focus.

Silve2611 commented 2 years ago

I can agree with your point about asking for rights. This is what Apple does very well. It asks you specifically for the right to read something. It even goes further and asks for specific rights for an application if you read more than the title. This should be the go-to approach.

As for the topic of time tracking: such apps exist and people choose to use them. In the EU it is not even allowed to monitor the data of your employees, so just use one that takes this into account.

Linux should not decide what kind of apps are allowed to exist.

Also, this kind of decision is discrimination, which should not be something an OS does…

FrnchFrgg commented 1 year ago

First of all this is not Linux deciding things. This is policy from the gnome and XDG projects.

What I find annoying is that the current state of things is that no sane solution is really possible currently because gnome decided to first restrict access to their API to a single caller (xdg-desktop-portal-gnome) «for security reasons» without thinking through any possible replacement or permissions system.

Even root can no longer access that information from a non-sandboxed application, which is good from a privacy point of view but problematic in some other cases. A decision to admittedly increase security for sandboxes is impacting the complete host ecosystem due to missing and unplanned capabilities.

My use case is an equivalent of the Android app «Family Link» which can tell me how long my ten-year-old son used which application. Using ps equivalents is not going to cut it because my son often leaves applications open in the background, especially Ardour and Blender. And in Firefox I want to distinguish between accessing his high school digital workspace to validate finished homework, and playing on Lichess.

I could use a more drastic approach disabling every and all non-approved content (with apparmor and stateful firewall rules even), but currently the contract we agreed on is «trust-based usage, but timed and supervised by adults». My home-assistant tells us when his PC is turned on and we routinely physically look at his screen throughout but he is not our only child and we cannot stay 24/7 in his room.

It is important to have the possibility to detect shorter «non-approved» uses and then confront him with the trust issues that this represents (that is doing our work as parents). I could do so before, and now I cannot after applying an upgrade which I thought would be benign.

Note that my use-case could be seen as not relevant because parental control should work without explicit consent asked from the user. But I think that this on the contrary is part of the contract «for now you accept that we can see that information» and if our son reverts the authorization I can detect it and send a notice anyway.

johan-bjareholt commented 1 year ago

> What I find annoying is that the current state of things is that no sane solution is really possible currently because gnome decided to first restrict access to their API to a single caller (xdg-desktop-portal-gnome) «for security reasons» without thinking through any possible replacement or permissions system.

~~This is not true, I was messing around with it earlier today and there are two ways to access the Introspection API. The first thing you mention is exactly what you say, that you need to be the xdg-desktop-portal-gnome application to be allowed to talk with the interface. The second however if you read the source code, is if the org.gnome.shell.introspect option is set to true (and it's set to false by default). Just go into dconf-editor and set it to true and it should work.~~

https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/386d25e6f8ce11549526ea3776eb34138fcb3774/js/misc/introspect.js#L137

rbreaves commented 1 year ago

Silve2611

> I agree with your approach. Although people dislike Apple, this is what macOS does and I think this solution should be looked at. Windows, on the other hand, has no restriction at all, which makes it easy for a developer but complicated in terms of data security.

Hogwash. Companies that use Windows have perfect security. /s :p

FrnchFrgg

> A decision to admittedly increase security for sandboxes is impacting the complete host ecosystem due to missing and unplanned capabilities.

And that is the crux of the issue: features were rushed out into production well before they were ready for general consumption. These sorts of use cases are not simply edge cases, even when they get mischaracterized as such, imo, or else people's definition of edge case needs to shrink quite a bit more.

FrnchFrgg commented 1 year ago

> This is not true, [...] The second however if you read the source code, is if the org.gnome.shell.introspect option is set to true [...] https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/386d25e6f8ce11549526ea3776eb34138fcb3774/js/misc/introspect.js#L137

Your link is to an older version of gnome-shell. Nowadays the authorization is deferred to DBusSenderChecker.checkInvocation() which you can find in utils.js. It accepts org.freedesktop.impl.portal.desktop.gtk and org.freedesktop.impl.portal.desktop.gnome only (these are arguments passed to the DBusSenderChecker constructor), OR lets the call pass through if global.context.unsafe_mode is true. But that variable, contrary to a dconf setting, is transient only and is not settable from outside of gnome-shell debugger.

(EDIT: According to git blame, the change from a dconf setting to a javascript global variable was made even before the factoring of DBusSenderChecker out of the introspection code, which was already a full year ago).
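
The check described above can be sketched roughly as follows. This is a simplified Python illustration, not gnome-shell's actual JavaScript: the class `SenderChecker`, the `name_owners` mapping and the `:1.x` unique names are assumptions made for the example. Only the two allowed bus names and the unsafe-mode bypass come from the description above.

```python
from typing import Dict, Iterable


class SenderChecker:
    """Hypothetical allowlist check modelled on the description above."""

    def __init__(self, allowed_names: Iterable[str]) -> None:
        self._allowed_names = tuple(allowed_names)

    def check_invocation(self, sender: str, name_owners: Dict[str, str],
                         unsafe_mode: bool = False) -> None:
        """Raise PermissionError unless the sender is allowed.

        name_owners maps a well-known bus name to the unique name of the
        process currently owning it (an assumption of this sketch).
        """
        # Transient escape hatch: let every caller through.
        if unsafe_mode:
            return
        # Otherwise the caller must own one of the allowed names.
        for name in self._allowed_names:
            if name_owners.get(name) == sender:
                return
        raise PermissionError(f"{sender} is not allowed to call this interface")


checker = SenderChecker([
    "org.freedesktop.impl.portal.desktop.gtk",
    "org.freedesktop.impl.portal.desktop.gnome",
])
```

With this shape, a time tracker running on the host is rejected exactly like a sandboxed one, because the check looks only at who owns the allowed names and not at whether the caller is sandboxed.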

johan-bjareholt commented 1 year ago

@FrnchFrgg Oh my bad, sorry!

FrnchFrgg commented 1 year ago

To be very clear: I do not think that crippling the host ecosystem capabilities is flatpak's responsibility. And it should not be this project's responsibility to "fix" it for gnome, which sends people here to ask for a portal even for non-sandboxed uses that they broke and are supposed to be outside of your scope.

Devising and implementing such an API is very much wanted but, as we can see in this bug, is hard to get right. I think that a lot of the pressure and discontent here comes from the fact that this broke even outside of flatpak sandboxes. Such pressure will not help your contributors to take time and write a system that is secure both from a computer science point of view and from a social engineering point of view.

And it means having people wait on your ability to devote time to a worthy feature request, but not necessarily of utmost priority, so that they get a fix for their now broken use case which may be at best tangentially related to flatpak and sandboxed apps.

Silve2611 commented 1 year ago

> To be very clear: I do not think that crippling the host ecosystem capabilities is flatpak's responsibility. And it should not be this project's responsibility to "fix" it for gnome, which sends people here to ask for a portal even for non-sandboxed uses that they broke and are supposed to be outside of your scope.

> Devising and implementing such an API is very much wanted but, as we can see in this bug, is hard to get right. I think that a lot of the pressure and discontent here comes from the fact that this broke even outside of flatpak sandboxes. Such pressure will not help your contributors to take time and write a system that is secure both from a computer science point of view and from a social engineering point of view.

> And it means having people wait on your ability to devote time to a worthy feature request, but not necessarily of utmost priority, so that they get a fix for their now broken use case which may be at best tangentially related to flatpak and sandboxed apps.

I do not agree that it is hard to do. It just requires the correct attitude. macOS has already done it for several years, so guidelines exist.

At the moment the security approach makes no sense. It prohibits everything potentially dangerous, like a parent not allowing their children to run because they could fall.

jefferyto commented 1 year ago

From An X11 Apologist Tries Wayland by @faithanalog:

> I remember talking at length with the developer of Talon Voice, a voice control/eyetracking tool that works quite well on linux, about the challenges of supporting wayland. The other big thing, aside from how to do input emulation, was whether it was possible to query the list of windows and active focus for context-specific voice commands. I have definitely seen software that does this at this point, since most app panels are their own pieces of software independent of the compositor. And, as you’d expect from them, they can focus windows too. So given that these two problems are solved, it’s now on my list to try and help Talon get Wayland support when I have the energy.

Silve2611 commented 1 year ago

> From An X11 Apologist Tries Wayland by @faithanalog:

> I remember talking at length with the developer of Talon Voice, a voice control/eyetracking tool that works quite well on linux, about the challenges of supporting wayland. The other big thing, aside from how to do input emulation, was whether it was possible to query the list of windows and active focus for context-specific voice commands. I have definitely seen software that does this at this point, since most app panels are their own pieces of software independent of the compositor. And, as you’d expect from them, they can focus windows too. So given that these two problems are solved, it’s now on my list to try and help Talon get Wayland support when I have the energy.

Well, how is it done? We have all been talking about the fact that we want to get information about certain windows but cannot do it. Does he have a specific setting? Has something changed? How can the window title be received?

hyuri commented 1 year ago

AutoKey is another example of a very powerful app that requires knowing which app/window is in focus when a global shortcut is triggered.

faithanalog commented 1 year ago

> > From An X11 Apologist Tries Wayland by @faithanalog:

> > I remember talking at length with the developer of Talon Voice, a voice control/eyetracking tool that works quite well on linux, about the challenges of supporting wayland. The other big thing, aside from how to do input emulation, was whether it was possible to query the list of windows and active focus for context-specific voice commands. I have definitely seen software that does this at this point, since most app panels are their own pieces of software independent of the compositor. And, as you’d expect from them, they can focus windows too. So given that these two problems are solved, it’s now on my list to try and help Talon get Wayland support when I have the energy.

> Well, how is it done? We have all been talking about the fact that we want to get information about certain windows but cannot do it. Does he have a specific setting? Has something changed? How can the window title be received?

I looked into it in more depth after writing that blog post and what I found is the software I had in mind (rofi) is using a wlroots extension called "wlr foreign toplevel management" for both getting the list of windows and focusing windows.

https://wayland.app/protocols/wlr-foreign-toplevel-management-unstable-v1

https://github.com/lbonn/rofi/blob/wayland/source/modes/wayland-window.c

I tried to figure out if there are gnome/KDE equivalents but wasn't able to find anything (this would have been half a year ago now, I think). That said I'm not an expert in wayland protocols, so perhaps someone else knows of something for gnome or KDE.

johan-bjareholt commented 1 year ago

@faithanalog The original post I wrote 4 years ago links to a reddit thread about just wlr-foreign-toplevel-management.

Unfortunately, no one other than sway/wlroots wants to use that protocol, because it allows any Wayland application to read app names and titles, which is a privacy risk.

Mikenux commented 1 year ago

Hello!

About leaking specific private data, it may be enough, when the app requests access to window tracking, to warn:

That some window titles may contain information that may be considered private;
If the app has means to share this information (e.g. inter-process/app communication, network access).

Another way is to have this access if the application cannot communicate with other applications and cannot access the network unless access is granted. For this, maybe it is possible to save the data in specific files known by flatpak (specific filenames, file format?) and for which only these are shareable (with the same app for synchronization or parental control, over the network)?

yuhldr commented 1 year ago

When an app requests sensitive permissions, a warning is enough.

I want convenience and freedom, but if it's really for privacy, why should I use Linux?

On macOS, developers can also call up accessibility features.

Mikenux commented 1 year ago

@jadahl

Is your proposal that the app wanting to track windows tells the user that it has privacy sensitive features and then open the privacy view in gnome-control-center (automatically or manually with a button when the app tells the user about its features?), and finally that the user should click on the row of the app and enable "Track User Activity"?

If so, that's a bit too much, as users will likely find the app and enable the setting for it. It might just annoy them, especially since they just want to use the app.

Besides, what's the point of doing that? To prevent users from automatically clicking "Allow" on a dialog with "Allow" and "Deny" responses?

Mikenux commented 1 year ago

I got the answer by re-reading one of your comments on the accessibility portal issue.

So, I think a new design is needed to avoid the user automatically clicking an Allow-type response, but without annoying them with multiple steps.

Going back to the privacy aspect, as I said before, it is important to warn about the potential sharing of private data via internet access or communication with other processes (i.e. tell the user if there is this potential sharing and also when the app does not have this sharing capability). There are two solutions for this:

Tell the user and show them again the UI that asks to allow tracking if the app goes from having no means of sharing to having them (i.e. managing the permission change).
Require these apps to have no means of sharing and use a private data sharing portal (i.e. private data sharing is explicit and the app cannot have any internet access nor any inter-process communication). Such a portal, which does not exist, will require storing data in known files (to indicate to the user what data will be shared). Such a portal is perhaps more interesting in the long term (if it may exist).

I think this is very important because:

knowing if the app can potentially share private data is more important than knowing if it collects such data.
users will want to use the app they have installed, so will probably accept their activity being tracked anyway as the app will not be (fully) functional without this acceptance.
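
The first solution ("managing the permission change") can be sketched as a small rule in Python. This is only an illustration under assumptions: `AppPermissions`, its `network` and `ipc` fields, and `must_ask_again` are hypothetical names, and real sandbox permissions are more varied than two booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppPermissions:
    """Simplified view of an app's means of sharing data."""
    network: bool = False
    ipc: bool = False  # inter-process/app communication

    @property
    def can_share(self) -> bool:
        return self.network or self.ipc


def must_ask_again(previous: AppPermissions, current: AppPermissions,
                   tracking_granted: bool) -> bool:
    """Re-show the tracking UI when an app that had no means of sharing
    gains one after the user already allowed tracking."""
    return tracking_granted and not previous.can_share and current.can_share
```

An update that adds network access to a previously isolated tracker would trigger a new prompt, while an app that could already share data when the user agreed would not.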

ssokolow commented 1 year ago

Any security expert will tell you that you're setting up a "false sense of security" situation.

Exfiltrating data can be accomplished in all sorts of non-obvious ways. For example:

Drop it onto the clipboard briefly and then restore the old contents, synchronized to a time interval some other process will be listening on. (Tools like JDownloader's clipboard monitor and various Win32 tools for hooking text-drawing calls in Japanese visual novels and feeding them into translation tools on-the-fly are examples of legitimate uses for un-prompted clipboard interaction.)
Save it in a known location on-disk that some other process will monitor for updates
If using X11 or an XWayland setup where multiple applications share the same XWayland instance, store it in X11 window properties another process can watch and read. (The threat posed by combining an arbitrary code execution vulnerability with access to X11 is why socket=fallback-x11 exists for Wayland capable applications.)
Write it into the metadata (i.e. application name, stream description, etc.) of a PulseAudio connection for something else to read out using the API for writing audio mixer GUIs. (One of the classic ways to persist client-side state before modern HTML5 APIs was taking advantage of how much you can store in the window.name property in web browsers.)
Engage in a "confused deputy attack" and feed it to something else with a vulnerability or misconfiguration which is authorized to do what you aren't. (This is why access to D-Bus must be limited as far as possible without breaking an application.)
Hide it in the metadata of the user's next print job to be retrieved by some external tool. (Or, if you really want to get CIA-level, do what every color laser printer already does with things like the printer's serial number and steganographically encode it in the printed pages... color laser printers either print patterns of yellow dots encoding their serial number and various other parameters intended to help the FBI track you down if you try to print money or use some other kind of forensic tagging.)
Make use of any device the program has access to via the current need to enable device=all for access to webcams and joysticks/gamepads.
Wait until the system is idle in a pattern that suggests the user is AFK or asleep and then do anything more visible that an external helper can watch and cover your tracks on. Perhaps streaming accumulated data into desktop notifications, depending on how the notification host is implemented.
Hide the data inside the parts of structured files that aren't displayed WYSIWYG. (eg. custom chunks in PNG files, custom metadata fields inside JPEGs, custom files inside Zip-based formats like ODT, OOXML, EPUB, etc., custom data blocks inside stream container formats used in things like MPEG-family formats, Ogg, Matroska, Quicktime, ASF (WMA/WMV), etc.)
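To make that last item concrete, here is a minimal Python sketch of the PNG case. It is only an illustration under stated assumptions: the chunk type `prVt`, the helper names and the payload are all made up for the example, and the 1x1 image is generated in-process so nothing outside the standard library is needed. The point is just that a private ancillary chunk is part of a perfectly valid file that image viewers will skip over.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def minimal_png() -> bytes:
    """Build a valid 1x1 8-bit grayscale PNG to experiment on."""
    # width, height, bit depth, color type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: filter byte 0 followed by a single gray pixel.
    idat = zlib.compress(b"\x00\x00")
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk after the signature."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def embed(png: bytes, payload: bytes, chunk_type: bytes = b"prVt") -> bytes:
    """Insert a private ancillary chunk (hypothetical type) before IEND."""
    iend = make_chunk(b"IEND", b"")
    if not png.endswith(iend):
        raise ValueError("expected the file to end with an IEND chunk")
    return png[:-len(iend)] + make_chunk(chunk_type, payload) + iend


def extract(png: bytes, chunk_type: bytes = b"prVt") -> list:
    """Return the payloads of all chunks with the given type."""
    return [data for ctype, data in iter_chunks(png) if ctype == chunk_type]


if __name__ == "__main__":
    tagged = embed(minimal_png(), b"focused=editor")
    print(extract(tagged))
```

The lowercase first letter marks the chunk as ancillary, so a conforming decoder ignores it rather than rejecting the file, which is exactly why an "is this app only saving images?" check tells you very little.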

...or, with everyone feeling so confident about their security, go the xkcd 538-ish route and show the reader a desirable feature that seems to be legitimate evidence that the security model is getting in their way and needs to be circumvented for non-malicious reasons. (Stuff like how I've already seen some applications like keypress visualizers for screencasts going straight to "Either use an X11 session or grant non-root users access to your keyboard's evdev device node" for want of a Wayland-level API.)

For example, I could easily see a personal time tracking tool encouraging the user to circumvent this so the desktop and mobile versions of the app can synchronize records to produce unified reports.

Heck, last time I checked, your proposal would, by its very nature, prevent the core function of the time-tracking software used by some online contractor marketplace services (I know Upwork had one when it was called oDesk), where the dynamic is "If you don't let it watch what you're doing when the timeclock isn't paused and upload the results to the server as a fraud-prevention measure, you don't get paid".

We already see people discussing how to circumvent sandboxing in order to get Flatpak'd/Snap'd browsers and Flatpak'd/Snap'd password managers talking to each other while we're still waiting for a WebExtensions portal.

Forcing a "this or that but not both" permissions situation on users and application developers is a bad idea.

Mikenux commented 1 year ago

... It is about telling the user that the app is sharing this data, so I don't see why it gives a false sense of security. The reference point is the app, not processes external to it. If external processes are watching or taking this data, that isn't relevant to whether the app is sharing it. The app can store the data in metadata or in structured files, but it must know about the other processes that will retrieve this data, no? Other permissions can certainly be taken into account: that would only mean reminding the user whether the app is sandboxed or not, and any phrasing can be improved.

Otherwise, some areas can certainly be improved (data transmission over D-Bus, PulseAudio connection, device access, etc.), but I never said to take it as is: those are only options, and both are open to discussion to work out the details.

And, sharing the data over the internet is not excluded in option 2: That's starting with a sandboxed app, then allow it to share the data over the internet (the "private data sharing portal"). Using a portal already generally means not using permissive permissions (e.g. generally, using the file chooser portal means not using filesystem=home). If this connection is mandatory, something then must be done to tell it appropriately.

ssokolow commented 1 year ago

I'm referring to option 2 for two reasons:

First, it's not feasible to retrofit "Require these apps to have no means of sharing" because there are so many APIs the things have to interact with and bad guys only need to find one of them... and they don't need to convince security auditors... they just need a solution that the average user won't recognize as a path for data exfiltration.

Java wasn't even retrofitting to the degree this is and it still had a couple of decades of applet security whac-a-mole before Java applets were finally retired.

Things like JavaScript runtimes and WebAssembly can pull it off because they design their APIs from scratch to be simple enough to be audited. Equally importantly, they take a "sandbox first, functionality second if compatible" approach... an approach that, when applied to non-web applications, produces WASI, not Flatpak and Portals.

Second, requiring people to do their file access entirely through special portals to get access to the monitoring API is reminding me of what I said recently regarding the idea of an xdg-pip Wayland extension. If you make your solution too onerous and restrictive, nobody will use it.

It's already hard enough to get applications to switch away from legacy permissions to portals and, as I said, GNOME's vision of Wayland is already driving application developers to circumvent the security model entirely to deliver the features users want.

In this case... probably by asking people to enable whatever accessibility APIs wind up being required to provide assistive technologies for legally recognized disabilities and then requesting "I'm a screen reader" permissions to access the relevant information about the currently focused window without having to give up legacy/manifest file permissions... and, if you try to require accessibility apps to be that locked down, you might wind up with some kind of accessibility-bridge package which exists only to proxy the APIs onto a bus outside the sandbox so people can use them for things like Linux AutoHotKey clones.

Mikenux commented 1 year ago

In option 2, I'm talking about a sharing portal, which implies that the application, to use it, has no other way to share the data. If the app has permissive permissions to share data, there is no point in having a portal to share data, because a portal is built to replace the permissive permissions (including those that can be used in deviant ways).

Alternatively, there's option 1, where it's about informing the user about the potential leak of private data from the app and asking the user again to grant permission if the app comes with more permissive permissions with an update.

Having one, then the other, depending on how sandboxing evolves over time, is also an option.

ssokolow commented 1 year ago

And my point is that "the application, to use it, has no other way to share the data" is an untenable position to enforce unless the entire API surface of the sandbox has been designed around it, the way something like WASI has, and attempting to enforce it will just imply to users that it can be done in a reliable manner.

johan-bjareholt commented 11 months ago

There is a new Wayland protocol in the staging section called ext-foreign-toplevel-list[1] which allows clients to get all windows as well as their app_id and title. As of now this has only been implemented in a draft commit for the cosmic DE[2]; hopefully more will follow. It has a lot of similarities with wlr-foreign-toplevel-management[3], but is more limited as it only shows all windows, and to be able to see which window is focused there is yet another protocol called foreign-toplevel-state[4].
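As a rough picture of what a client gets out of this, here is a small Python sketch that replays a stream of events into a window table. It is a simulation only, not a Wayland binding: the `ToplevelList` class, the tuple format and the integer handles are invented for illustration, and the event names (`toplevel`, `identifier`, `app_id`, `title`, `done`, `closed`) follow my reading of the protocol XML in [1], so treat them as an assumption and check the XML before relying on them.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class Toplevel:
    # The only per-window data this sketch models. Note: no "focused" field.
    identifier: Optional[str] = None
    app_id: Optional[str] = None
    title: Optional[str] = None


class ToplevelList:
    """Tracks windows from (event, handle, value) tuples (hypothetical format)."""

    def __init__(self) -> None:
        self.pending: Dict[int, Toplevel] = {}  # state received but not committed
        self.windows: Dict[int, Toplevel] = {}  # state visible to the application

    def handle(self, event: str, handle: int, value: Optional[str] = None) -> None:
        if event == "toplevel":
            # A new window was announced; its properties arrive separately.
            self.pending[handle] = Toplevel()
        elif event in ("identifier", "app_id", "title"):
            setattr(self.pending[handle], event, value)
        elif event == "done":
            # Commit a copy so later property events stay pending until
            # the next "done".
            self.windows[handle] = replace(self.pending[handle])
        elif event == "closed":
            self.pending.pop(handle, None)
            self.windows.pop(handle, None)
        else:
            raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    tl = ToplevelList()
    for ev in [("toplevel", 1), ("app_id", 1, "org.example.Editor"),
               ("title", 1, "notes.txt"), ("done", 1)]:
        tl.handle(*ev)
    print(tl.windows)
```

For something like ActivityWatch the gap is visible in the `Toplevel` type itself: you can enumerate app_id and title pairs, but nothing here says which window has focus.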

What is more convincing about these two protocols compared to wlr-foreign-toplevel-management is that there is an intention to get them into wayland-protocols. There are still two big drawbacks, however, that make it unlikely that we will be able to use these two protocols anytime soon. The first is simply that most compositors probably won't implement them. The second is that these protocols will only be accessible from so-called "privileged" clients. Exactly how to make a client "privileged" will depend on the compositor and that is yet another big discussion.

Regardless, I am happy to see that at least something is happening in the wayland ecosystem in regards to this functionality.

[1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/staging/ext-foreign-toplevel-list/ext-foreign-toplevel-list-v1.xml [2] https://github.com/pop-os/cosmic-comp/pull/76 [3] https://github.com/swaywm/wlr-protocols/blob/master/unstable/wlr-foreign-toplevel-management-unstable-v1.xml [4] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/196

ssokolow commented 11 months ago

Exactly how to make a client "privileged" will depend on the compositor and that is yet another big discussion.

Regardless, I am happy to see that at least something is happening in the wayland ecosystem in regards to this functionality.

I'm especially happy to see that someone's finally moving on the concept of privileged clients. That was promised over a decade ago as how the original Wayland concept would allow things like display control panels to not have to be reinvented as an in-process part of every new compositor.

phoerious commented 11 months ago

There are still two big drawbacks, however, that make it unlikely that we will be able to use these two protocols anytime soon. The first is simply that most compositors probably won't implement them. The second is that these protocols will only be accessible from so-called "privileged" clients. Exactly how to make a client "privileged" will depend on the compositor and that is yet another big discussion.

That's a pretty heavy limitation for such a fundamental feature.

pktiuk commented 3 months ago

@matthiasclasen
The lack of this API really hurts the functionality of Wayland for many use cases.
This is the most requested feature in this repository, which shows that there is real demand behind this request.
It can be done securely and with respect to privacy. There should be an acceptable middle ground.

flatpak / xdg-desktop-portal

Add a portal to see currently open windows #304

offtopic :/