flatpak / xdg-desktop-portal

Desktop integration portal
https://flatpak.github.io/xdg-desktop-portal/
GNU Lesser General Public License v2.1

Notifications Portal v2 proposal #983

Open jonas2515 opened 1 year ago

jonas2515 commented 1 year ago

Quick introduction for those who don't know me: I'm Jonas (aka verdre) from the GNOME community, mostly working on gnome-shell/mutter but also across the stack to get things ready for phones.

The notifications portal is currently of very limited use for apps with more advanced notification needs, especially with smartphone use-cases in mind.

Here's a proposal for a new version of the notifications portal, trying to kickstart development and open room for discussion.

Things we want

Things we don't want

Proposed API

Permission requests

Content Hints

Used for all notifications that need specific handling, e.g.

There are three different namespaces that any hint must adhere to:

Requests and Responses

Requests and responses are the new API for actions.

Requests have an application-defined requestId (s) and requestParams (a{sv}).

The response object is passed to the responseAction GAction as the parameter (format ssa{sv}: notificationId (s), requestId (s), responseParams (a{sv})). The responseParams are usually empty and only set when using a preset request ID.
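As a rough illustration of the shapes described above, here is a minimal Python sketch of a request and of the `ssa{sv}` response tuple. The class and function names (`Request`, `Response`, `respond`) are hypothetical and not part of the proposal; only the field names and D-Bus type signatures come from the text.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Request:
    """An application-defined request attached to a notification."""
    request_id: str                                               # D-Bus "s"
    request_params: dict[str, Any] = field(default_factory=dict)  # D-Bus "a{sv}"


@dataclass(frozen=True)
class Response:
    """Parameter handed to the responseAction GAction (format ssa{sv})."""
    notification_id: str                                           # "s"
    request_id: str                                                # "s"
    response_params: dict[str, Any] = field(default_factory=dict)  # "a{sv}"

    def as_tuple(self) -> tuple[str, str, dict[str, Any]]:
        # Same field order as the ssa{sv} signature.
        return (self.notification_id, self.request_id, self.response_params)


def respond(notification_id: str, request: Request,
            response_params: Optional[dict[str, Any]] = None) -> Response:
    # responseParams are usually empty; they are only set for preset request IDs.
    return Response(notification_id, request.request_id, response_params or {})
```

For an application-defined request such as `snooze`, `respond("alarm-1", Request("snooze")).as_tuple()` yields a tuple with an empty `responseParams` dictionary.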

Preset request IDs

There are special request IDs for common requests in the preset namespace. Proposed IDs (standardize with the protocol?):

Presentation hints

Presentation hints allow configuring specifics of how a notification should be presented. The system may ignore those depending on its policy.

Supported presentation hints are listed in the presentationHints field returned by GetCapabilites().
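A minimal sketch of how an app might compare the hints it wants against the advertised ones. It assumes the capabilities reply can be reduced to a plain list of hint names, which the proposal does not spell out; `effective_hints` is a hypothetical helper, and a system may still ignore a supported hint because of its own policy.

```python
def effective_hints(requested, supported):
    """Split requested presentation hints into (honoured, ignored).

    `supported` stands in for the presentationHints capability list;
    its exact wire format is an assumption here.
    """
    supported = set(supported)
    honoured = [hint for hint in requested if hint in supported]
    ignored = [hint for hint in requested if hint not in supported]
    return honoured, ignored


honoured, ignored = effective_hints(
    ["presentNonDisrupting", "dontShowBanner"],
    ["dontShowBanner"],
)
```

An app could use the `ignored` list to decide whether it needs a fallback, for example not relying on a silent presentation when `presentNonDisrupting` is unsupported.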

Standardized hints:

Custom methods of alerting

Apps may use custom methods of alerting the user, for example to play audio from a special source like an online stream of a radio station.

DBus API

Methods:

Open questions

How would I do x with this?

Create and ring an alarm clock
1) Create "next alarm" notification a few hours before alarm
   - Set content hints to `class.alarm`
   - Add `preset.cancel` request to allow removing alarm again
   - Set `presentNonDisrupting` presentation hint so that the notification silently shows in the tray
2) When alarm rings, remove old notification and create a new one:
   - Set content hint to `class.alarm.ringing`
   - Add `preset.done` and `snooze` requests
   - Set `isPersistent` to true
   - In case a window of the app is already open and focused, set the `dontShowBanner` presentation hint
3) On response:
   - When `preset.done` is invoked, remove notification
   - When `snooze` is invoked, remove notification and go to 1)

Notifying an incoming call
1) Create the notification
   - Set content hint to `class.call.incoming`, this magically makes the system:
     - Style the notification banner in the "incoming call" style
     - Repeatedly play a ringtone and vibration pattern
   - Add `preset.accept` and `preset.decline` requests
   - Set `isPersistent` to true
   - In case a window of the app is already open and focused, set the `dontShowBanner` presentation hint
2) Handle responses and missed call
   - When `preset.accept` gets invoked, accept the call and update the existing notification
     - Change the content hint to `class.call.ongoing`
     - Replace requests with `preset.cancel` and `preset.call.toggleSpeakerphone`
   - When `preset.decline` gets invoked, decline the call and remove notification
   - When the user didn't trigger a response until the call was missed, update the existing notification
     - Set content hint to `class.call.missed`
     - Set `isPersistent` to false
     - Replace requests with `preset.call.returnCall`
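
The incoming-call walkthrough above can be read as a small state machine on the app side. The following Python sketch models only the transitions the text describes; the `CallNotification` class is hypothetical, it uses `preset.decline` (the ID added in step 1) for the declining request, and it leaves what happens on `preset.cancel` or `preset.call.returnCall` to the app because the text does not say.

```python
from dataclasses import dataclass, field


@dataclass
class CallNotification:
    """App-side model of one call notification (illustrative only)."""
    notification_id: str
    content_hint: str = "class.call.incoming"
    requests: list = field(
        default_factory=lambda: ["preset.accept", "preset.decline"])
    is_persistent: bool = True
    removed: bool = False

    def on_response(self, request_id: str) -> None:
        if request_id not in self.requests:
            raise ValueError(f"request {request_id!r} is not offered")
        if request_id == "preset.accept":
            # Accept the call and update the existing notification.
            self.content_hint = "class.call.ongoing"
            self.requests = ["preset.cancel", "preset.call.toggleSpeakerphone"]
        elif request_id == "preset.decline":
            # Decline the call and remove the notification.
            self.removed = True
        # Other request IDs are left to the app; the text does not define them.

    def on_missed(self) -> None:
        # No response arrived before the call was missed.
        self.content_hint = "class.call.missed"
        self.is_persistent = False
        self.requests = ["preset.call.returnCall"]
```

Keeping the notification ID stable across these updates is what lets the system update the existing notification rather than show a new one.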
ZanderBrown commented 10 months ago

It makes the portal depend on GLib and GIO. If I was implementing this portal it would be written in Rust, use Rust libraries for image processing, and not depend on GLib or GIO. A well-written specification will not require me to reverse-engineer a GIO implementation detail.

That format is not part of GLib’s API, so you can’t depend on it implicitly in this spec — you’d have to specify the format explicitly in the spec, or point at another spec for icon interchange.

I thought about writing the bigger explanation comment but instead I think I'm going to go with: This is quite funny

ZanderBrown commented 10 months ago

Okay so that wasn't helpful, apologies — but some observations: If you don't want to know about GIcon, you already can't implement portals, so that's rather moot, as, unless we want to break compatibility, a backend already has to handle at least the encoding of GBytesIcon and GThemedIcon to support DynamicLauncher and …Notification.

Fortunately we do only need to care about GBytesIcon and GThemedIcon, and we nicely validate that it's one of the two before forwarding to the backend — which brings me to: There has been various talk of attack surface and ‘deserialising the GIcon could be bad’, which, yes, a totally fair point, no arguments there on the whole. There are also questions around how big could a GBytesIcon end up and that it doesn't actually imply the bytes are any particular format, which is unfortunate.

https://github.com/flatpak/xdg-desktop-portal/blob/83953d2885a17cda11109e03ce2401324d7b470d/src/notification.c#L266-L291

Whoops.

The cat is out the bag, the train has left the station, the horse bolted, and the proverbial thingamabob has thingamajigged.

What and how we want to react to that, idk, but it's happened.

All of which further makes the ‘not public API’ part a tad moot, even if GIcon wasn't supposed to be public (a point that doesn't seem entirely in harmony with the docs and related discussion?); we have several components depending on its stable interchange over the bus between different versions of GIO.

I would note, though, that the docs do make it clear the encoding is only valid within a given ‘file system namespace’, which would be why GFileIcon isn't supported and would need to be loaded into a GBytesIcon before transmission (something which the GIO portal notification backend could do transparently, but apparently currently does not?), since the sandbox makes paths unworkable.

ilya-fedin commented 10 months ago

That is doubly funny given that GNotification doesn't support GBytesIcon for FDO notifications, so you can't support both portal and FDO if you need to present some non-icon image in a notification (like an avatar).

jsparber commented 10 months ago

I will work on this as part of the GNOME STF grant

DemiMarie commented 5 months ago

Have the concerns regarding chat apps and persistence been addressed? This is a feature present in Windows and Android at the very least.

Mikenux commented 4 months ago

I'm replying to you here.

Context: First two comments in https://github.com/flatpak/xdg-desktop-portal/pull/1298

I'm still for sound being tied to a notification class. This defines a notification class that users can enable and disable. This is especially relevant if you want to set a maximum sound duration (which is a good idea), because the duration of an informative notification is not the same as that of the sound of a custom ringtone or of an alarm.

As a side note, for alarms, maybe applications should make a single request with all relevant alarm parameters to one alarm portal rather than multiple portals (if I'm not wrong). Also see https://github.com/flatpak/xdg-desktop-portal/pull/1098#issuecomment-1957728367 on your other MR.

Regarding custom sounds versus system sounds, it's more about what app developers can expect. Setting a custom sound can be part of the app experience. App developers can therefore expect all DEs to use this custom sound. Furthermore, having a custom sound linked to a notification class (yes, again) here allows the user to define a system sound or a custom sound for this notification class (changing the sound when the app is not running).

jsparber commented 4 months ago

I'm still for sound being tied to a notification class. This defines a notification class that users can enable and disable. This is especially relevant if you want to set a maximum sound duration (which is a good idea), because the duration of an informative notification is not the same as that of the sound of a custom ringtone or of an alarm.

I think it makes sense to link the sound to content hint but that's tangential to what #1298 implements so far.

Please also note that app developers, with #1298, can also specify themable sound names, which are system-provided sounds. (Just noticed that I didn't add that to the docs.)

DemiMarie commented 4 months ago

@jsparber would it be possible to ensure that it is at least possible, in the future, to support invoking actions on notifications even after the app has exited, and possibly even after a system reboot? In other words, would it be possible to provide each notification with an identifier (such as a 256-bit random number) that will never be reused?

jsparber commented 4 months ago

@jsparber would it be possible to ensure that it is at least possible, in the future, to support invoking actions on notifications even after the app has exited, and possibly even after a system reboot? In other words, would it be possible to provide each notification with an identifier (such as a 256-bit random number) that will never be reused?

I think there was some confusion: according to this proposal, notifications will have an id specified by the app, so the app has to provide an id, which will only exist in the context of the app and may be unique.

DemiMarie commented 4 months ago

@jsparber would it be possible to ensure that it is at least possible, in the future, to support invoking actions on notifications even after the app has exited, and possibly even after a system reboot? In other words, would it be possible to provide each notification with an identifier (such as a 256-bit random number) that will never be reused?

I think there was some confusion: according to this proposal, notifications will have an id specified by the app, so the app has to provide an id, which will only exist in the context of the app and may be unique.

Ah, nice! I suggest adding something like this to the spec

The last bullet is to ensure that conforming applications are at least ready for when it becomes possible for an action to start up an app.

Mikenux commented 4 months ago

@jsparber: Sorry, I'm confused about what you replied in your comment on your MR and what you replied here. So, must a content hint be specified? Or does a notification with a sound but no content hint belong to "general"/"information" notifications?

Also, can we already know what content hints will be present?

ilya-fedin commented 4 months ago

The app ID is needed by most portal interfaces. I propose to have some global SetAppId method for unsandboxed apps that would set the dbus connection <--> app id relation for all the portal interfaces. This would avoid adding an appid argument to the methods of each and every interface.
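
A minimal sketch of the bookkeeping such a global method would imply, assuming the portal keys the relation on the caller's unique bus name (such as `:1.39`). The `AppIdRegistry` class and the rule that an app ID cannot be changed once set are assumptions for illustration, not part of the suggestion above.

```python
class AppIdRegistry:
    """Maps a D-Bus connection (unique bus name) to an app ID."""

    def __init__(self):
        self._by_connection = {}

    def set_app_id(self, sender: str, app_id: str) -> None:
        # Assumed policy: a connection may not switch to a different app ID.
        current = self._by_connection.get(sender)
        if current is not None and current != app_id:
            raise ValueError("app ID already set for this connection")
        self._by_connection[sender] = app_id

    def lookup(self, sender: str):
        # Every portal interface would consult this instead of taking
        # an app ID argument on each method.
        return self._by_connection.get(sender)

    def connection_closed(self, sender: str) -> None:
        # Drop the relation when the connection goes away.
        self._by_connection.pop(sender, None)
```

With a shared registry like this, each interface resolves the app ID from the message sender, which is what removes the need for a per-method argument.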

DemiMarie commented 4 months ago
  • important: Marks the content as important. This might make the system present it differently, for example ignoring the Do Not Disturb setting. May only be used if explicitly requested by user, for example when marking a contact as important. Anything else that the app considers important must use an appropriate class hint instead.

If this is going to be in the spec, then I think certain class hints need to imply important, at least if the app has the permission to use important at all. The reason is that in some cases (such as a weather app reporting a tornado warning), it is necessary to override Do Not Disturb for rather obvious safety reasons!
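
One way to read this suggestion is as a simple predicate on the notification's hints. In the sketch below, the set of classes that imply `important` and the hint name `class.weather.warning` are hypothetical; the proposal does not define them. The permission check mirrors the "if the app has the permission to use important at all" condition.

```python
# Hypothetical: class hints that would imply `important`.
IMPLIES_IMPORTANT = frozenset({"class.weather.warning"})


def bypasses_do_not_disturb(content_hints, may_use_important: bool) -> bool:
    """Return True if the notification should override Do Not Disturb."""
    if not may_use_important:
        # Without the permission, neither the explicit hint nor an
        # implying class has any effect.
        return False
    hints = set(content_hints)
    return "important" in hints or bool(hints & IMPLIES_IMPORTANT)
```

Whether the system actually honours the result would still be subject to its own policy.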

ZanderBrown commented 4 months ago

@DemiMarie I think you've perhaps not quite followed?

The notification ID is a string provided by the app, and is scoped within an app-id, and what an app does with that is entirely up to the app.

For something like ‘System Updates Are Available’ you have absolutely no need of uniqueness: The response to the notification being activated will always be to open the system update page, and if for some reason the notification was already present your new one should replace it, not duplicate it, so the id can simply be system-update.

What really matters is that the id somehow communicates to the app what it should be doing in response to this interaction, and that'll vary app to app — for example a matrix client might use something like mention-[event-id] when the user is mentioned, and then app can simply split the event-id off and jump to that message, a scheme which would neatly give you free deduplication even.

A ‘cryptographically secure random number’ would be rather contrary to this goal, in most cases?
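
The `mention-[event-id]` scheme described above can be sketched in a few lines of Python. The helper names and the `Tray` class are hypothetical; the only behaviour taken from the thread is that the ID is an app-chosen string, that the app can split the event ID back off, and that showing a notification with an existing ID replaces it.

```python
MENTION_PREFIX = "mention-"


def mention_notification_id(event_id: str) -> str:
    """Build the notification ID for a mention of the given event."""
    return f"{MENTION_PREFIX}{event_id}"


def event_id_from_notification_id(notification_id: str):
    """Recover the event ID, or None if this is not a mention notification."""
    if not notification_id.startswith(MENTION_PREFIX):
        return None
    return notification_id[len(MENTION_PREFIX):]


class Tray:
    """Stand-in for the system's per-app notification store."""

    def __init__(self):
        self.notifications = {}

    def show(self, notification_id: str, body: str) -> None:
        # Same ID replaces the existing entry, which gives deduplication.
        self.notifications[notification_id] = body
```

Stripping the prefix rather than splitting on `-` keeps this working when the event ID itself contains hyphens.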

ZanderBrown commented 4 months ago

would it be possible to ensure that it is at least possible, in the future, to support invoking actions on notifications even after the app has exited, and possibly even after a system reboot?

The spec already supports that, that's part of the core design even

DemiMarie commented 4 months ago

@ZanderBrown Ah, I see!

In the Matrix case, one would want to have something like matrix-[account_id]-[event_id] for multi-account support.

AdrianVovk commented 4 months ago

Hello :wave:!

Tiny bit of motivation for this comment: I'm working on integrating systemd-homed with GNOME as part of the GNOME STF grant. Recently, I've started planning out what GNOME will be doing to provide a lock-screen for users locked via homed: a normal lockscreen cannot work because the user session is frozen, so we need to run a lockscreen outside of the session to facilitate homed's security. This, however, has the potential of breaking apps that may want to appear over top of the lockscreen, which I wanted to avoid. So, I researched the use-cases for apps-on-lockscreen in other OSs, and how the functionality is implemented there. This ended up having a lot of overlap with notifications: incoming calls, ringing alarms, etc are the common use-cases for apps appearing on the lockscreen. My full research can be found here, though I will cover the relevant conclusions in this comment.

I've discussed many of these things with @jonas2515, and we seem to be on a similar page about them all. I want to write my ideas down here, to make sure the results of our discussion aren't lost and to collect feedback about it from other projects.

Calls

On Android, incoming calls are just apps that put themselves on the lockscreen. On iOS, there's a "CallKit" API that lets apps tell the OS about incoming and ongoing calls. The OS, in turn, can provide a rich incoming call UI, integration with the shell, cross-app call waiting, etc. You can find a lot more detail in my research document. Turns out, this Notifications v2 API almost fully covers the needs of CallKit, so here's a wishlist of things still lacking that need to be in the API spec, or recommended to backends, etc.:

Alarms / Timers / Calendar Events / Reminders

On Android, alarms work similarly to incoming calls. iOS doesn't seem to expose an alarms API. Here's the wishlist of what we'd want for alarms in this API:

Text-to-speech

I haven't discussed this w/ @jonas2515 but I think it's a decent idea. Basically, some special kinds of notifications (incoming calls, ringing alarms, calendar events, emergency alerts) could benefit from having a text-to-speech readout of what they're about. I know lots of people who keep their phone across the room, and having the phone yell out "Incoming call from James Smith" every 10 seconds during the ringing is helpful to decide if it's worth getting up to pick up the phone. Here's some examples from my experiences that may be worth copying somehow:

Apps should be required to opt into this. That way, apps can turn on this functionality on a per-alarm or per-event basis. The notifications API will need to at least be aware of this functionality because it's ultimately in charge of the sounds coming out of the device in response to a notification.

Note that this is separate from accessibility. We notably shouldn't use this TTS functionality for people using a11y tools because there's often a better pattern to handle that provided by the a11y backend.

jsparber commented 4 months ago

@AdrianVovk Thanks for your input

Calls

I think displaying full screen notifications fits pretty well with the direction the content type property goes in. I think raising an app on top of the lockscreen can also work but it's gonna take a while to get to that point :)

Text-to-speech

Interesting idea, apps should be able to implement that via setting a custom sound file. Obviously needs a lot of work on the app side. But in the future it may be interesting to explore :+1:

Mikenux commented 4 months ago

Question: For themes, why are multiple icon names defined if only one is used?

logiclrd commented 1 month ago

Coming to this discussion a bit late, can we consider adding support for newlines in notification body text? In the current implementation, newlines are flattened to spaces. It makes sense for a collapsed one-line view of a notification, but there are cases where some simple formatting can go a long way to improving readability. This doesn't lead directly to large notifications that use a lot of space -- I've seen mention previously of limiting expanded notification bodies to 6 lines of text and that doesn't seem unreasonable.

Where I'm coming from:

I want to display a notification along the lines of:

Upload of the following file failed:

/path/to/file

This may result in consistency issues.

With the current implementation, it's a single-line river of text:

Upload of the following file failed: /path/to/file This may result in consistency issues.
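
To make the request concrete, here is a minimal Python sketch of the two renderings: the collapsed view flattens all whitespace to single spaces, while the expanded view keeps newlines and caps the body at 6 lines, the limit mentioned above. The function `render_body` is hypothetical and is not the shell's actual code.

```python
def render_body(body: str, expanded: bool, max_lines: int = 6) -> str:
    """Render a notification body for the collapsed or expanded view."""
    if not expanded:
        # Collapsed: one line, runs of whitespace (including newlines)
        # become a single space.
        return " ".join(body.split())
    # Expanded: keep the line structure, but cap the number of lines.
    return "\n".join(body.splitlines()[:max_lines])


body = (
    "Upload of the following file failed:\n"
    "\n"
    "/path/to/file\n"
    "\n"
    "This may result in consistency issues."
)
collapsed = render_body(body, expanded=False)
expanded = render_body(body, expanded=True)
```

Here `collapsed` is the single-line form shown above, and `expanded` keeps all five lines of the original because it stays under the cap.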

logiclrd commented 1 month ago

I have just this morning learned about the existence of the "Whiteboards" project. Would it be appropriate to open an issue there describing this request?

ilya-fedin commented 1 month ago

Personally I haven't explored the code, but if it really strips newlines then it's going to be very annoying. I believe this should be decided at the shell level, while the portal should relay the text just as is.

logiclrd commented 1 month ago

@ilya-fedin I believe this is the offending section:

https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/messageList.js#L546-551

This control is used by the notification code, which creates a subclass called GtkNotificationDaemonNotification:

https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/notificationDaemon.js#L449-456

You can see this in action on currently-released Ubuntu versions. In one terminal, run dbus-monitor, and in another terminal, run notify-send Test 'a\nb'. This sends the literal characters \ and n to notify-send, but there's code in there that translates this to an actual newline, which you can see in the dbus-monitor output:

method call time=1717938452.236905 sender=:1.39 -> destination=:1.32 serial=70 path=/org/freedesktop/Notifications; interface=org.freedesktop.Notifications; member=Notify
   string "notify-send"
   uint32 0
   string ""
   string "Test"
   string "a
b"
   array [
   ]
   ...

But the notification which appears shows "a b", all on one line.
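
The two steps described above can be modelled as a short pipeline. This Python sketch is only a model of the observed behaviour, not the code of notify-send or gnome-shell: `unescape_newlines` handles just the `\n` escape, and `flatten` assumes the display step collapses whitespace to single spaces.

```python
def unescape_newlines(arg: str) -> str:
    """Model of the sender side: turn the two characters \\ and n into a newline."""
    return arg.replace("\\n", "\n")


def flatten(body: str) -> str:
    """Model of the display side: collapse newlines and other whitespace."""
    return " ".join(body.split())


typed = "a\\nb"                   # what the shell passes for 'a\nb'
sent = unescape_newlines(typed)   # what dbus-monitor shows on the bus
shown = flatten(sent)             # what the notification displays
```

The newline survives on the bus and is only lost at display time, which is why the behaviour sits in the shell rather than in the transport.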

ilya-fedin commented 1 month ago

Ah, so it's not related to this proposal?

Mikenux commented 1 month ago

Whiteboards

I don't see one here. GNOME has one, but this is not a GNOME-only space.

Your issue can be discussed further at https://github.com/flatpak/xdg-desktop-portal/discussions if necessary.

logiclrd commented 1 month ago

Ah, so it's not related to this proposal?

So, I'm new to this and kind of stumbling in the dark. I initially raised my concern in a GNOME desktop specific space, and the answer was, "Well, the current implementation matches the specification."

So my next question was, "Where is this specification?" And the first answer I got to that was this issue here. I'm not in a position to judge how much that makes sense. Just, I was sent here to make my proposal about notifications. :-P

ilya-fedin commented 1 month ago

I guess that was "we're allowed to do this so we're doing this" in formal language