notifications api - Githubissues

bblfish commented 8 years ago

We need to find a way for a server to be able to declare interest in a resource changing in some way. This would require each resource to point to a resource where a client can subscribe to changes in the resource, perhaps using SPARQL, with a pointer to a container to send notifications to when those changes have been triggered. Probably requires some form of authentication of the user, to verify that the notification box is the right one.

ghanemabdo commented 8 years ago

Proposal for a notification service that extends the current pubsub mechanism implemented in Gold. It should cover the following areas: 1) How clients show or revoke interest in tracking changes to a particular resource (Subscribe/Unsubscribe). 2) How a server should track changes on a watched resource. 3) How a server should respond to an action on a resource that a client is subscribed to (Publishing events).

1) Subscription/Unsubscription:

Client applications register to listen for particular events that occur on a resource. The event can be:

POST/PUT a new resource (append),
Patching a resource (update) or
Deletion of a resource (delete).

Subscription is done via websocket with the pod as follows: "sub https://example.com/container/resource [append/update/delete] [persistent]". If no event is mentioned, the client is registered for all events happening on the resource. "persistent" here means the server should keep tracking the resource even after the websocket opened with the client is closed.

Unsubscription is done by sending the following websocket message to the pod: "unsub https://example.com/container/resource [append/update/delete]". If no event is mentioned, the client is unregistered from all events happening on the resource.
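A minimal sketch of how a server might parse these two messages. The names `Command` and `parse_command` are hypothetical, and accepting more than one event keyword per message is an assumption; the proposal only shows one optional event.

```python
from dataclasses import dataclass

EVENTS = ("append", "update", "delete")


@dataclass(frozen=True)
class Command:
    verb: str          # "sub" or "unsub"
    uri: str           # resource the client wants to watch
    events: frozenset  # events the client is interested in
    persistent: bool   # keep tracking after the websocket closes


def parse_command(message: str) -> Command:
    """Parse 'sub <uri> [event] [persistent]' or 'unsub <uri> [event]'."""
    parts = message.split()
    if len(parts) < 2 or parts[0] not in ("sub", "unsub"):
        raise ValueError(f"not a sub/unsub message: {message!r}")
    verb, uri, *options = parts
    events = set()
    persistent = False
    for option in options:
        if option in EVENTS:
            events.add(option)
        elif option == "persistent" and verb == "sub":
            persistent = True
        else:
            raise ValueError(f"unknown option: {option!r}")
    # No event mentioned means all events, as described above.
    return Command(verb, uri, frozenset(events or EVENTS), persistent)
```

For example, `parse_command("sub https://example.com/container/resource update persistent")` yields a command for the `update` event only, with `persistent` set.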

If an application is subscribed to a resource with the "persistent" option and the websocket is closed without an unsub request being submitted, then the next time the user subscribes to the same resource, a list of all events that happened on this resource since the websocket connection was last closed is posted to the client with the "ack" message (maybe this could be split out into a special message that pulls changes on this resource, instead of attaching the changes to the ack). To stop receiving and tracking changes on a resource, the client must submit an unsub websocket message for this container to the pod (still thinking about a way to stop persistent subscriptions so that buggy apps don't overwhelm the server).

2) Server tracking events:

This depends on the status of the client. If the client is:

Online: which means the websocket is open. Whenever the server receives a request that triggers a registered event on a resource, the server automatically sends a "pub" message to all registered clients.
Offline: which means the websocket is closed. This depends on the server implementation. Currently, I am thinking of a timestamp system that records the time the websocket was closed and the time of every event on the resource. Something like a log resource of all events that happened after the websocket was closed. This way the server has records of all events happening on a set of resources.
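The offline case could be sketched roughly as below. This is only one possible in-memory reading of the "timestamp system" idea: `EventLog` and its method names are hypothetical, and a real server would need persistent storage and some policy for trimming the log.

```python
import time
from collections import defaultdict


class EventLog:
    """Remembers events per watched URI and when each client's socket closed."""

    def __init__(self, clock=time.time):
        self._clock = clock                # injectable so the log can be tested
        self._events = defaultdict(list)   # watched URI -> [(time, resource, action)]
        self._closed_at = {}               # (client, watched URI) -> close time

    def record(self, watched, resource, action):
        """Log an append/update/delete event on a watched URI."""
        self._events[watched].append((self._clock(), resource, action))

    def socket_closed(self, client, watched):
        """Note the time a persistent subscriber went offline."""
        self._closed_at[(client, watched)] = self._clock()

    def missed_events(self, client, watched):
        """Events logged since the client's socket closed (empty if it never closed)."""
        closed = self._closed_at.pop((client, watched), None)
        if closed is None:
            return []
        return [(r, a) for t, r, a in self._events[watched] if t >= closed]
```

When the client subscribes again, the result of `missed_events` is what the server would attach to the "ack" message.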

3) Event notification:

When an append/update/delete event occurs, a websocket message is sent from the pod to the client including the following parameters:

The keyword "pub",
The URI of the resource that triggered the action,
The action type (append/update/delete).

The notification should look like this: "pub https://example.com/container/subcontainer/resource1.ttl [append/update/delete]"
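On the client side, handling such a message might look like the sketch below. `parse_pub` and `dispatch` are hypothetical names, and the handler mapping is just one way an app could react per action type.

```python
ACTIONS = ("append", "update", "delete")


def parse_pub(message):
    """Split 'pub <uri> <action>' into (uri, action), rejecting anything else."""
    parts = message.split()
    if len(parts) != 3 or parts[0] != "pub" or parts[2] not in ACTIONS:
        raise ValueError(f"not a pub message: {message!r}")
    return parts[1], parts[2]


def dispatch(message, handlers):
    """Call the handler registered for the message's action, if there is one."""
    uri, action = parse_pub(message)
    handler = handlers.get(action)
    if handler is not None:
        handler(uri)
```

An app that only cares about deletions would pass `{"delete": on_delete}` and ignore the other two actions.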

for "persistent" subscriptions, the "ack" message may look like:

"ack https://example.com/container/subcontainer\n
https://example.com/container/subcontainer/resource1.ttl append\n
https://example.com/container/subcontainer/resource1.ttl update\n
https://example.com/container/subcontainer/resource1.ttl delete
...
"

- Advantages of this extension:

1) Applications relying on rapid changes in a container can get resource-level updates about these changes instead of container-level updates, which make tracking the changed resources painful. 2) Filtering the events a client app is interested in (append/update/delete). 3) Tracking offline updates via persistent subscriptions.

- Expected problems:

1) Multiple clients registered to the same resource. 2) Buggy apps that make persistent subscriptions to resources and don't unsubscribe. What's the scenario for terminating a persistent subscription? 3) Subscriptions to resources with frequent changes. Traffic will be enormous. Is there a way to avoid overwhelming client apps with a lot of messages, for example by batching events periodically?

sandhawke commented 8 years ago

@ghanemabdo I think there might be a slightly different approach that avoids some of the expected problems. The main thing is to recognize that every resource has both a last-modified timestamp and an etag, which is essentially a hash. These are both basic parts of HTTP and help us keep things in sync.

Given these, I suggest:

pub/sub ignores containers and deals with URL patterns instead. The client subscribes to all resource changes where the affected resource matches a particular template. Notifications include the affected URL.
'pub' lines include the etag of the resource that changed. This allows a client which remembers etags to know if it already has this version and can avoid fetching it again. Use 'deleted' as if it were an etag for a deleted resource.
'sub' lines MAY include a 'since' timestamp, giving a time they want the subscription to start, which may be in the past. The server then replies with 'pub' lines for every resource whose last-modified time is greater than or equal to that timestamp. It's free to either keep a changelog or do a scan at sub-time.
the server sends a "my timestamp is" line whenever the timestamp changes (that is, a new second has rolled around) if one or more pub lines have been sent since the last timestamp. If a client remembers the most recent of these values, when it re-connects it can do a sub-since that time and know it won't miss any changes. (sub-since 0 would be a nice way to find all URL-matching resources on a server.)
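The "scan at sub-time" variant of the points above can be sketched as follows. Several things here are assumptions rather than part of the suggestion: the template syntax is taken to be shell-style globbing (via Python's `fnmatch`), the pub line layout is taken to be `pub <url> <etag>`, and `replay_since` and `needs_fetch` are hypothetical names.

```python
from fnmatch import fnmatchcase


def replay_since(resources, pattern, since):
    """Build pub lines for matching resources modified at or after `since`.

    `resources` maps URL -> (last_modified, etag); a deleted resource
    carries the string "deleted" in place of an etag.
    """
    lines = []
    for url, (last_modified, etag) in sorted(resources.items()):
        if fnmatchcase(url, pattern) and last_modified >= since:
            lines.append(f"pub {url} {etag}")
    return lines


def needs_fetch(cache, url, etag):
    """A client that remembers etags can skip versions it already has."""
    return cache.get(url) != etag
```

With `since` set to 0 and a pattern such as `https://example.com/c/*`, this lists every matching resource the server knows about, which is the "sub-since 0" case.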

Also, servers SHOULD be careful to combine pubs. If a resource is changing more rapidly than changes can be sent, only the most recent change should be sent, when the write pipe is again available for sending. That is, there shouldn't be multiple pubs for the same resource in the output queue at once.
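One way to get this behaviour is to key the output queue by URL, so a newer change overwrites the queued one. A rough sketch, with `PubQueue` as a hypothetical name and the `pub <url> <etag>` line layout assumed:

```python
class PubQueue:
    """Output queue that holds at most one pending pub per resource."""

    def __init__(self):
        self._pending = {}  # url -> latest etag; dicts keep insertion order

    def publish(self, url, etag):
        # Overwriting keeps the URL's place in the queue but drops the
        # stale etag, so only the most recent change is ever sent.
        self._pending[url] = etag

    def drain(self):
        """Called when the write pipe is available again."""
        lines = [f"pub {url} {etag}" for url, etag in self._pending.items()]
        self._pending.clear()
        return lines
```

Keeping the original queue position (rather than moving the URL to the back) means a resource that changes constantly still gets sent in turn.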

At some point there should be some refinements for dealing with rapidly-changing resources. Like, after a pub, you block new pubs for that resource for 200ms. So, even if it's changing every 1us, you never send more than 5 notifies per second. And that 200ms number should be set by the client (above a min set by the server). If the client is showing the value to a human, the UI designer probably has an idea of what max-change-rate would make sense, etc.
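A sketch of that per-resource blocking window, using integer milliseconds. `Throttle`, its parameter names, and the default server minimum of 200ms are assumptions for illustration; the caller is expected to call `flush` periodically to release held-back changes.

```python
class Throttle:
    """Allow at most one pub per resource per interval, keeping the latest change."""

    def __init__(self, server_min_ms=200, client_interval_ms=0):
        # The client picks the interval, but never below the server's minimum.
        self.interval_ms = max(server_min_ms, client_interval_ms)
        self._last_sent = {}  # url -> time of the last pub sent
        self._held = {}       # url -> most recent etag held back

    def offer(self, url, etag, now_ms):
        """Return a pub line to send now, or None if the change is held back."""
        last = self._last_sent.get(url)
        if last is not None and now_ms - last < self.interval_ms:
            self._held[url] = etag
            return None
        self._last_sent[url] = now_ms
        self._held.pop(url, None)
        return f"pub {url} {etag}"

    def flush(self, now_ms):
        """Send held-back changes whose blocking window has expired."""
        ready = [u for u in self._held
                 if now_ms - self._last_sent[u] >= self.interval_ms]
        lines = []
        for url in ready:
            lines.append(f"pub {url} {self._held.pop(url)}")
            self._last_sent[url] = now_ms
        return lines
```

With a 200ms interval this caps each resource at 5 notifications per second, and the client always ends up with the latest etag once the window expires.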

I think this provides everything you wanted, with simple implementations. It's also completely separate from solid/ldp, so might be used in lots of other places.

This being a different protocol, we'll need some way to tell whether we're speaking this protocol or the one currently in solid. I'd be inclined to suggest just using a different wss endpoint, found via a different link-relation.

The other big thing one might want in pub/sub is 'fat pings' -- that is, the notify gives you the new data or a patch from old-to-new. I suggest we keep that out of this protocol. I'd rather handle that kind of thing in the query protocol, where we have a known data model.

bblfish commented 7 years ago

Just discovered the provenance pingback mechanism https://www.w3.org/TR/2013/NOTE-prov-aq-20130430/#provenance-pingback

CxRes commented 4 years ago

Has there been any consideration for the proposals above since 2016?

kjetilk commented 4 years ago

@csarven has been working a lot on notifications (including leading Linked Data Notifications), so he will attend to it, but now he's also the primary author of the Solid spec, so he's a bit tied up. :-) There is more recent discussion in https://github.com/solid/specification/issues/49

CxRes commented 4 years ago

@csarven @kjetilk I have tried to wade into the spec for Linked Data Notifications. From what I understood (and I admit it's very little), this seems to be more about the mechanics of message transmission for Linked Data Containers.

Immediately, I am more interested in the content of messages that are transmitted by solid servers. In particular, I wonder if it is possible to introduce some non-breaking changes to NSS, such as new messages that older clients would just ignore, that can improve the user experience with little to no additional burden on the server.

Let me explain where I am coming from... I am trying to write a recursive watcher for a solid container, i.e. one that watches solid containers to an arbitrary user-specified depth. The current pub/sub is extremely limited, making it an utter pain to get to work except in ideal circumstances, e.g. essentially forcing me to walk the container tree every time the websocket disconnects (which with my poor internet can happen every few minutes), or to keep track of which events to ignore just because there is no way to unsubscribe from a container. Some very limited and simple common-sense changes, almost all non-breaking, could resolve this and make for a much more pleasant client experience.

Is it possible to discuss this in the near-term?

solid / notifications

notifications api #22
