AutoAuth, private feeds and WebSub

sknebel commented 5 years ago

Not part of the main specification, but important for private feeds and worth documenting.

I see three models:

1. WebSub informs all users about all changes to the feed

When the feed changes, the site triggers a notification to all subscribers. If the feed change is in a private post, it does not include it in the ping, either sending the last public state of the feed or an empty ping (effectively announcing an empty diff). Authorized subscribers would take this as a signal to fetch the feed with authorization attached.

This leaks the fact that something private happened to all subscribers. The hub is not involved in handling private information at all, and thus can safely be external.

Do all/most hubs allow this? A hub that wants to create the diff itself might reject sending out an empty notification, but at least for non-Atom content I don't think hubs do this.
How do subscribers handle empty pings? Does it cause them to fetch the page, assuming a "thin ping"? This could be mitigated by separating out authorized subscribers as described below. (WebSub does not know "thin pings", but I think pubsubhubbub did and clients might support them)
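
A rough sketch of the publisher side of model 1, assuming the hub accepts the widely used `hub.mode=publish` / `hub.url` form (WebSub itself leaves the publisher-to-hub step up to the two parties). The function name and the example URLs are made up for illustration; the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url: str, topic_url: str) -> Request:
    """Build (but do not send) a content-free "something changed" ping.

    The body carries only the topic URL, so nothing about the private
    post itself is handed to the hub.
    """
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical hub and topic; urllib.request.urlopen(ping) would send it.
ping = build_publish_ping("https://hub.example/", "https://site.example/feed")
```

Whether a given hub then forwards anything to subscribers when the public content is unchanged is exactly the open question above.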

2. individual topics for private subscribers with fat pings

The site could give different topic URLs (capability URLs) to private subscribers, and send matching notifications to them. (Compare how the WebSub spec recommends returning different rel=self URLs for different content types.) Potentially, fat pings could then be used.

A site could use an integrated hub for private subscribers and still let a public hub handle everyone else.

The hub here can't fetch the private topic URLs (unless it has special support/is integrated).
If the capability URL leaks, others can subscribe to it and would receive notifications. This would compromise fat pings. Subscribing applications and hubs would need to take care not to leak it, but hubs developed assuming public feeds might not do this. Integrated hubs could allow only one subscription per topic URL, which could mitigate this as long as a different capability URL is submitted each time.
Integration with token expiry/revocation is needed: the link between token and capability URL must be maintained, and topics associated with invalid tokens must no longer be updated.
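
The bookkeeping between tokens and capability URLs could look roughly like this. It is a minimal in-memory sketch, not part of any spec: the class name, the method names and the URL layout are all assumptions, and a real site would need persistent storage and a hook into its token expiry.

```python
import secrets


class CapabilityTopics:
    """Hypothetical map between access tokens and per-subscriber topic URLs."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")
        self._topic_by_token = {}

    def topic_for(self, token: str) -> str:
        """Return the capability URL for a token, minting one on first use."""
        if token not in self._topic_by_token:
            # Unguessable path segment: knowing the URL is the capability.
            slug = secrets.token_urlsafe(32)
            self._topic_by_token[token] = f"{self.base_url}/{slug}"
        return self._topic_by_token[token]

    def revoke(self, token: str) -> None:
        """Forget the topic of an expired or revoked token."""
        self._topic_by_token.pop(token, None)

    def topics_to_notify(self):
        """Topics that should still receive pings (valid tokens only)."""
        return sorted(self._topic_by_token.values())
```

Once `revoke` has been called for a token, its topic no longer appears in `topics_to_notify()`, so that topic simply stops being updated.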

3. individual topics for private subscribers with thin pings

Compared to 1, it at least keeps activity private and solves the issue mentioned above of subscribers potentially fetching needlessly. Compared to 2, it removes complexity and trust in the hub, and leaking the capability URL is less problematic, but it requires feed fetches on notification.
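
On the subscriber side, handling a thin ping could be as small as the sketch below. It assumes the reader keeps a token per topic and presents it as an HTTP Bearer token; the function name and the `tokens_by_topic` mapping are hypothetical, and the request is only built here, not sent.

```python
from urllib.request import Request


def build_authorized_fetch(topic_url: str, tokens_by_topic: dict) -> Request:
    """Build the feed fetch triggered by a content-free notification.

    The ping body is ignored; it only signals that the topic should be
    re-fetched. If no token is stored, the fetch is unauthenticated and
    would return the public view of the feed.
    """
    token = tokens_by_topic.get(topic_url)
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return Request(topic_url, headers=headers, method="GET")


# Hypothetical topic and token; urllib.request.urlopen(fetch) would run it.
fetch = build_authorized_fetch(
    "https://site.example/feed", {"https://site.example/feed": "abc"}
)
```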

conclusion

I think 2. is too much complexity. I think it makes sense to document 3., and potentially 1. as an easier option. Testing how it works with existing clients and hubs is needed.

Thoughts/comments?

(Originally published at: https://www.svenknebel.de/posts/2018/12/6/)

fluffy-critter commented 5 years ago

The problem with style 1 is that some hubs (such as superfeedr, which is the one suggested by indieweb) only forward new content along to subscribers - it won’t even include updates to specific items. That is why in my proposal I included a placeholder public entry that has a URN that changes every time there’s new content that can be fetched (and which realistically will probably be the URN+link of the most recent private entry, since that at least supports legacy readers albeit by leaking information).
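
A minimal sketch of the placeholder idea, assuming entries are plain dicts sorted newest first; the key names and the placeholder title are invented for illustration and are not how any particular CMS does it.

```python
def public_view(entries):
    """Return the feed as an unauthenticated client would see it.

    `entries` is a newest-first list of dicts with the hypothetical keys
    'id', 'link', 'title' and 'private'. Private entries are dropped,
    except that the most recent one is replaced by a placeholder reusing
    its id and link, so the public feed gains a new item whenever new
    private content appears.
    """
    visible = []
    placeholder_done = False
    for entry in entries:
        if not entry.get("private", False):
            visible.append(entry)
        elif not placeholder_done:
            visible.append({
                "id": entry["id"],
                "link": entry["link"],
                "title": "Private entry (log in to read)",
                "private": False,
            })
            placeholder_done = True
    return visible
```

The placeholder reveals that a private entry exists and where it lives, which is the information leak mentioned above.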

I feel like 3 also increases the complexity of managing pings on the publisher’s side, and the issue of topic discovery becomes a problem as well. It also adds complexity and overhead to the case of a single feed reader shared by multiple subscribers, since now you need one topic and ping per subscriber. And having hundreds/thousands of topics makes superfeedr grumpy and expensive. :)

fluffy-critter commented 5 years ago

Basically I never want to have to put a hub into my CMS directly, as that requires that my CMS maintain persistent state about subscribers and where they subscribe from. The entire design of publ is that the database is fragile and that no external interaction affects its persistent state. It should be able to be rebuilt around a git deployment to a fresh server with no database persistence.

I basically want it possible for the WebSub support to be completely decoupled from the AutoAuth support from the CMS’s perspective, and also cater to the reality of how the existing public WebSub hubs work.

sknebel commented 5 years ago

If a placeholder entry is what you need to get your hub to send notifications, that'd of course be a valid way of implementing 1. If the feed is separated (either through a modified rel=self for authenticated requests or the proposed rel from #6), that placeholder entry also wouldn't cause issues with "normal" clients? Otherwise I'd expect it to show up on every change as an unread item in a traditional feed reader, which would be bad?

From a quick survey of some options, Switchboard and phubb just check whether the file has changed in any way, so an explicit item is not needed. Superfeedr tries to be intelligent about feed items and thus would likely need a placeholder to be added. (Superfeedr also has an option to submit your own payload, but that's only for Pro accounts and thus not really interesting for most of us: $200 per month.) The pubsubhubbub.appspot.com one is unknown; it's described as specific to Atom/RSS, but that might be left over from earlier. Am I missing important implementations/public hubs?
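
"Changed in any way" could be detected with nothing more than a digest comparison, as in the sketch below. This is only an illustration of the idea, not code taken from Switchboard or phubb; the function name is made up.

```python
import hashlib


def content_changed(previous_digest, body: bytes):
    """Compare a fetched topic body against the last seen digest.

    Returns (changed, new_digest). Any byte-level difference counts as a
    change, so no feed parsing and no placeholder item is required.
    """
    digest = hashlib.sha256(body).hexdigest()
    return digest != previous_digest, digest
```

A hub working this way would notify on any edit to the public feed, whereas an item-aware hub only reacts to entries it has not seen before.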

If 1 doesn't tie specifically to contents of the placeholder items, the subscriber parts of 1 and 3 also appear to be identical, so a publisher could choose which one is preferred.

fluffy-critter commented 5 years ago

I actually would want a placeholder entry to show up in legacy/non-auth-aware clients - that way people who are following my blog that way know that there's something they might be able to read if they log in.

The implementation I'm intending is a compromise between privacy and access for folks who don't follow via the latest and greatest feed reader implementations. It's not perfect but it's what I had on my previous site (using a standard cookie jar for people to store their auth cookie, although I have no evidence that anyone ever actually did that) and it worked Pretty Okay. Obviously other people will want to have a different balance and that's okay - a flexible standard can support everyone. :) (I'm actually intending to have placeholders for all private entries, not just the most recent, but having just the most recent one is enough to get Superfeedr to work right.)

Down the road when/if more feed readers support authenticated feeds I would of course revisit this, but I'm still looking towards a graceful transition.

I also want to ensure that it's done in a way that makes client support easy to add. It seems easier to me for a client to get a placeholder update in the WebSub notification than to have to handle a thin push and re-fetch the whole feed as a result. Having a placeholder entry with "please auth here" metadata also works for the case of an auth-aware reader knowing that they don't have auth and thus not bothering to fetch the authenticated item.

Also it sounds like I should look at Switchboard and phubb and make sure I support their various variants of WebSub correctly. :)
