pubsubhubbub / PubSubHubbub

The PubSubHubbub protocol specification.
http://pubsubhubbub.github.io/PubSubHubbub

Specify how publishers notify hubs #33

Open cweiske opened 9 years ago

cweiske commented 9 years ago

Version 0.4 of the spec only defines how subscribers and hubs interact with each other, but not how publishers notify the hub about updates.

It is crucial that this is specified, too - otherwise it is not possible for publishers to switch hubs without modifications to the code.


Superfeedr uses a POST to the hub URL with hub.mode=publish and hub.url=$url_that_was_updated - see http://documentation.superfeedr.com/publishers.html

Google's hub does the same; https://pubsubhubbub.appspot.com/


Hubs can still implement other ways of notifications, but they all should support a standardized way to be notified.
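
As a rough illustration of the ping described above, here is a minimal Python sketch, assuming only what this comment states: a form-encoded POST to the hub URL carrying hub.mode=publish and hub.url. The function names, the timeout, and the example URLs are hypothetical, and real hubs may expect more (authentication, for instance).

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_publish_ping(hub_url: str, updated_url: str) -> Request:
    """Build the POST described above: hub.mode=publish, hub.url=<updated URL>."""
    body = urlencode({"hub.mode": "publish", "hub.url": updated_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def notify_hub(hub_url: str, updated_url: str, timeout: float = 10.0) -> int:
    """Send the ping and return the HTTP status code the hub answered with."""
    with urlopen(build_publish_ping(hub_url, updated_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical URLs; nothing is sent here, the request is only built.
    ping = build_publish_ping("https://hub.example/", "https://blog.example/feed.xml")
    print(ping.get_method(), ping.full_url, ping.data.decode("ascii"))
```

If every hub accepted this request, switching hubs would only mean passing a different `hub_url`.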

pfefferle commented 9 years ago

Version 0.4 intentionally doesn't specify the publisher process. It is up to the hub to define this process:

> The publisher MUST inform the hubs it previously designated when a topic has been updated. The hub and the publisher can agree on any mechanism, as long as the hub is eventually able send the updated payload to the subscribers.

http://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#rfc.section.6

cweiske commented 9 years ago

This is what needs to be fixed. Leaving it up to the hub leads to the problem I mentioned: it is not possible for publishers to switch hubs without modifications to the code.

aaronpk commented 9 years ago

I agree with @cweiske on this one. It's impossible to switch hubs with no code change unless there is a standard for how publishers notify hubs.

Related: what is the reason the first two hubs use "hub.url" as the name for the URL that is updated? It would seem to make more sense to use "hub.topic", as that is the name of the parameter that subscribers use to subscribe. In my opinion, the spec should require that publishers send hub.mode=publish and hub.topic={{topic url}} for consistency.

julien51 commented 9 years ago

> It is crucial that this is specified, too - otherwise it is not possible for publishers to switch hubs without modifications to the code.

But even if that happens, there is a change of configuration/code to be made by the publisher... so I'm not sure this needs to be part of the spec.

Generally, I believe the spec should be about one single concern: how the subscriber gets content from a resource they care about.

Let's say subscriber S is able to get content from publisher P via hub H1 using the current PubSubHubbub protocol. If P switches to H2 but uses a different mechanism to notify it, nothing changes for S. Then why would the P<->H1/H2 relationship be specified (since it does not affect the end result)?

pfefferle commented 9 years ago

> But even if that happens, there is a change of configuration/code to be made by the publisher... so I'm not sure this needs to be part of the spec.

But the changes are not that significant. The WordPress plugin pings all hubs the same way and that makes it very easy to implement.

What about a default way to ping hubs and a message that other ways are allowed/welcome?

voxpelli commented 9 years ago

+1 on defining a default way – it doesn't even have to be the default way – just a way.

Currently it is very easy to switch from Google's hub to the Superfeedr hub or Aaron's hub because they all use the 0.3 publisher notifications – one just changes the URL of the hub and one is done. I think it's beneficial to PubSubHubbub to keep it that way, but it's unfortunate that one currently has to rely on the old 0.3 spec to implement it.

A solution that could solve both @julien51's concern and the wish for a defined publisher<->hub relation could be to break out the 0.3 publisher notification part into a new spec that lives side by side with the current 0.4 spec and handles the publisher<->hub relation while leaving the current 0.4 spec focused on the hub<->subscriber relation?

aaronpk commented 9 years ago

Frankly, not having the publisher->hub payload in the 0.4 spec feels like a failure of a spec, similar to how OAuth 2.0 core is considered a failure. Having a separate spec for the publisher->hub payload also points to a failure, which is exactly what OAuth 2.0 is doing now.

There's nothing wrong with allowing hubs to offer more functionality for publishers, but there absolutely needs to be a common payload. Being able to change hubs by simply changing the hub URL advertised in the link header is critical to a successful standard.

julien51 commented 9 years ago

Well, let's think of it another way: if we define/specify THE WAY™ publishers and subscribers should interact, would you say that hubs and publishers who do not use that specified mechanism are not PuSH-compliant?

I agree interop is key and important, but making the spec arbitrarily strict (when it's not needed) is not going to make more people adopt it.

aaronpk commented 9 years ago

There's no point in adopting a spec that doesn't actually tell you what to do, other than to have the official stamp of approval of that spec. Right now, any number of completely incompatible implementations can call themselves OAuth 2.0, but is that really a success? Is that really what we're shooting for?

At a minimum, I would like the PuSH spec to define one way publishers can notify hubs (hub.mode=publish&hub.topic=x), and in order for a hub to be PuSH-compliant it MUST support at least that way. Hubs should be welcome to support additional methods, such as the two ways already implemented by two different hubs (multiple topic URLs by adding additional hub.url parameters, and wildcard topic URLs).
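
To make the wire format concrete, the following Python sketch builds the request body for the proposed baseline (hub.mode=publish&hub.topic=x) and for the "multiple hub.url parameters" variant mentioned above. The helper name `publish_body` and the example URLs are hypothetical; wildcard topic URLs are left out because their format is not described in this thread.

```python
from typing import Sequence
from urllib.parse import urlencode


def publish_body(topic_urls: Sequence[str], param: str = "hub.topic") -> str:
    """Form-encode a publish notification; `param` is repeated once per URL."""
    pairs = [("hub.mode", "publish")] + [(param, url) for url in topic_urls]
    return urlencode(pairs)


if __name__ == "__main__":
    # Proposed baseline: a single hub.topic parameter.
    print(publish_body(["x"]))
    # Variant: several topic URLs sent as repeated hub.url parameters.
    print(publish_body(
        ["https://a.example/feed", "https://b.example/feed"], param="hub.url"
    ))
```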

pfefferle commented 9 years ago

I think this is becoming a debate on principles. I agree with @aaronpk: a protocol should be as easy as possible, and therefore it should define an easy and simple way for subscribing and publishing. All other ways should be optional.

pfefferle commented 9 years ago

From the publisher's perspective, it is only possible to implement a generic publisher if there is a generic way to publish updates. And the WordPress plugin is only that easy to implement/run because there is a generic way in v0.3. If hubs implement different ways, I have to update the plugin for every single hub, and that can't/shouldn't be the expected case.

andyleap commented 9 years ago

There needs to be a separate portion of the spec or something, cause I don't like the idea of being told that my site isn't PuSH 0.4 compatible cause my hub doesn't take a specific pub request (my hub is built in and my publish is sent via function call)

julien51 commented 9 years ago

What do we do for publishers who are their own hubs, like WordPress.com? It probably does not make a lot of sense for them to implement the one way required for them to be compliant. Again, we're adding complexity where it does not belong. We're just creating second-class citizens for the sake of specifying everything.

When a publisher picks its hub, the way it will have to ping the hub is clearly one of the things to ponder, as is HTTPS support (not specified either, BTW), etc. And since the publisher chooses its hub, it can always choose another one at its own discretion.

voxpelli commented 9 years ago

So to be clear about what @julien51 and @andyleap say – there are two scenarios:

  1. Publisher and Hub are contained in the same application – how the publishing part of that app tells the hub part of it is entirely up to that app. See e.g. https://wordpress.org/plugins/pushpress/ and @andyleap's use case
  2. Publisher and Hub are contained in two separate applications – then it would probably be preferable if they used a standard mechanism to communicate between them so that they are nicely decoupled from each other

The scenario that a standard solution is requested for here is 2. – right? But we can all agree that any solution to that shouldn't exclude the scenario of 1. to still be valid – right?

I'm pro a recommended mechanism for solving 2. – but I also very much want 1. to be a valid implementation as well.
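
One way to picture the two scenarios side by side is the Python sketch below. Everything in it is hypothetical (the class names, the `notify` method, and the injected `post` callable are not taken from any hub or plugin); it only illustrates that a publisher can treat a built-in hub and a remote hub as interchangeable, with the wire format mattering only in scenario 2. The hub.url parameter follows the 0.3-style notification mentioned earlier in the thread.

```python
from typing import Callable, List, Tuple
from urllib.parse import urlencode


class InProcessNotifier:
    """Scenario 1: publisher and hub share one application, so a plain call suffices."""

    def __init__(self, on_update: Callable[[str], None]) -> None:
        self._on_update = on_update

    def notify(self, topic_url: str) -> None:
        self._on_update(topic_url)


class HttpNotifier:
    """Scenario 2: separate applications talking over a shared wire format."""

    def __init__(self, hub_url: str, post: Callable[[str, bytes], None]) -> None:
        self._hub_url = hub_url
        self._post = post  # hypothetical transport: post(url, form_encoded_body)

    def notify(self, topic_url: str) -> None:
        body = urlencode({"hub.mode": "publish", "hub.url": topic_url})
        self._post(self._hub_url, body.encode("ascii"))


if __name__ == "__main__":
    seen: List[str] = []
    sent: List[Tuple[str, bytes]] = []
    notifiers = [
        InProcessNotifier(seen.append),
        HttpNotifier("https://hub.example/", lambda url, body: sent.append((url, body))),
    ]
    # The publishing code is identical for both scenarios.
    for notifier in notifiers:
        notifier.notify("https://blog.example/feed.xml")
    print(seen, sent)
```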

julien51 commented 9 years ago

Back to the root, I'd rather have a discussion on what's the greater goal of the protocol. For me:

The only thing that matters to me is that subscribers are able to get data from any publishers using the same mechanism. It should do that, and just that.

It's irrelevant that it does or does not do anything else.

andyleap commented 9 years ago

I would be happy with something like:

A hub SHOULD (or possibly MAY) support publish requests from the publisher over HTTP or HTTPS. If a hub supports HTTP(S) publish requests, it MUST support at least the following parameters:

hub.mode: REQUIRED. The string "publish" to signify a publish request.

hub.topic: REQUIRED. The topic URL that the subscriber wishes to subscribe to or unsubscribe from.
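
For hub implementers, the check implied by this wording might look roughly like the Python sketch below. It assumes a form-encoded request body; the function and exception names are hypothetical, and requiring exactly one hub.topic value is an illustrative choice of this sketch, not something the proposal states.

```python
from urllib.parse import parse_qs


class PublishRequestError(ValueError):
    """Raised when a publish request lacks a REQUIRED parameter."""


def parse_publish_request(form_body: str) -> str:
    """Validate hub.mode and hub.topic, then return the topic URL."""
    params = parse_qs(form_body, keep_blank_values=True)
    if params.get("hub.mode") != ["publish"]:
        raise PublishRequestError('hub.mode is REQUIRED and must be "publish"')
    topics = params.get("hub.topic", [])
    # Assumption of this sketch: exactly one non-empty hub.topic value.
    if len(topics) != 1 or not topics[0]:
        raise PublishRequestError("hub.topic is REQUIRED")
    return topics[0]


if __name__ == "__main__":
    body = "hub.mode=publish&hub.topic=https%3A%2F%2Fblog.example%2Ffeed.xml"
    print(parse_publish_request(body))
```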

pfefferle commented 9 years ago

I thought PubSubHubbub's initial plan was to make it easy to publish and to subscribe; that's why all the magic stuff is done by the hub.

@julien51 I don't think wordpress.com is a good example, because they don't have a public publisher-API, but perhaps exactly that might be the compromise:

kylewm commented 9 years ago

My 2¢: I tend to agree with @julien51. If some implementations can choose to ignore the proposed additions to the spec and still work perfectly well with all consumers, then the additions don't seem necessary. By adding arbitrary restrictions, you risk confusing readers/implementers about what is and isn't required for a functional hub.

aaronpk commented 9 years ago

This really is just pointing back to the question of what is the original goal of this spec. If the goal is to describe a way for consumers to subscribe to content, then yes, publisher->hub is not needed as part of the spec. Since that's what @julien51 described, then it sounds like the best course of action is to document the publisher->hub interaction separately, with the goal being to make it possible for publishers to use generic hubs.