We've requested a very similar feature: https://github.com/pulp/pulpcore/issues/4785
@daviddavis would kafka messages suffice?
@dkliban I think we were kind of hoping for something similar to how Pulp handles signing services. I think such a feature would be more flexible for users who could use it to create notifications (or whatever else they want). That said, it would be more work for users who only want notifications.
If Pulp was set on notifications, I wonder if maybe redis (or I guess valkey) would be an option? Especially since it's already part of the Pulp stack. For Kafka, I do see there is a service that Azure provides that maybe we could use. I don't know anything about it, but I could dig into it more if you all were thinking of only supporting it.
@daviddavis I agree that calling into a script provided by the administrator would be the most flexible. We could then provide some example scripts in our docs that perform calls to web servers and send notifications.
Hmm, I think I disagree that calling a script in a subprocess (like we do for SigningServices) is the "right" thing to do here. Event / messaging services are the modern solution that exists to fill this gap.
On the other hand, if you do just add a script-runner we could just write a 10-line script to send a message where we need it to go. Maybe that does make sense from a minimal-dependency perspective. But there's a not-insignificant amount of code in Pulp for registering / using SigningServices, and you would probably have to duplicate a lot of it to create something similar for a publication-notification script, so you may be making it harder on yourselves than just plugging in to a standard solution. Or maybe integrating with a Kafka-esque service would make sense if you had a larger use-case for it than just publication notifications.
To expand on the previous comments:
A Pulp administrator would be provided with a `pulpcore-manager` command to create a `Notifier` (or some other name). The notifier would mostly be a path to a script that each worker and API process would have access to. The script would then perform any kind of action needed. The script would be provided with a list of environment variables; some I can think of are `REPOSITORY_NAME`, `TASK_TYPE`, and `STATE`.
To start with, the docs could provide an example script that POSTs the data to a web server.
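A minimal sketch of what such a script could look like, assuming the environment variables suggested above and a placeholder endpoint URL (not a settled interface):

```python
#!/usr/bin/env python3
# Hypothetical notifier script: POST the task data that Pulp would pass in
# via environment variables (names as suggested above) to a web server.
import os

import requests

# Placeholder endpoint; a real deployment would configure its own.
NOTIFY_URL = "https://notifications.example.com/pulp"

payload = {
    "repository_name": os.environ.get("REPOSITORY_NAME"),
    "task_type": os.environ.get("TASK_TYPE"),
    "state": os.environ.get("STATE"),
}

response = requests.post(NOTIFY_URL, json=payload, timeout=10)
response.raise_for_status()
```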
For your use case @kahowell we could write a script that produces kafka messages. What do you think?
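A sketch of that variant, assuming the kafka-python client and placeholder broker/topic names:

```python
#!/usr/bin/env python3
# Hypothetical notifier script: publish the task data as a JSON message to Kafka.
import json
import os

from kafka import KafkaProducer  # kafka-python; any Kafka client would work

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],  # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send(
    "pulp.notifications",  # placeholder topic
    {
        "repository_name": os.environ.get("REPOSITORY_NAME"),
        "task_type": os.environ.get("TASK_TYPE"),
        "state": os.environ.get("STATE"),
    },
)
producer.flush()
```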
@sdherr You make a good point about the cost of implementation. There is a lot of boilerplate code that would be duplicated. Let's integrate directly with Kafka. A few places where I think a notification is appropriate are Repository Version created, Publication created, Distribution created, Distribution updated, Distribution deleted.
That sounds good. Is the idea to make this generic enough so that, in theory, other tasks/events could be added eventually? Also, I think you left out one of the events that @kahowell requested (export created).
As for fields, I think ideally the message should have task id/href so that we can query the API if there is some data that we need that wasn't passed in the message.
The messages could just be integrated into the tasking system, so a message is emitted when a task finishes. The message would have the task href, task type, state, and the resources associated with it.
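For illustration only, a task-completion message along those lines might look like this (field names and values are hypothetical, not a proposed schema):

```json
{
  "pulp_href": "/pulp/api/v3/tasks/<task-uuid>/",
  "task_type": "pulpcore.app.tasks.repository.add_and_remove",
  "state": "completed",
  "created_resources": ["/pulp/api/v3/repositories/rpm/rpm/<repo-uuid>/versions/1/"]
}
```

Anything not included could then be fetched from the API via the task href, as mentioned above.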
Webhooks and Websockets come to my mind too.
So, something generic enough that you could plug in any sort of "connector" here: a Kafka publisher, Webhooks and Websockets, cloud notification services, and so on.
Or we just feed Kafka and all the others are created as consumers of that.
Is your feature request related to a problem? Please describe.
As a service that integrates with Pulp, I'd like to be notified when there is any change to the availability of content.
Describe the solution you'd like
I'd like to be able to subscribe to machine-readable notifications. I expect these to come across via a messaging protocol (e.g. AMQP or Kafka). Ideally, the solution should be messaging protocol agnostic, so that different deployment options can be supported. (I'd personally recommend Kafka as a baseline).
There should be a versioned schema of the notification messages. (I'd personally recommend jsonschema written in YAML format for simplicity).
For interoperability, CloudEvents in structured mode should be used as the message format.
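For example, a structured-mode CloudEvents envelope for a publication-created notification might look roughly like this (the event type, source, and data fields are illustrative, not a proposed schema):

```json
{
  "specversion": "1.0",
  "type": "org.pulpproject.publication.created",
  "source": "/pulp/api/v3/",
  "id": "3f9c1a2e-example-id",
  "time": "2024-01-01T00:00:00Z",
  "datacontenttype": "application/json",
  "data": {
    "pulp_href": "/pulp/api/v3/publications/rpm/rpm/<uuid>/",
    "repository_version": "/pulp/api/v3/repositories/rpm/rpm/<uuid>/versions/1/"
  }
}
```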
Describe alternatives you've considered