pulp / pulpcore

Pulp 3 pulpcore package https://pypi.org/project/pulpcore/
GNU General Public License v2.0

Export/Publish notifications #5337

Closed kahowell closed 4 months ago

kahowell commented 6 months ago

Is your feature request related to a problem? Please describe. As a service that integrates with Pulp, I'd like to be notified when there is any change to the availability of content.

Describe the solution you'd like

I'd like to be able to subscribe to machine-readable notifications. I expect these to come across via a messaging protocol (e.g. AMQP or Kafka). Ideally, the solution should be messaging protocol agnostic, so that different deployment options can be supported. (I'd personally recommend Kafka as a baseline).

There should be a versioned schema of the notification messages. (I'd personally recommend jsonschema written in YAML format for simplicity).

For interoperability, cloud events in structured mode should be used as the message format.
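For illustration, a structured-mode CloudEvent (spec v1.0) for a hypothetical "publication created" notification could be built like this. The `type` name, `source` URI, and `data` fields are all made up for the example; nothing here reflects an agreed-upon Pulp schema:

```python
import json
import uuid
from datetime import datetime, timezone

def make_publication_event(publication_href: str, repository_name: str) -> dict:
    """Build a structured-mode CloudEvent (spec v1.0) as a plain dict.

    The ``type`` value and the ``data`` fields are hypothetical; this
    issue has not settled on an actual event schema.
    """
    return {
        # Required CloudEvents context attributes
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/pulp/api/v3",  # assumed source URI
        "type": "org.pulpproject.publication.created",  # made-up type name
        # Optional attributes
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # Event payload
        "data": {
            "publication_href": publication_href,
            "repository_name": repository_name,
        },
    }

event = make_publication_event("/pulp/api/v3/publications/file/file/018f0000-0000-7000-8000-000000000000/", "myrepo")
# In structured mode, the whole envelope is serialized as the message body:
body = json.dumps(event)
```

In structured mode the context attributes and the payload travel together in the message body, which keeps the event self-describing regardless of which messaging protocol carries it.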

Describe alternatives you've considered

daviddavis commented 6 months ago

We've requested a very similar feature: https://github.com/pulp/pulpcore/issues/4785

dkliban commented 6 months ago

@daviddavis would kafka messages suffice?

daviddavis commented 6 months ago

@dkliban I think we were kind of hoping for something similar to how Pulp handles signing services. I think such a feature would be more flexible for users who could use it to create notifications (or whatever else they want). That said, it would be more work for users who only want notifications.

If Pulp was set on notifications, I wonder if maybe redis (or I guess valkey) would be an option? Especially since it's already part of the Pulp stack. For Kafka, I do see there is a service that Azure provides that we could maybe use. I don't know anything about it, but I could dig into it more if you all were thinking of only supporting it.

dkliban commented 6 months ago

@daviddavis I agree that calling into a script provided by the administrator would be the most flexible. We could then provide in our docs some example scripts that make calls to web servers and send notifications.

sdherr commented 6 months ago

Hmm, I think I disagree that calling a script in a subprocess (like we do for SigningServices) is the "right" thing to do here. Event / messaging services are the modern solution that exists to fill this gap.

On the other hand, if you do just add a script-runner, we could just write a 10-line script to send a message where we need it to go. Maybe that does make sense from a minimal-dependency perspective. But there's a not-insignificant amount of code in Pulp for registering / using SigningServices, and you would probably have to duplicate a lot of it to create something similar for a publication-notification script, so you may be making it harder on yourselves than just plugging in to a standard solution. Or maybe integrating with a Kafka-esque service would make sense if you had a larger use case for it than just publication notifications.

dkliban commented 6 months ago

To expand on the previous comments:

A Pulp administrator would be provided with a pulpcore-manager command to create a Notifier (or some other name). The notifier would mostly be a path to a script that each worker and API process would have access to. The script would then perform any kind of action needed. The script would be provided with a set of environment variables; some I can think of are REPOSITORY_NAME, TASK_TYPE, and STATE.

To start with, the docs could provide an example script that POSTs the data to a web server.
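As a sketch of what such an example script could look like: the environment variable names follow the ones suggested above (they are not an actual Pulp interface), and the endpoint URL is a placeholder.

```python
#!/usr/bin/env python3
"""Hypothetical notifier script: POST task data to a web server.

Pulp would invoke this with environment variables set; the variable
names (REPOSITORY_NAME, TASK_TYPE, STATE) follow the suggestion in
this thread and are not an actual Pulp interface.
"""
import json
import os
import urllib.request

def build_payload(env) -> dict:
    """Collect the notification fields Pulp is assumed to export."""
    return {
        "repository_name": env.get("REPOSITORY_NAME", ""),
        "task_type": env.get("TASK_TYPE", ""),
        "state": env.get("STATE", ""),
    }

def main() -> None:
    """Entry point when Pulp invokes the script."""
    payload = build_payload(os.environ)
    request = urllib.request.Request(
        "https://notifications.example.com/pulp",  # placeholder endpoint
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()
```

Being stdlib-only keeps such a script easy to drop onto any worker host without extra dependencies.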

For your use case @kahowell we could write a script that produces kafka messages. What do you think?

dkliban commented 6 months ago

@sdherr You make a good point about the cost of implementation. There is a lot of boilerplate code that would be duplicated. Let's integrate directly with Kafka. A few places where I think a notification is appropriate are Repository Version created, Publication created, Distribution created, Distribution updated, Distribution deleted.
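A direct Kafka integration could be sketched along these lines. Everything here is an assumption for illustration, not the eventual Pulp design: the topic name and message shape are invented, and kafka-python is just one possible client library.

```python
import json

# Hypothetical topic name; nothing in Pulp defines this yet.
TOPIC = "pulpcore.notifications"

def build_message(event_type: str, href: str) -> bytes:
    """Serialize one of the proposed events (repository-version-created,
    publication-created, distribution-created/updated/deleted) as a JSON
    message body. The field names are placeholders."""
    return json.dumps({"type": event_type, "href": href}).encode()

def publish(event_type: str, href: str, bootstrap: str = "localhost:9092") -> None:
    """Send one event to Kafka. kafka-python is an assumed dependency here;
    confluent-kafka or another client would work just as well."""
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap)
    producer.send(TOPIC, build_message(event_type, href))
    producer.flush()
```

A single topic keyed by event type keeps consumers simple; whether Pulp would want one topic or one per event type is exactly the kind of design question this issue leaves open.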

daviddavis commented 6 months ago

That sounds good. Is the idea to make this generic enough so that in theory, other tasks/events could be added eventually? Also, I think you left out one of the events that @kahowell requested (export create).

As for fields, I think ideally the message should have task id/href so that we can query the API if there is some data that we need that wasn't passed in the message.

dkliban commented 6 months ago

The messages could just be integrated into the tasking system, so that when a task finishes, a message is emitted. The message would have the task href, task type, state, and the resources associated with it.
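A task-completion message along those lines might carry fields like the following; the field names mirror Pulp's task API fields, but the message shape itself is only a sketch of the idea above:

```python
import json

def task_finished_message(task: dict) -> bytes:
    """Build the proposed task-completion notification: task href, task
    type, final state, and the resources the task reserved. The output
    field names are placeholders, not a settled schema."""
    return json.dumps({
        "pulp_href": task["pulp_href"],
        "task_type": task["name"],
        "state": task["state"],
        "reserved_resources": task.get("reserved_resources_record", []),
    }).encode()

example = task_finished_message({
    "pulp_href": "/pulp/api/v3/tasks/018f0000-0000-7000-8000-000000000000/",
    "name": "pulpcore.app.tasks.repository.add_and_remove",
    "state": "completed",
    "reserved_resources_record": ["/pulp/api/v3/repositories/file/file/018f0000-0000-7000-8000-000000000001/"],
})
```

Including the task href means a consumer that needs more detail can always follow it back to the API, as suggested earlier in the thread.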

mdellweg commented 6 months ago

Webhooks and Websockets come to my mind too.

decko commented 6 months ago

So, something generic enough that you could plug in any sort of "connector" here: a Kafka publisher, Webhooks and Websockets, cloud notification services, and so on.

mdellweg commented 6 months ago

Or we just feed Kafka and all the others are created as consumers of that.