topcoder-platform / tc-project-service

16 stars 55 forks source link

Unify Kafka messages payloads #472

Open maxceem opened 4 years ago

maxceem commented 4 years ago

Since V5 we have unified Kafka topics to always have the next format:

So far the payload which we send for each of the topics is not unified, but in most cases the payload looks like this:

"payload": {
   "resource": "project.template" // the name of "resource", one of https://github.com/topcoder-platform/tc-project-service/blob/feature/restful-invites/src/constants.js#L326-L347

   // other properties of the object
   "id": 1234,
   "name": "Name of project template",
   "key": "app",
   "createdAt": 1,
   "createdBy": 1,
   "updatedAt": 1,
   "updatedBy": 1,
}

As we want to stop sending specific Kafka events to TC Notification Service we would like to use these general V5 Kafka events instead, for generating notifications.

Though, to be able to generate some notification we have to know the object data before update and after update. So far it's handled inside Project Service using App Events, see:

So we have to update Kafka messages payload to include both original and updated object in project.action.update Kafka topic payloads.

I propose to use the next unified rules for all 3 topics:

{
  "topic": "project.action.update",
  "payload": {
     "resource": "project.template" // the name of resource, same like now
     "data": { ... } // updated object
     "previousData": {} // previous object
   }
{
  "topic": "project.action.create",
  "payload": {
     "resource": "project.template" // the name of resource, same like now
     "data": { ... } // created object
   }
{
  "topic": "project.action.delete",
  "payload": {
     "resource": "project.template" // the name of resource, same like now
     "data": { ... } // deleted object
   }

It was clarified during V5 migration, that we can format the payload as we need:

image

Note, that we would have to update project-processor-es and legacy-project-processor to consume Kafka messages in this new unified format, though I don't think any problems with this.

vikasrohit commented 4 years ago

@maxceem I am not sure if I fully understand the reason why we need both original and updated data in tc-notifications. The code snippets you shared shows the we had the original and updated resource in tc-project-service it self and using that to determine few conditions to generate additional events for single base event.

maxceem commented 4 years ago

@vikasrohit as I understand from https://github.com/appirio-tech/connect-app/issues/3508#issuecomment-581299520 and https://github.com/appirio-tech/connect-app/issues/3508#issuecomment-581333194 TC Notification Service should stop consuming our special events like these https://github.com/topcoder-platform/tc-project-service/blob/hotifx/metadata-index/src/constants.js#L176-L246.

Instead of such special events, TC Notification Service should consume our new V5 events:

Though TC Notification Service should still generate the same notifications as it does now. For example, TC Notification Service should create a notification connect.notification.project.updated.spec if we modify the spec of the project. So far for this purpose inside Project Service we check if project.details has been changed and then send a special Kafka event connect.notification.project.updated.spec which is consumed by TC Notification Service.

Now we should stop sending such an event, and we can only send project.action.update when the project is updated. But so far, in the project.action.update event we only send updated project, so inside TC Notification Service we don't know if the project scope has been changed or no. So we cannot generate notification about this event. To do so, event project.action.update should have 2 versions of project before and after modification, so inside TC Notification Service we can understand that project scope has been changed and generate connect.notification.project.updated.spec notification.

vikasrohit commented 4 years ago

Understood @maxceem Thanks for clarifications. I understand the need of accessing previous data in event receiver and in fact that is very logical to have old and updated data for the update events which I think we were doing at some places previously but I guess with v5 introduction we lost them in favour of v5 standards.

I am okay with the new event payload structure except one thing, how would be handle the backward compatibility. I mean how the system would react to the already raised events. I guess, this should not be much of a problem because the actual notification events are different and probably would be different than these generic events, right?

maxceem commented 4 years ago

I mean how the system would react to the already raised events. I guess, this should not be much of a problem because the actual notification events are different and probably would be different than these generic events, right?

@vikasrohit I think we should remove supporting of the old events in Notification Service. As we only generate such events in Project Service so fully control them, we can deploy the next way:

NOTE. Actually TC Notification service also consumes events from Message Service, so we probably also have to update events in Message Service so they follow V5 standard.

vikasrohit commented 4 years ago

Yep, agreed. I am not much worried about the events that are in transition when we upgrade project service to generate new events. However, I just want to make sure that we are not using those events in rendering any notifications so that we don't miss showing any notification of a user after the upgrade which I think is not the case because we use different non v5 standard events for actual notifications.

maxceem commented 4 years ago

However, I just want to make sure that we are not using those events in rendering any notifications so that we don't miss showing any notification of a user after the upgrade which I think is not the case because we use different non v5 standard events for actual notifications.

Ok, I think I got the concern.

We indeed currently have a match between Kafka events that are sent by Project Service and notifications which we generate. In Notification Service we even use the name of Kafka events as a type of notifications https://github.com/topcoder-platform/tc-notifications/blob/dev/connect/events-config.js#L64. So when we would remove these non-V5 Kafka events we still would have to generate all the notifications which we did generate before. But instead of using special non-V5 Kafka events, we would use new V5 unified Kafka events.

So when before we had the next workflow:

Now we would have a new workflow:

Note, that for notification type connect.notification.project.created we would NOT have any Kafka topic anymore.

If all the above sounds good for you, I would start with a challenge spec for this.

vikasrohit commented 4 years ago

Note, that for notification type connect.notification.project.created we would have any Kafka topic anymore.

You meant we would **NOT** have any Kafka topic anymore., right?

maxceem commented 4 years ago

@vikasrohit right, updated comment.

vikasrohit commented 4 years ago

I am good with the solution now. I was wondering though, if we can some how (in future) support versioning of the payload schema. I mean lets say we say it as v1 with our current format and v2 to the new one you proposed @maxceem. Now, to give sufficient time to the event consumers to adapt to the new format, we can raise both events in both formats and then deprecate and finally remove old version after some time. Also, can we just alter the payload for the update events to have original and updated fields to store previous and updated data? That would limit the number of places we have to adapt the change in format. Do you think it is logical, as long as we have it documented, to have original and updated only for update events?

maxceem commented 4 years ago

I am good with the solution now. I was wondering though, if we can some how (in future) support versioning of the payload schema. I mean lets say we say it as v1 with our current format and v2 to the new one you proposed @maxceem. Now, to give sufficient time to the event consumers to adapt to the new format, we can raise both events in both formats and then deprecate and finally remove old version after some time.

I think we would be able to do that after we migrate the payload format proposed in this ticket https://github.com/topcoder-platform/tc-project-service/issues/472#issue-563832076. As soon as we move out actual document data from payload to a nested property like data or origin, we can add additional properties to the payload and they would never conflict with the document data. So we can have payload with version like this:

payload: {
   version: 2,
   data: {},
   resource: "project",
   newProperty: "..."
}

Theoretically, we could also add version outside the payload, but it's not supported by the current Topocder Standard for Kafka topics, so we have to keep inside the payload.

Also, can we just alter the payload for the update events to have original and updated fields to store previous and updated data? That would limit the number of places we have to adapt the change in format. Do you think it is logical, as long as we have it documented, to have original and updated only for update events?

I've been also thinking about it for a while and finally in favor of having a common data field for all the events + additional field previousData for updated events. The reason is that in most places of updated event we would only need "updated" data and don't need "original". For example, project-processor-es handles all kind events create, updated, delete in a unified way and we handle 18 kinds of objects here. While we only have to change 6 "update" handlers for Tc Nofitcations to use data and previousDate instead of updated and origin.

Though the general reason is not the amount of work now to update the code, it just feels using "data" field across all field types would be more unified way, and code to handle such events could be also more universal, rather than if "update" events would have a different payload.

vikasrohit commented 4 years ago

Theoretically, we could also add version outside the payload, but it's not supported by the current Topocder Standard for Kafka topics, so we have to keep inside the payload.

I am in favour of this as it gives the listener opportunity to detect version of the payload without parsing the payload. Lets create a issue for it and continue thinking on it.

For the use of data as standard field for all v5 events, I am also not able to think any specific reason for using original and updated for update events only except the less number of changes required for now. Lets go with the schema you proposed.

maxceem commented 4 years ago

Follow-up issue for supporting "version" https://github.com/topcoder-platform/tc-project-service/issues/476

vikasrohit commented 4 years ago

@maxceem can we close this one? If yes, when do you think we released the changes 2.3 or 2.4?

maxceem commented 4 years ago

@vikasrohit it's not yet implemented.