Extract "meta" as a type

sselberg commented 3 years ago

Description

All events have a copy/paste duck-type-equal meta property. Extract this as a specific type.

Motivation

Limits code-duplication and limits event-description complexity.

Exemplification

Benefits

If there needs to be a change in the meta property you'll only need to change it in one place. Models of eiffel-events could use a type-safe approach without violating the specification. The (potential) various versions of the meta-object become evident.

Possible Drawbacks

e-backmark-ericsson commented 3 years ago

Actually it was that way some years ago, but the definitions of meta (and links) were decentralised based on this issue: https://github.com/eiffel-community/eiffel/issues/136. I'm not saying we could not re-evaluate that decision, but please look into that issue and the corresponding PR explaining the reason for decentralising it.

sselberg commented 3 years ago

The reasoning seems to have been that the meta objects could possibly diverge, which would cause problems. This hasn't happened in the four years since; the meta objects are still identical.

In the previous iteration of a shared meta object the meta object wasn't versioned; in my version it is. So in the event that the meta object would diverge between events (however unlikely) it would be apparent which version of the meta object belonged to which event.

There's an overarching problem with the current Eiffel event specification's reluctance to reuse de-facto identical objects, since in strongly typed contexts there can only be one Meta object. The alternative to reusing the object would be to specify that it's not the same object by not reusing "meta" as the property name.

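// Event-specific naming, to signal that this is not a shared "meta" object.
class EiffelSourceChangeCreatedEvent {
   SourceChangeCreatedEventMeta sourceChangeCreatedEventMeta;
}

class SourceChangeCreatedEventMeta { /* same fields as every other meta today */ }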

The problem becomes apparent when you need to rename properties in strongly typed contexts.
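To make the two modelling options concrete, here is a minimal Java sketch (records, so Java 16 or later). It assumes a heavily simplified meta with only id, type, version and time; every class name is hypothetical and not taken from any existing Eiffel library.

```java
// Option 1: one shared Meta type, reused by every event (simplified fields).
record Meta(String id, String type, String version, long time) {}

record ArtifactCreatedEvent(Meta meta, String identity) {}

record SourceChangeCreatedEvent(Meta meta, String gitCommitId) {}

// Option 2: the event owns a distinct meta type. It has to be named
// differently even though its fields are identical to Meta.
record SourceChangeCreatedEventMeta(String id, String type, String version, long time) {}

record IsolatedSourceChangeCreatedEvent(SourceChangeCreatedEventMeta sourceChangeCreatedEventMeta) {}

final class MetaSketch {
    // Works for the meta of any event that uses the shared type.
    static String describe(Meta meta) {
        return meta.type() + "@" + meta.version() + " (" + meta.id() + ")";
    }

    public static void main(String[] args) {
        Meta meta = new Meta("id-1", "EiffelArtifactCreatedEvent", "3.0.0", 0L);
        ArtifactCreatedEvent event = new ArtifactCreatedEvent(meta, "pkg:maven/com.example/app@1.0.0");
        System.out.println(describe(event.meta()));
        // Passing a SourceChangeCreatedEventMeta to describe() would not compile:
        // the compiler treats it as unrelated to Meta despite the identical fields.
    }
}
```

With the shared type, helper code such as describe is written once. With option 2, each event-specific meta type needs its own helper or some bridging mechanism.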

The meta seemed like an obvious place to start since it has been identical for all Events for the past 4 years. From my perspective it would be easier to cross the bridge of diverging meta objects when/if we come to it.

WDYT?

sselberg commented 3 years ago

The bottom line is:

The commit removes 5000 lines of duplicated code (that hasn't diverged in 4 years) without changing the de-facto specification.
This duplication is not limited to this specification, it is replicated by every event producer/consumer in every Eiffel-system around the globe.

From my perspective it would be very beneficial if the Eiffel specification went the other way and stated that the "meta" object was not supposed to diverge between events and that event-specific data belongs in the "data" object. This would be a good step in the direction of making the adoption of Eiffel a lot smoother. IMO event-specific meta-data is not really meta-data but actual data.

Another drawback of the current situation is that since the properties are named the same ("meta") it implies inheritance (this object has type meta) which is dishonest if they are totally separate objects with no similarity other than that they happen to be formatted the exact same way.

d-stahl-ericsson commented 3 years ago

Great input! There's no obvious right or wrong here, it's a trade-off. The fact that it hasn't changed for four years is a good point - except that it is wrong :) The meta format was last updated in late 2018. But let us not split hairs - it has so far proven to be fairly stable, and may well continue to be fairly stable in the future.

The argument of dishonesty I think misses the point, though. Each event type is independent from every other event type. It constitutes its own namespace, if you will, so there is no naming collision per se.

I'm not sure how moving fields from .meta to .data would solve the problem of boilerplate, though. Supposedly the same fields would need to be represented, and unless we want to extract them (and cause dependencies between versions of events and versions of data elements common to multiple events) we would be moving the boilerplate from one section to the other. But I'm not sure you're suggesting that.

The proposal of creating versioned objects containing shared data fields would work. I'm not sure it's beneficial, though. It's all about whether you want to manage some copy-pasted information or if you want to manage some dependencies. But let's be clear that documentation and implementation are two separate things: writing an event serializer for a specified set of events, I would obviously represent .meta as a shared type.

sselberg commented 3 years ago

"The fact that it hasn't changed for four years is a good point - except that it is wrong" There's a difference between changing and diverging: I said that the meta object hasn't diverged (implicitly "... between the various events") in 4 years, which is true :-)

I don't know that I suggested that we move fields from meta to data. What I was trying to convey was that, if there's a field that would differ between events, you could take the standpoint that it belongs in the data object, since the data object is event-specific and meta is not, if one were to take my suggested approach and claim that all events share the same meta object.

"The argument of dishonesty I think misses the point, though. Each event type is independent from every other event type. It constitutes its own namespace, if you will, so there is no naming collision per se." "But let's be clear that documentation and implementation are two separate things: writing an event serializer for a specified set of events, I would obviously represent .meta as a shared type."

How do you view this as documentation and not specification? Does this not specify how an event of a specific version is constituted? Implementation that does not follow specification is flawed IMO. If the meta object isn't shared in the specification it doesn't make sense to share it in any implementation, since it's then not the same object. But your comment makes it clear that you actually share my view that it is the same object, and to pretend anything else (by not specifying it as such) is dishonest IMHO.

I have set out to implement parts of the Eiffel specification and the lack of reuse struck me as rather awkward. To make matters worse, lack of reuse makes the specification a lot harder to read. With a shared meta, getting an understanding of a new event would only require reading the data and links properties; currently you have no choice but to read the meta property as well, as "it might differ from any other event". Furthermore, there are a lot of data properties that should be shared as well, such as "gitIdentifier" and "personIdent" (differently named, author etc., but essentially the same object). In fact, the only similarly named data properties that aren't actually the same are (IIRC) the outcome properties, which differ between the various .Completed events but are named the same.

d-stahl-ericsson commented 3 years ago

I think we could get lost in a rather philosophical conversation on what it means to "follow" a specification, and the difference between documentation, specification, protocol... if any. But let's avoid that, those are better saved for musing over a beer at some point.

To stay pragmatic: Yes, there are multiple data elements that are de facto identical, but not defined as shared entities. This was a trade-off decision made, fully recognizing that there are pros and cons:

The current system introduces copy-pasted boilerplate code. And we all know why that is a bad thing.
The alternative requires separate versioning and dependency management of shared data elements, which introduces its own complexity and risk of inconsistency.

I can personally see both sides of the matter, and have no strong opinion. Let's see if we can get some more input from the community?

sselberg commented 3 years ago

I'm not really invested in what to call it, but IMO this documentation specifies what Eiffel events look like. You might have a different idea of what it does, and I might be mistaken regarding the intentions of this documentation; it might be meant to do something entirely different. By your comment "But let's be clear that documentation and implementation are two separate things" you make it seem as though this is some sort of common knowledge, when my guess would be that a majority would agree that implementation is in most cases very much affected by documentation, and rightly so. This is also the biggest reason for me suggesting this change: I believe it would make the implementation of Eiffel a lot less painful. Furthermore, it would require a lot less effort to read and understand this documentation, which would somewhat lower the threshold for those who might be interested in starting to use or contribute to Eiffel. First impressions are lasting impressions.

magnusbaeck commented 3 years ago

I think it would be reasonable to extract the meta field into a type of its own. There are philosophical arguments to be made in either direction but I'll offer some thoughts from the perspective of Eiffel event serialization which I've had to deal with a fair amount recently (Java implementation in github.com/jenkinsci/eiffel-broadcaster-plugin and some so far unreleased Go code).

For an event publisher it doesn't really matter since you typically only need to support publishing events of the most recent version, i.e. you'll have your EiffelArtifactCreatedEvent type which references a Meta type which can be shared among all event types that you currently publish. However, when there's a need to deserialize events you'll likely need one type per major version of each event, e.g. EiffelArtifactCreatedEvent3 and EiffelArtifactCreatedEvent4, each including the fields from the most recent event with that major version.

Now, if the schema of meta changes between these major versions, what do you do? Introduce a new Meta type, sure, but how do you name it? The concrete events aren't versioned in lockstep so you can't talk about a Meta v5 that applies to all v5 events. Each serializer library author would have to make up their own naming strategy, or continue the duplication with type names like EiffelActivityStartedEventMeta (cf. EiffelActivityStartedEventMeta.java in eiffel-remrem-semantics).
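The naming problem can be sketched in Java as follows. This is a hypothetical consumer that must deserialize two major versions of one event; the type names MetaA and MetaB and the extra field are invented for illustration and do not reflect any actual change in the meta schema. It also assumes meta.version is always of the form "major.minor.patch".

```java
// Hypothetical: the meta schema as used by major version 3 of the event.
record MetaA(String id, String type, String version, long time) {}

// Hypothetical: a later meta schema with an invented extra field. There is no
// protocol-wide version number to put in the type name, because events are not
// versioned in lockstep, so each library has to pick its own naming scheme.
record MetaB(String id, String type, String version, long time, String inventedField) {}

// One type per supported major version of the event, as a consumer would need.
record EiffelArtifactCreatedEvent3(MetaA meta, String identity) {}

record EiffelArtifactCreatedEvent4(MetaB meta, String identity) {}

final class VersionedEventSketch {
    // Picks the type to deserialize into from the major part of meta.version.
    static Class<?> typeForVersion(String version) {
        int major = Integer.parseInt(version.substring(0, version.indexOf('.')));
        switch (major) {
            case 3: return EiffelArtifactCreatedEvent3.class;
            case 4: return EiffelArtifactCreatedEvent4.class;
            default: throw new IllegalArgumentException("Unsupported major version: " + major);
        }
    }
}
```

If the schema itself declared a versioned meta type, the names MetaA and MetaB could be derived from the schema instead of being invented by each library author.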

Related, how do you go about generating event types that work in a sane manner if the schema doesn't tell you that the meta fields of different events actually have the same type? With separate types, isn't there a risk of significant pain with typed languages if you e.g. want to write shared code for validating meta.security?
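As a rough illustration of that pain, the Java sketch below validates a simplified meta.security in both setups. The types are hypothetical and show only one field of the security object; the check itself (a non-blank author identity) is an arbitrary stand-in for real validation.

```java
// Simplified, hypothetical model of meta.security; only one field is shown.
record MetaSecurity(String authorIdentity) {}

// Shared-type variant: every event references the same SharedMeta type.
record SharedMeta(String id, MetaSecurity security) {}

// Separate-type variant: structurally identical, but unrelated to the compiler.
record ActivityStartedMeta(String id, MetaSecurity security) {}

record ActivityFinishedMeta(String id, MetaSecurity security) {}

final class SecurityChecks {
    private static boolean hasAuthor(MetaSecurity security) {
        return security != null
                && security.authorIdentity() != null
                && !security.authorIdentity().isBlank();
    }

    // With a shared type, this single method covers every event.
    static boolean hasAuthor(SharedMeta meta) {
        return hasAuthor(meta.security());
    }

    // With separate types, the same check needs one overload per event type
    // (or an interface, reflection, or code generation to bridge them).
    static boolean hasAuthor(ActivityStartedMeta meta) {
        return hasAuthor(meta.security());
    }

    static boolean hasAuthor(ActivityFinishedMeta meta) {
        return hasAuthor(meta.security());
    }
}
```

The number of overloads in the separate-type variant grows with the number of event types and supported major versions, which is the duplication a schema-level shared type would avoid.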

m-linner-ericsson commented 3 years ago

Sorry for my late reply...

I would prefer either to keep it as simple as possible, as we have it today with each event defining the needed meta information, or to go with a global meta definition, i.e. if you update the meta definition we would release a new version of the Eiffel protocol. Going with a global meta object breaks what we have today, but then we would explicitly say that the meta object is the same for all events.

If we have a separate meta object we could have the situation where ArtP has meta v1 and ArtC has meta v2. For me this introduces complexity in order to remove duplication, and then I would prefer more duplication over complexity (even though I understand the problem of "what is the difference in the meta object between these two events"). We either have "the version of the event contains all needed information" or "EventA has meta version X and EventB has meta version Y". In my opinion the introduced dependency complexity doesn't solve the problem, it only moves it. For me the decentralization change feels logical as we treat all object types the same.

When it comes to duplication, the link objects in ActC and ActF are the "same" (the description differs slightly). If we break out the meta object we will need to specify and explain why we extracted the meta object but not this link object. It might sound silly, but the current protocol has consistent rules and says that each event lives by itself. Creating a more complex structure requires more documentation.

If we think about the future, and if we ever want to put Eiffel on top of CloudEvents, how would meta look then? Would we have the meta part as an extension to the context attributes, adding them as an attribute extension, or do we want our metadata inside the event data?

sselberg commented 3 years ago

Sorry for my late reply...

"I would prefer either to keep it as simple as possible, as we have it today with each event defining the needed meta information, or to go with a global meta definition, i.e. if you update the meta definition we would release a new version of the Eiffel protocol."

If I had any say in the Eiffel community this would have been my approach as well:

All events share the same Meta object.
If there's a new version of the Meta object all events are "upgraded" to use this Meta object.

"Going with a global meta object breaks what we have today but then we would explicitly say that the meta object is the same for all events."

I don't see how it breaks anything. From a consumer perspective, if you assume that the meta objects may differ between events and have taken this into consideration when developing clients and services that consume/produce Eiffel events, but they in fact never differ, you've thrown away valuable development time but it will still work.

"If we have a separate meta object we could have the situation where ArtP has meta v1 and ArtC has meta v2. For me this introduces complexity in order to remove duplication, and then I would prefer more duplication over complexity (even though I understand the problem of 'what is the difference in the meta object between these two events'). We either have 'the version of the event contains all needed information' or 'EventA has meta version X and EventB has meta version Y'. In my opinion the introduced dependency complexity doesn't solve the problem, it only moves it. For me the decentralization change feels logical as we treat all object types the same."

Personally I wouldn't go this route due to the complexities it introduces (as you explained above). I would opt for pragmatism and let some fields be unpopulated for certain types of events rather than having multiple Meta objects simultaneously.
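A minimal Java sketch of that pragmatic approach, assuming a simplified meta and an invented field (extraField) that only some events would populate; none of these names come from the Eiffel protocol.

```java
import java.util.Optional;

// Hypothetical single meta type for all events. A field that not every event
// needs is left null instead of living in a second meta type.
record SingleMeta(String id, String type, String version, long time, String extraField) {
    // Convenience constructor for events that leave the extra field unpopulated.
    SingleMeta(String id, String type, String version, long time) {
        this(id, type, version, time, null);
    }

    // Null-safe view of the optional field for consumers.
    Optional<String> extra() {
        return Optional.ofNullable(extraField);
    }
}

final class SingleMetaSketch {
    public static void main(String[] args) {
        SingleMeta plain = new SingleMeta("id-1", "EiffelArtifactPublishedEvent", "3.0.0", 0L);
        SingleMeta extended = new SingleMeta("id-2", "EiffelArtifactCreatedEvent", "3.0.0", 0L, "value");
        System.out.println(plain.extra().isPresent());
        System.out.println(extended.extra().orElse("(unset)"));
    }
}
```

The trade-off is that the type no longer tells you which events are expected to populate the field; that knowledge has to live in the documentation or in validation code.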

"When it comes to duplication, the link objects in ActC and ActF are the 'same' (the description differs slightly). If we break out the meta object we will need to specify and explain why we extracted the meta object but not this link object. It might sound silly, but the current protocol has consistent rules and says that each event lives by itself. Creating a more complex structure requires more documentation."

TL;DR: If this first step was accepted by the community, I was planning on spending some time making this [description|documentation|specification|...] easier to read and interpret for future adopters.

As a consumer of this I found it unnecessarily cumbersome to interpret due to the complexity introduced by lack of reuse. When venturing into actual development it became clear that it also made implementation in a typed environment awkward and complex. I set out to fix these (IMO) flaws and started with the most apparent one, the Meta object, to see if this was something the community would be interested in getting fixed. If so I would spend some more time and energy to remove other duplication and make actual separation more apparent (for instance the "outcome" attribute, which is named as if it is the same object when in fact it is unique for each event that has it; one of those quirks that make implementation in a typed environment difficult).

"If we think about the future, and if we ever want to put Eiffel on top of CloudEvents, how would meta look then? Would we have the meta part as an extension to the context attributes, adding them as an attribute extension, or do we want our metadata inside the event data?"

Why would you feel the need to do anything differently than how you would solve it today?

sselberg commented 3 years ago

I'm not too familiar with how the Eiffel community works but it seems like this change was not a home-run in terms of reception :-). Are there any reasons to believe that this will be accepted sometime in the future or shall I close the issue and PR as "Won't do"?

m-linner-ericsson commented 3 years ago

@magnusbaeck I guess we will bring this up next Thursday?

sselberg commented 3 years ago

My bad. I somehow understood it as if that meeting was last week.

magnusbaeck commented 3 years ago

This issue will be discussed at the community meeting on November 11, see https://groups.google.com/g/eiffel-community/c/rNs3NY93rEE/m/DdMABZTABQAJ.

sselberg commented 2 years ago

Pausing this initiative until #282 is done. The consequences of breaking out the meta object will most likely be clearer once #282 is complete, and the implementation should be a lot easier since we can break out meta in the underlying structure but, for instance, keep the JSON schema files "flat".
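One way to picture "shared underneath, flat on the outside" is a generator that copies a single meta definition into every per-event schema. The Java sketch below is only an illustration of that idea under the assumption that schemas are generated from some underlying definition; the class, the method names and the two listed properties are hypothetical and say nothing about how #282 is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FlatSchemaSketch {
    // Hypothetical shared definition of the meta properties, kept in one place.
    static Map<String, Object> sharedMeta() {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("id", Map.of("type", "string"));
        props.put("time", Map.of("type", "integer"));
        Map<String, Object> meta = new LinkedHashMap<>();
        meta.put("type", "object");
        meta.put("properties", props);
        return meta;
    }

    // Builds a per-event schema with the shared meta copied in, so the
    // generated output stays flat and holds no reference to a shared file.
    static Map<String, Object> flatSchema(Map<String, Object> dataSchema) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("meta", sharedMeta());
        properties.put("data", dataSchema);
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", "object");
        schema.put("properties", properties);
        return schema;
    }
}
```

Every generated schema then contains an identical inlined meta, so existing consumers of the flat files see no difference while the definition is maintained once.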
