MellonScholarlyCommunication / spec-eventlog

This document specifies the requirements for implementing and hosting an Artefact Lifecycle Event Log.
https://mellonscholarlycommunication.github.io/spec-eventlog/

Explain in section 5 the relation and difference between an Event Log and other technologies #7

Open phochste opened 2 years ago

phochste commented 2 years ago

In https://github.com/MellonScholarlyCommunication/spec-notifications/issues/20 and in our bi-weekly technical Mellon meeting we discussed how an Event Log relates to, and differs from, other types of technologies (e.g. the as:outbox of ActivityPub). These differences would best be reflected in the Event Log spec (e.g. in section 5) to better explain the rationale for an Artefact Event Log.

A recap of the observations that resulted from our discussions in these channels

hvdsomp commented 2 years ago

I was thinking more about something @mielvds said during the discussion regarding scope of the scholcomm event log, namely that maybe only the events in which value was effectively added to an artifact go into that event log. Meaning that Alice's offer would not go in. But a message from a service hub that states that e.g. registration happened for the artifact would be turned into an event in the event log. Although, during the discussion, I responded that I felt that Alice's offers and - for example - rejections thereof also provide information that supports transparency re open science, I could also embrace the perspective of @mielvds . Doing so would mean that events in the event log would always be based on notifications coming in from third parties, never based on Alice's own actions:

This is not to say that the fact that Alice made an offer that was rejected should not be saved somehow. Just like many other things that happen around the pod, it could be. The question is more whether that information should be public, which the scholcomm event log is.

A lot of this points to the potential existence of multiple logs, with the scholcomm event log being one that is supposed to be public and helps with transparency of schol comm. Maybe this kind of perspective is also helpful for the NDE case that does not deal with registration/certification/etc and hence does not need a scholcomm event log. But it needs another event log that serves another purpose ...

mielvds commented 2 years ago

> Doing so would mean that events in the event log would always be based on notifications coming in from third parties, never based on Alice's own actions:

Mostly, yes. But perhaps "registration was requested from" is an event worth logging, i.e. Alice requested it and is currently waiting... But the more I think of it, the more I feel that only "service responses" could end up in the log, which is what you conclude.

We should try listing some collector use cases: why does the collector reconstruct artefact lifecycles? An obvious one is: "I found a paper and I want to know whether it's legit." What do you need to know in order to come to that conclusion?

* Notifications re value added by service hubs (e.g. registration) are turned into events

* Notifications re value added through interactions with an artifact by peers (interaction events in the Mellon proposal) are turned into events
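The two rules above could be sketched as a simple filter. This is only an illustration of the idea, not something defined by the spec: the `Notification` class, the `actor_role` vocabulary and the `becomes_event` function are all hypothetical names.

```python
from dataclasses import dataclass

# Assumed vocabulary for who sent a notification; not defined by the spec.
THIRD_PARTY_ROLES = {"service_hub", "peer"}


@dataclass(frozen=True)
class Notification:
    actor: str       # URL of the sender of the notification
    actor_role: str  # "owner", "service_hub" or "peer" (assumption)
    activity: str    # AS2 activity type, e.g. "Announce"
    artefact: str    # URL of the artefact the notification is about


def becomes_event(n: Notification) -> bool:
    """Only notifications from third parties are turned into public events."""
    return n.actor_role in THIRD_PARTY_ROLES


# Alice's own offer stays out of the public log; the hub's announcement goes in.
offer = Notification("https://alice.example/profile#me", "owner",
                     "Offer", "https://alice.example/paper1")
announce = Notification("https://registry.example/", "service_hub",
                        "Announce", "https://alice.example/paper1")
```

Under this reading, whether Alice's offer is saved elsewhere (e.g. in a private log) is a separate decision from whether it becomes an event in the public log.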

> This is not to say that the fact that Alice made an offer that was rejected should not be saved somehow. Just like many other things that happen around the pod, it could be. The question is more whether that information should be public, which the scholcomm event log is.

Yes! Without going too much "if a tree falls in the forest": should an artefact have at least a "creation" event with basic metadata for cases where no services were involved yet? For example: Bob wrote a paper, but hasn't submitted it yet. However, he does want collectors to be able to discover it. What makes an artefact discoverable by the collector?

> A lot of this points to the potential existence of multiple logs, with the scholcomm event log being one that is supposed to be public and helps with transparency of schol comm. Maybe this kind of perspective is also helpful for the NDE case that does not deal with registration/certification/etc and hence does not need a scholcomm event log. But it needs another event log that serves another purpose ...

+1

hvdsomp commented 2 years ago

I provided my perspective regarding the "creation" event: The fact that Alice created a document may be of interest to some, e.g. her close collaborators. But from my perspective this is out of the scope of the scholcomm event log. For that event log, IMO, it all starts with Registration, which is the entrance of the document into the scholarly record. Again, that is not saying that the creation of the document, and edits thereof, etc. should not be saved somewhere. As a matter of fact, they might be of interest to generate provenance information regarding a registered artifact, by which I mean some technical metadata detailing the creation/evolution of the artifact prior to being registered in the scholarly record. But, IMO, this is not necessarily part of the scholcomm event log.

hvdsomp commented 2 years ago

I keep thinking about this all. In this comment, I want to think in general terms rather than in terms of the Mellon scholarly use case and the related scholcomm event log. In this regard I keep thinking about things @mielvds said (e.g. the remove case in NDE) and what @Dexagod said (about the potential role of the Orchestrator in deciding which events are considered worthwhile saving in an event log and which are not). In this, I assume that the Orchestrator is always a "machine in the middle" that is aware of the kind of events that this discussion is about, both events that are considered worthwhile to save in an event log and those that are not. By which I mean, events for which the Orchestrator isn't in the loop are outside of this discussion.

  1. This leads me to the notion of a registry of event types that are considered worthwhile and that could be characterized (e.g. in rules invoked on notifications) by means of (among others?):
    • what is the as2 activity
    • what is the more specific, community-related activity (cf the COAR vocab)
    • who is the sender of the notification regarding the activity (e.g. Alice herself, a service hub, which service hub, which type of service hub, ...)
    • what is the type of artifact

Based on this, some notifications would be evaluated as pertaining to events that are considered worthwhile and others will not.
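As a rough illustration of such a registry, the characteristics listed above could be modelled as rule entries that a notification is matched against. This is a minimal sketch under assumptions: `EventTypeRule`, `is_worthwhile`, the dictionary keys and the example values are hypothetical and are not taken from the spec or the COAR vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventTypeRule:
    """One registry entry; a field left as None matches any value."""
    as2_type: Optional[str] = None        # AS2 activity, e.g. "Announce"
    community_type: Optional[str] = None  # community-specific activity (placeholder)
    sender_kind: Optional[str] = None     # e.g. "owner" or "service_hub" (assumption)
    artefact_type: Optional[str] = None   # e.g. "Article" (assumption)

    def matches(self, notification: dict) -> bool:
        wanted = {
            "as2_type": self.as2_type,
            "community_type": self.community_type,
            "sender_kind": self.sender_kind,
            "artefact_type": self.artefact_type,
        }
        # Every constrained field must equal the notification's value.
        return all(v is None or notification.get(k) == v
                   for k, v in wanted.items())


def is_worthwhile(notification: dict, registry: list) -> bool:
    """A notification is worthwhile if at least one registry entry matches it."""
    return any(rule.matches(notification) for rule in registry)


# Example registry: announcements sent by a service hub are worthwhile.
registry = [EventTypeRule(as2_type="Announce", sender_kind="service_hub")]
```

With this registry, an `Announce` from a service hub is accepted, while an `Offer` sent by the owner is not.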

Another aspect of this all is the question "event considered worthwhile for which purpose?" We've already touched upon this in the discussion: worthwhile for transparency of open science scholcomm, worthwhile when it comes to recording a creation/update provenance trail about an artifact in a pod, worthwhile regarding workflows in NDE collection registration, etc.

  2. This leads me to the notion of event logging to serve different purposes. This in turn probably leads to the notion of multiple separate event logs (one per purpose), because different applications/users will likely consume them and will or will not be allowed to access them (i.e. different access rights for different event logs). The aforementioned registry of event types could also contain the information concerning which worthwhile event goes into which event log, and the Orchestrator would then write an event to the appropriate log.
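The idea of a registry that also records which log an event goes into could look roughly like the sketch below. The log names (`scholcomm`, `provenance`), the registry keys and the `route` function are assumptions made for illustration; the logs are plain in-memory lists rather than real event logs.

```python
from collections import defaultdict

# Hypothetical registry: (sender kind, AS2 activity type) -> target log names.
REGISTRY = {
    ("service_hub", "Announce"): ["scholcomm"],  # public, transparency
    ("owner", "Create"): ["provenance"],         # artefact evolution
    ("owner", "Update"): ["provenance"],
}


def route(notification: dict, logs: dict) -> list:
    """Append the notification to every log the registry assigns it to.

    Returns the names of the logs written to; an empty list means the
    event was not considered worthwhile for any purpose.
    """
    key = (notification["sender_kind"], notification["as2_type"])
    targets = REGISTRY.get(key, [])
    for name in targets:
        logs[name].append(notification)
    return targets


logs = defaultdict(list)
route({"sender_kind": "service_hub", "as2_type": "Announce"}, logs)
route({"sender_kind": "owner", "as2_type": "Offer"}, logs)  # not logged
```

Keeping one list per purpose makes it straightforward to give each log its own access rights, which is the motivation given above for separating them.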

I am not saying that we need to immediately go in this direction. But merely considering the scholcomm registration/certification/awareness/archiving events, the scholcomm interaction events, the artifact evolution events (create, update, delete), and the NDE events, it's kind of becoming obvious that one size will not fit all and that event logs will end up having a certain profile related to the purpose they serve.

phochste commented 2 years ago

Technically this is what is currently already possible with the rule language and current orchestrator demonstrators.

The registry consists of N3 policy files. As far as the orchestrator is concerned, they can be anywhere in the world. The orchestrator only needs to know where to get these policies.

In these policies, exactly as you describe @hvdsomp, it is stated who the activity comes from, which log it should be written to, and in what form. For now I assume that all event logs are LDP Containers. What we put in them is our choice.
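Assuming an event log is an LDP Container, appending an event amounts to an HTTP POST to the container URL. The sketch below only builds such a request with Python's standard library and does not send it; the container URL, the event payload and the `build_append_request` helper are hypothetical, and the choice of JSON-LD as serialization is an assumption.

```python
import json
import urllib.request
from typing import Optional


def build_append_request(container_url: str, event: dict,
                         slug: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST that adds an event to an LDP Container."""
    headers = {"Content-Type": "application/ld+json"}
    if slug is not None:
        # LDP servers may use the Slug header as a hint for the new resource name.
        headers["Slug"] = slug
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(container_url, data=body,
                                  headers=headers, method="POST")


# Example event in Activity Streams 2.0 form (all URLs are placeholders).
event = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Announce",
    "actor": "https://registry.example/",
    "object": "https://alice.example/paper1",
}
req = build_append_request("https://alice.example/eventlog/", event, slug="event-1")
# To actually write to the log: urllib.request.urlopen(req)
```

Which events get written this way, and to which container, would still be decided by the policies described above.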

In general, I think that Alice could work in different communities, where the same artefact can have a different "worthwhile"-ness in communities X and Y. And maybe a combined "worthwhile"-ness for a community Z that does both X and Y.

E.g. Alice could do research on an old manuscript in Digital Heritage land and in Schol Communication land (a different value chain), but is also a public speaker on this subject for a third community. Are these 3 different logs?

hvdsomp commented 2 years ago

So, that's totally great. In which case, IMO, the whole discussion boils down to the need to express what is technically already possible in more architectural terms. Which is - I think - what I've tried to do.