MellonScholarlyCommunication / spec-orchestrator

The implementation requirements for the Orchestrator component.
https://MellonScholarlyCommunication.github.io/spec-orchestrator/

The orchestrator of the data pod can only read inbox + read/append to event log? #3

Closed phochste closed 2 years ago

phochste commented 3 years ago

In the figures it seems that the Orchestrator has only a limited writable zone in the Data Pod of the Maintainer.

All write operations, except for appending to the Event Log, need to be done by a Dashboard. This is shown in section 3.2, where the Orchestrator can append to the Event Log (in step 11) after reading a notification from the Service Hub in the inbox of the Maintainer Pod.

What does this mean when there is a policy like:

If a Service Hub Announces an indexation of my artifact, 
    then the metadata of my artifact in my Data Pod should be updated with:
          -  a link to the new indexed artifact at the Service Hub.

The Orchestrator can't execute this policy itself, and needs to wait for the Dashboard to come back online and then suggest that the Dashboard execute this update?
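To make the two readings concrete, here is a minimal sketch of that policy, assuming a hypothetical in-memory `Pod` and an `Access` enum that loosely mirrors the Read/Append/Write access modes. None of these names come from the spec; `on_indexing_announced` is purely illustrative. With append-only access the Orchestrator logs the event and leaves a suggestion for the Dashboard; with write access it applies the update itself.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    READ = "read"
    APPEND = "append"
    WRITE = "write"


@dataclass
class Pod:
    # Hypothetical in-memory stand-in for the Maintainer's Data Pod.
    metadata: dict = field(default_factory=dict)   # artifact URL -> list of links
    event_log: list = field(default_factory=list)  # append-only Event Log
    pending: list = field(default_factory=list)    # suggestions awaiting the Dashboard


def on_indexing_announced(pod, orchestrator_access, artifact, indexed_copy):
    """Apply the policy: link the artifact to its indexed copy at the Service Hub."""
    # Appending to the Event Log is allowed in both configurations.
    pod.event_log.append(
        {"type": "Announce", "object": indexed_copy, "context": artifact}
    )
    if Access.WRITE in orchestrator_access:
        # A more trusted Orchestrator updates the artifact metadata directly.
        pod.metadata.setdefault(artifact, []).append(indexed_copy)
        return "updated"
    # Otherwise the update is only suggested and waits for the Dashboard.
    pod.pending.append({"artifact": artifact, "add_link": indexed_copy})
    return "deferred"
```

In this sketch the only difference between the two interpretations is the set of access modes handed to the Orchestrator, which is why it reads as a matter of configuration rather than a hard requirement.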

Is this a correct interpretation? I guess that a limited writable zone is just a matter of choice, right? Not a hard requirement? If a maintainer trusts the Orchestrator a bit more, then it could also automatically update artifacts.

Or is the Orchestrator not meant to have this kind of role, and is it only there to maintain contact with other inboxes? Could there be another, more trusted application that does updates of artifacts?

Dexagod commented 3 years ago

I have also been wondering what the solution would be in this case. I think either the orchestrator or the called service should have the possibility of e.g. appending lifecycle event information to a resource event log, because even without the researcher coming online (and possibly even without making use of an orchestrator), this data should be available for the network.

JWerbrouck commented 3 years ago

To me, it seems beneficial if you could indeed give the orchestrator access rights to more resources; I see it as your most trusted application, at least more trusted than external services to update your data, as it is exclusively tied to your Pod, and has no other concerns.

In an extreme situation, you could even prevent any other apps from writing to your Pod directly, and let their updates/results etc. pass through the orchestrator (which is then some kind of watchdog/gatekeeper) for some final validations. The policy language could support this, in my opinion. Or am I extending the scope of the orchestrator beyond its envisaged purpose?

phochste commented 3 years ago

@JWerbrouck When the orchestrator is the gatekeeper to all write operations, it would indeed, for Solid (at the level of the LDP implementation), stretch the idea of a Pod as a store to which in principle any application can write (given enough privileges). I think this scenario is feasible when the Pod has one single purpose and the orchestrator guards that single purpose. This is the typical scenario in many use cases (e.g. in current Scholarly Communication applications worldwide). But, in the more general case, a Solid pod could host a wide variety of content (so to speak, your lolcat images and your scholarly communication collection). One app guarding a Pod would indeed make it all so much easier, but in the case of Solid it stretches the purpose.

@Dexagod indeed, for me this is also the point. I don't see a difference (in first approximation) between an Orchestrator having a little bit of append access and an Orchestrator that has a lot of write access. But is it needed? Can a lot of the scenarios be done by an Orchestrator that has only read rights, and maybe another type of application, an "InboxReader", that is more trusted and can also write? E.g. the read-only Orchestrator can notify all kinds of actors when updates happen in the Pod, or can translate updates to the Pod into WebSub notifications. The InboxReader can automate updates to the Pod when the user is offline. I guess that was your idea of a "small orchestrator" with limited capabilities?

Dexagod commented 3 years ago

@phochste The main problem, I feel, is to find a reliable way for remote resources to link information to a resource on your pod, and to make this linking visible to external actors in the network.

In the case of Mellon this is, for example, linking an interaction (comment), a review, ..., to a publication, without requiring the publication uploader to be online. Of course, just allowing anyone to append information to any resource is not a feasible solution. So either an application could be created (this COULD be the orchestrator, but may be any kind of application) that appends this linking information based on the received notification of the linking, OR another approach should be used that relies on external solutions, where e.g. in the case of the Mellon system a review, comment, registration, ... of a publication could be advertised via the awareness system of the network.
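A rough sketch of the first option (an application on the pod side appending link information when a notification arrives) could look like the following. It is an illustration under assumptions only: the `TRUSTED_SENDERS` allowlist, the field names and `append_link_event` are hypothetical, and the event log is a plain list rather than a real pod resource.

```python
from datetime import datetime, timezone

# Hypothetical allowlist: only these actors may have links appended on their behalf.
TRUSTED_SENDERS = {"https://servicehub.example/profile#me"}


def append_link_event(event_log, notification, trusted=TRUSTED_SENDERS):
    """Append a link event to a resource's event log if the sender is trusted.

    Returns True when the event was appended, False when it was ignored.
    """
    sender = notification.get("actor")
    if sender not in trusted:
        # Not just anyone may append information to a resource.
        return False
    event_log.append(
        {
            "type": notification["type"],
            "actor": sender,
            "object": notification["object"],  # e.g. the review or comment
            "target": notification["target"],  # the publication on the pod
            "received": datetime.now(timezone.utc).isoformat(),
        }
    )
    return True
```

The uploader does not need to be online for this to run, and because the log is only ever appended to, the application needs no write access to the publication itself.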

So I feel like the questions are:

* Do we want to be able to link to a resource on a pod from an external actor in the network, and make this discoverable for other actors in the network?
* In case we do: should we make this the responsibility of an application managing the pod, adding links automatically based on a certain set of rules, or do we give this responsibility to external services if required by the network?
* In case we do find this responsibility should lie with the pod and its managing applications, should this be handled by the Orchestrator, or should this be handled by another application?

E.g. in ActivityPub this is handled by the ActivityPub servers.

mielvds commented 3 years ago

So I feel like the questions are:

* Do we want to be able to link to a resource on a pod from an external actor in the network, and make this discoverable for other actors in the network?

I'm not sure what you mean here.

* In case we do: should we make this the responsibility of an application managing the pod, adding links automatically based on a certain set of rules, or do we give this responsibility to external services if required by the network?

I think this is irrelevant. It is important that we only look at this from the protocol perspective: what are the minimal messaging requirements (e.g. what possible notification payloads are there? what should be in the event log?) and interface requirements (e.g. should it be LDP or ActivityPub?) of each component needed to participate in the network? Everything else is up to the actors, including whether adding links is done by the pod managing application or by an external service.
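To illustrate what a "minimal messaging requirement" could amount to in practice, here is a sketch that checks a notification against an assumed set of required fields. The payload is shaped like an ActivityStreams 2.0 `Announce` activity, but the choice of `REQUIRED_FIELDS`, the example URLs and the `missing_fields` helper are assumptions for the sake of the example, not something the spec defines.

```python
# Assumed minimal field set; the actual requirements are still to be defined.
REQUIRED_FIELDS = ("@context", "type", "actor", "object")


def missing_fields(notification):
    """Return the required fields that are absent from a notification payload."""
    return [name for name in REQUIRED_FIELDS if name not in notification]


# Hypothetical payload: a Service Hub announces an indexed copy of an artifact.
example = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Announce",
    "actor": "https://servicehub.example/profile#me",
    "object": "https://servicehub.example/index/artifact-1",
    "target": "https://pod.example/artifacts/1",
}

print(missing_fields(example))
print(missing_fields({"type": "Announce"}))
```

Anything an actor does beyond passing such a check (updating metadata, adding links, and so on) would then be outside what the protocol has to specify.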

In case an external service is preferred, we should provide the means to offer that service as a Service Hub, but not specify how this should be accomplished (maybe it involves additional access rights to the pod, maybe not). Similarly, we could require that the actor adds the 'link adding event' to the event log or notifies the stakeholders, but nothing more. Bottom line: if there are processes that do not have a direct impact on the network as a whole, we shouldn't be bothered with figuring them out, except for using them as use cases.

* In case we do find this responsibility should lie with the pod and its managing applications, should this be handled by the Orchestrator, or should this be handled by another application?

See above. The orchestrator, dashboard or other applications should at least do what we define (e.g. appending to the event log or notifying actors), but they are allowed to do more.

E.g. in ActivityPub this is handled by the ActivityPub servers.