Issue with access to locally stored data

mike1813 commented 9 months ago

In general, processes serialize or deserialize data in one of three ways:

(1) Access via a service when the data is stored remotely.
(2) Access via a service even when the data is stored locally.
(3) Access via the host when the data is stored locally.

At present, the data lifecycle construction sequence tries these options in that order. The idea that even locally stored data should be accessed via a service is not unusual. For example:

a data processor may be designed to access data via a service so it doesn't matter whether the data is stored locally or remotely
a host may be configured such that data can only be accessed via a service, which can implement a more complex access control policy or a more comprehensive data access logging policy than could be achieved by the host.

The problem is that this can lead to odd or even dysfunctional outcomes when there is a data service but a collocated process accesses data via its host, i.e., option (3) above. The existence of the data service leads the construction rules to consider option (2) first, and if the collocated process uses other processes that connect to the data service via some circuitous route, the wrong conclusion is drawn: that the collocated process accesses the data via the service. Worse still, it is possible that no path to the data service is found, leading to the conclusion that there is a modelling error.

This should be placed under user control, i.e., made dependent on asserted relationships. Henceforth, the rule should be that:

Access is via a service when the data is stored remotely.
Access is via a service even when the data is stored locally, if and only if the data processor and data service have a 'uses' relationship.
Access is otherwise via the host when the data is stored locally.

If none of these can be inferred, based on the process-process connections and hosting relationships, there is a modelling error.
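For illustration only, the proposed rule can be sketched as a small decision function. This is not the domain model's construction patterns. The names `AccessMode`, `ModellingError` and `select_access_mode`, and the three boolean inputs, are hypothetical stand-ins for facts that would be inferred from the asserted relationships in a real model.

```python
from enum import Enum


class AccessMode(Enum):
    REMOTE_SERVICE = "via service, data stored remotely"
    LOCAL_SERVICE = "via service, data stored locally"
    LOCAL_HOST = "via host, data stored locally"


class ModellingError(Exception):
    """Raised when no access mode can be inferred (hypothetical)."""


def select_access_mode(stored_locally: bool,
                       uses_service: bool,
                       service_reachable: bool) -> AccessMode:
    """Pick an access mode following the three-step rule.

    stored_locally:    the data is stored on the process's own host.
    uses_service:      the process has an asserted 'uses' relationship
                       with the data service.
    service_reachable: some connection to the data service exists
                       (an assumption used here only for the remote case).
    """
    if not stored_locally:
        if service_reachable:
            return AccessMode.REMOTE_SERVICE
        # Remote data with no route to a service: nothing can be inferred.
        raise ModellingError("remote data but no path to a data service")
    if uses_service:
        # Local data is accessed via the service only when 'uses' is asserted.
        return AccessMode.LOCAL_SERVICE
    # Otherwise local data is accessed via the host, even if a service exists.
    return AccessMode.LOCAL_HOST


# A collocated process with an indirect route to the service, but no
# asserted 'uses' relationship, now resolves to host access.
mode = select_access_mode(stored_locally=True,
                          uses_service=False,
                          service_reachable=True)
```

The key difference from the earlier ordering is that the mere existence or reachability of a data service no longer selects service access for locally stored data. Only the asserted 'uses' relationship does.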

mike1813 commented 9 months ago

Construction patterns fixed, but this doesn't entirely cure the problem. It is still possible for the local process to have connections to processes that eventually lead to the data service. In that situation, data paths are still found to the data service.

To fix this, we need to cut data paths at the point where a process is accessing the data locally.
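One possible reading of "cutting" the data path is a reachability search that refuses to continue through any process known to access the data locally. The sketch below assumes a simple adjacency-map representation of process-process connections. The function name `reaches_service`, the `local_accessors` set and the example process names are all hypothetical and are not taken from the domain model.

```python
from collections import deque
from typing import Dict, Iterable, Set


def reaches_service(connections: Dict[str, Iterable[str]],
                    start: str,
                    service: str,
                    local_accessors: Set[str]) -> bool:
    """Breadth-first search from `start` towards `service`.

    Processes listed in `local_accessors` access the data via their host,
    so the search does not follow their outgoing connections: the data
    path is cut at that point.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == service:
            return True
        if node in local_accessors:
            continue  # cut the path here: do not expand this process
        for nxt in connections.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical model: 'app' is collocated with the data, but connects to
# 'helper', which in turn connects to the data service.
connections = {"app": ["helper"], "helper": ["db_service"]}

uncut = reaches_service(connections, "app", "db_service", set())
cut = reaches_service(connections, "app", "db_service", {"app"})
```

Without the cut, the circuitous route through `helper` yields a path to the service. With `app` marked as a local accessor, no such path is found.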

mike1813 commented 9 months ago

Addressed in branch 103.

This is definitely an improvement on what we had before. No pull request made to merge into branch 6a, because these changes are necessary to complete work in branch 65, so it makes more sense to merge those branches first.

Spyderisk / domain-network

Issue with access to locally stored data #103