Open lukaspie opened 9 months ago
I would start by asking what an 'event of data processing' is. If we mean a set of steps that produces a single new data output, then it makes sense to allow multiple objects which together make up this step.
In the past, NXprocess was designed with only a single sequence_index. That implies the processing is a linear sequence, which is a much simpler graph than those typically used. For example, if you have a Y-junction where the results of two NXprocesses are necessary inputs to another NXprocess, which sequence_index should the input NXprocesses have? The idea of using NXhistory essentially states that we also wish to describe such junctions as what they are: a graph with NXprocesses as nodes and directed edges connecting them. This is the essence I support, as most workflows can be modelled as triplets: some input (at least one) is fed to some functor (an action/process with some set of algorithms happening inside this box), which generates some output (at least one).
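To make the Y-junction argument concrete, here is a minimal sketch in plain Python (standard library only, Python 3.9+). The step names and the `depends_on` mapping are hypothetical and chosen purely for illustration; nothing here is part of the NeXus definitions. It groups processing steps into levels using a topological sort and shows that the two inputs of the junction land in the same level, so no single linear index distinguishes them.

```python
from graphlib import TopologicalSorter

# Hypothetical processing steps (illustrative names only).
# Each key maps a step to the set of steps whose output it consumes.
depends_on = {
    "calibration": set(),
    "distortion_correction": set(),
    # Y-junction: this step needs the results of both steps above.
    "registration": {"calibration", "distortion_correction"},
}

sorter = TopologicalSorter(depends_on)
sorter.prepare()

# Steps that become ready at the same time have no defined relative order,
# so a single linear sequence index cannot be assigned to them unambiguously.
levels = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    levels.append(ready)
    sorter.done(*ready)

for depth, steps in enumerate(levels):
    print(depth, steps)
```

Both input steps end up in the first level and the junction step in the second. Recording the directed edges explicitly keeps that information, whereas forcing the steps into one numbered sequence would impose an arbitrary order on the two inputs.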
Problem
In the NXprocess base class, the docstring says: "Document an event of data processing, reconstruction, or analysis for this data." This suggests that one NXprocess should describe one event of data processing. However, NXprocess can at the moment contain multiple instances of NXregistration, NXdistortion, and NXcalibration, suggesting that it is possible to have multiple "events" in one NXprocess instance. This is somewhat inconsistent, and it makes the other fields in NXprocess that relate to the order of processing (like sequence_index) hard to use consistently.
My suggestion
In #177, we introduced the base class NXhistory for describing the history of a physical entity. NXhistory can hold many instances of NXactivity as well as NXphysical_process and NXchemical_process. I propose to extend NXhistory such that it can also describe the history of processing events:
NXhistory base class:
Then, on the app-def level, we can write:
Additional ideas
1) Eventually, the idea would be that each of these base classes (incl. NXprocess) extends NXactivity (via base class inheritance) and gets a timestamp as well as a sequence index to fully describe the chain of events that occurred.
2) There is the idea that NXhistory is a graph with NXactivity (and similar) as nodes. We could make the edges in the graph more pronounced by using/modifying the existing NXgraph_* base classes.
3) We were discussing how to describe the sequence of measurement events in the MPES framework (see #173). Maybe we could describe these measurement events as sets of NXactivity instances in the future.
What do you think @FAIRmat-NFDI/areab?