@pmrv, @JNmpi, and I briefly discussed the similarities and differences between a "project" (e.g. the tinybase ProjectAdapter) and Workflow.
Workflow is a dynamic and flexible object used when you're developing your workflow (once it's crystallized you can turn it into a Macro). Workflow is also a parent-most object in the graph context, i.e. it is not intrinsically aware of any other graphs. In the current implementation, the semantic path of a workflow always starts with the workflow label, and there is a perfect 1:1 correspondence between filesystem directories and the semantic path.
@pmrv pointed out that there are times when you may be jointly developing two or more different "workflows", which are related by some data connection, but where you don't necessarily want to always be re-running the upstream part of the process while you're modifying and playing with some downstream component. One day you might jam them all into a single big workflow that runs top to bottom, but in the moment it can be helpful to keep different development chunks separated.
A "project" may then bring:
- The ability to show a semantic connection between multiple workflows
- The ability to specify a difference between semantic location and storage location
- Tools to easily grab output of owned workflows and make it available to other workflows
  - e.g. shallow de-serialization of just the workflow output level from storage
- A place to specify generic behaviour, e.g. what type of backend to use for storage (HDF, S3, ...)
- A connection to the database
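In our conversation, the question was whether this fundamentally required a separate Project class, or if extension of the existing Workflow behaviour would be sufficient. We came to the tentative conclusion that Workflow could simply be more empowered. E.g. pseudocode in which a project is given the full semantic path and then creates the workflow could be equivalent to pseudocode in which the workflow itself carries a semantic root and a storage root.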
In both cases the resulting workflow has the same semantic path (wf.semantic_root / "test/subdir/foo"), a storage location that differs from it ("/usr/some/other/place/foo"), and the same storage back-end (hdf). In the former case, because the full semantic path was given to the project, wf.semantic_root would just be empty. More generally, one can imagine in the latter case that wf.semantic_path == wf.semantic_root / wf.label and wf.storage_location == wf.storage_root / wf.label, where the default for both semantic_root and storage_root is just cwd(), but either could instead be provided at instantiation or set in a config file.
This is all just some pseudocode, but it shows there is no obvious reason an extra Project class is needed -- the separation of semantic and filesystem paths can be handled right inside Workflow.
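To make the root/label relations above concrete, here is a minimal runnable sketch. It is not the real Workflow class: `SketchWorkflow`, its constructor arguments, and the `storage_backend` attribute are all hypothetical names chosen for illustration, and splitting the example paths into a root plus the label "foo" is an assumption.

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


class SketchWorkflow:
    """Hypothetical stand-in for Workflow; only models the two path relations."""

    def __init__(
        self,
        label: str,
        semantic_root: Optional[PathLike] = None,
        storage_root: Optional[PathLike] = None,
        storage_backend: str = "hdf",  # assumed attribute name
    ):
        self.label = label
        # Both roots default to the current working directory.
        self.semantic_root = Path(semantic_root) if semantic_root is not None else Path.cwd()
        self.storage_root = Path(storage_root) if storage_root is not None else Path.cwd()
        self.storage_backend = storage_backend

    @property
    def semantic_path(self) -> Path:
        # semantic_path == semantic_root / label
        return self.semantic_root / self.label

    @property
    def storage_location(self) -> Path:
        # storage_location == storage_root / label
        return self.storage_root / self.label


# Semantic and storage locations are set independently of each other.
wf = SketchWorkflow(
    "foo",
    semantic_root="test/subdir",
    storage_root="/usr/some/other/place",
)
default_wf = SketchWorkflow("bar")  # falls back to cwd() for both roots
```

With the defaults, the semantic path and storage location coincide, which matches the current 1:1 behaviour; passing either root breaks that coupling without needing a second class.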
Similarly, if we have a database interface (singleton?), this can be slapped onto Workflow and given useful shortcuts just the way Creator is (Workflow.create::Creator(), Workflow.register::Creator().register, ...).
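As a rough sketch of what attaching a singleton database interface to the workflow class could look like: everything here (`SketchDatabase`, `DbAwareWorkflow`, `register`, `lookup`, and the in-memory dict used as storage) is hypothetical and only illustrates the shape of the idea, not an existing API.

```python
class SketchDatabase:
    """Hypothetical singleton: every instantiation returns the same object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # An in-memory dict stands in for a real database connection.
            cls._instance._records = {}
        return cls._instance

    def register(self, semantic_path: str, storage_location: str) -> None:
        self._records[semantic_path] = storage_location

    def lookup(self, semantic_path: str) -> str:
        return self._records[semantic_path]


class DbAwareWorkflow:
    """Hypothetical workflow exposing the shared database as a class-level shortcut."""

    database = SketchDatabase()

    def __init__(self, label: str):
        self.label = label

    def save(self, storage_location: str) -> None:
        # Record where this workflow was stored, keyed by its label.
        self.database.register(self.label, storage_location)


upstream = DbAwareWorkflow("upstream")
upstream.save("/usr/some/other/place/upstream")

# A different workflow reaches the same record through the same shortcut.
downstream = DbAwareWorkflow("downstream")
found = downstream.database.lookup("upstream")
```

Because the database is shared at the class level, separately developed workflows can find each other's storage locations without a parent object that owns them both.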
So the tentative plan is to bring tinybase here from contrib (#161), and then slowly start merging in the project and/or job capabilities we need from there into Workflow and/or Node.