Open bourdakos1 opened 3 years ago
Question:
Will the schema be used for other types of operations, e.g. `fileNode=False`? Like an ethereal operation where a file might be too heavy, say a sleep operation `op_type=sleep-node` (weak example), or no-code type operations?
KFP Component yaml: (I am assuming inputs are what would be turned into properties? Not sure how outputs would be handled)
The outputs, can't they be handled in the same way as the inputs and exposed as dynamic properties? I think these just end up being passed to the arg cmd driver per component?
Being more of a visual person, I crayola'ed this in ppt; hopefully it's accurate.
Yea, that looks about right 😊 I think the only thing is that `filename` is not part of Node Schema; it's more of an ?implied? property that only shows up if it is a file node (it will never be explicitly defined anywhere).
Question: Will the schema be used for other types of operations, e.g. `fileNode=False`? Like an ethereal operation where a file might be too heavy, say a sleep operation `op_type=sleep-node` (weak example), or no-code type operations?
Yes, all operations will be treated like this by default; "file nodes" are a special case. Also, I think I removed `fileNode: true` and changed it to `type: "file"` to future-proof against any other node types that might pop up (I'll update it here too).
KFP Component yaml: (I am assuming inputs are what would be turned into properties? Not sure how outputs would be handled)
The outputs, can't they be handled in the same way as the inputs and exposed as dynamic properties? I think these just end up being passed to the arg cmd driver per component?
Oh okay, makes sense, so outputs are just another set of properties? Do all kfp component yamls have a set of inputs and outputs or is this just what they happened to be named here?
Our current solution for handling palette and node properties needs to be overhauled in order to support pipeline 2.0 features.
Current solution
- `palette.json` on the frontend with a fixed list of available nodes and their associated metadata
- `properties.json` on the frontend with a single instance of `CommonProperties` properties that is assumed to be used for all nodes in the palette
- ... `properties.json`, so if `properties.json` changes, the validation needs to be updated as well
Requirements
- we might want a way to generate a component based on a url?
Proposal
Instead of the concept of a single "palette" and a single "properties", introduce the idea of a "node schema" per node type. This collection of "node schemas" will be used to generate a "palette" at runtime. A runtime should also be able to provide "properties" that are applied across all nodes or that override specific operation properties (RFC #1520). Properties will initially only be validated for type, requiredness, regex, min/max and other JSON Schema constraints. All other property validation should be done at submission time.
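As a rough illustration of the initial validation described above (type, requiredness, regex, min/max), here is a minimal TypeScript sketch. The `PropertySpec` shape and the `validateProperty` function are hypothetical names invented for this example; they are not part of the RFC or of any existing API, and a real implementation would more likely delegate to a JSON Schema validator.

```typescript
// Hypothetical property spec; field names loosely follow JSON Schema keywords.
interface PropertySpec {
  type: "string" | "number" | "boolean";
  required?: boolean;
  pattern?: string; // regex source, applied to string values
  minimum?: number; // applied to number values
  maximum?: number;
}

// Returns a list of human-readable problems; an empty list means valid.
function validateProperty(
  name: string,
  spec: PropertySpec,
  value: unknown
): string[] {
  const errors: string[] = [];

  // Treat undefined, null and the empty string as "missing".
  if (value === undefined || value === null || value === "") {
    if (spec.required) errors.push(`${name} is required`);
    return errors; // nothing else to check for a missing value
  }
  if (typeof value !== spec.type) {
    errors.push(`${name} must be of type ${spec.type}`);
    return errors;
  }
  if (typeof value === "string" && spec.pattern !== undefined) {
    if (!new RegExp(spec.pattern).test(value)) {
      errors.push(`${name} does not match ${spec.pattern}`);
    }
  }
  if (typeof value === "number") {
    if (spec.minimum !== undefined && value < spec.minimum) {
      errors.push(`${name} must be >= ${spec.minimum}`);
    }
    if (spec.maximum !== undefined && value > spec.maximum) {
      errors.push(`${name} must be <= ${spec.maximum}`);
    }
  }
  return errors;
}
```

Anything that cannot be expressed this way (for example, whether a referenced file actually exists) would fall under the submission-time validation mentioned above.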
A "node schema" should contain all static metadata needed for a node:
- `op` (the ID for a node type, e.g. `execute-notebook-node` / `execute-python-node`)
- `icon` (svg that will show up on the node)
  - `light`, `dark` and `high contrast` versions of the icons
- `label` (the default label, since this can be dynamic)
- `description` (the description that could show up in the palette or when hovering on the node in the pipeline)
- `properties` (RFC #1519, the fixed properties for the node. Dynamic and runtime-specific properties don't go here)

A "node schema" should also be able to specify that it is attached to a file:
- `type` should be `file` for nodes that should be attached to a file
- `extensions` (example: `[".yml", ".yaml"]`; valid extensions for the node. If more than one node spec requests a given extension, we should add a dropdown property to manually choose the node's type or prompt the user to choose onDrop)
- `language` (example: `"python"`; language identifier, not required but useful if available. VS Code has an API to detect the language based on grammar instead of just extension. This will also improve language icon support)
- an implied `filename` property (only present on file nodes; never explicitly defined in the schema)

A "node schema" should have a fixed "label" property so that the user can manually adjust the label. The winning label would follow the logic of:
node.properties.label ?? node.properties.filename ?? node.label
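The schema fields and the label precedence described above could be sketched roughly as follows in TypeScript. The `NodeSchema` and `NodeInstance` interfaces and the `resolveLabel` and `schemasForExtension` helpers are hypothetical names for illustration only; the field names come from the RFC text, but their exact types are assumptions.

```typescript
// Hypothetical shapes; field names follow the RFC text, types are assumptions.
interface NodeSchema {
  op: string; // ID for the node type, e.g. "execute-notebook-node"
  label: string; // default label
  description?: string;
  type?: "file"; // present only for nodes attached to a file
  extensions?: string[]; // e.g. [".yml", ".yaml"]
  language?: string; // e.g. "python"
}

interface NodeInstance {
  label: string; // default label taken from the schema
  properties: {
    label?: string; // user-adjusted label
    filename?: string; // implied property, only for file nodes
  };
}

// Label precedence: user label, then filename, then the default label.
function resolveLabel(node: NodeInstance): string {
  return node.properties.label ?? node.properties.filename ?? node.label;
}

// All file-node schemas that claim the extension of a given file name.
// More than one match means the user has to pick a node type
// (dropdown property or a prompt on drop).
function schemasForExtension(
  schemas: NodeSchema[],
  filename: string
): NodeSchema[] {
  const lower = filename.toLowerCase();
  return schemas.filter(
    (s) =>
      s.type === "file" &&
      (s.extensions ?? []).some((ext) => lower.endsWith(ext.toLowerCase()))
  );
}
```

Note that `??` only falls through on `null` or `undefined`, so an empty-string label would still win over the filename; whether that is the desired behavior is left open here.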
Properties (RFC #1519)
The finalized properties of a node will be generated from:
- ... `fileNode` is true)
Examples
POC for being able to drag and drop any arbitrary KFP yaml component onto a pipeline.
Node schema:
KFP Component yaml: (I am assuming inputs are what would be turned into properties? Not sure how outputs would be handled)
finalized properties based on above: