Closed Ark-kun closed 7 months ago
Hey @Ark-kun
thanks for all the PRs :) I really appreciate this. I will have a look during the next week!
As I said in #1267 I will not add another node type. I see that it could be useful to get started but in the end you will need to put some logic in your node. The nodes are just showcases. I don't recommend to use them in a real application.
Your library has a lot of use even without any logic in the nodes.
For example, I use it to author a pipeline graph that is then sent to an external API for execution. I've also seen people just use this library for graphing.
I still think there is a sizeable usability gap between a single-handle node and multi-handle node.
For example, in the CustomNode sample I see that the positions of the handles are hard-coded using inline hardcoded styles const sourceHandleStyleB: CSSProperties = { ...targetHandleStyle, bottom: 10, top: 'auto' };
. That might not be saleable and won't work well for different node sizes and different number of handles. I think the most convenient part of this PR is automatic handle layout.
P.S. Here is the example of a pipeline graph representation I'm generating. Nodes are mapped to implementation.grapph.tasks
and edges encode task arguments.
name: Xgboost train regression and calculate metrics on csv
inputs:
- {name: training_data, type: CSV}
- {name: testing_data, type: CSV}
- {name: label_column, type: Integer, default: '0', optional: true}
- {name: objective, type: String, default: 'reg:squarederror', optional: true}
- {name: num_iterations, type: Integer, default: '200', optional: true}
outputs:
- {name: model, type: XGBoostModel}
- {name: mean_absolute_error, type: Float}
- {name: mean_squared_error, type: Float}
- {name: root_mean_squared_error, type: Float}
- {name: metrics, type: JsonObject}
implementation:
graph:
tasks:
Xgboost train:
componentRef: {digest: 09b80053da29f8f51575b42e5d2e8ad4b7bdcc92a02c3744e189b1f597006b38,
url: 'https://raw.githubusercontent.com/kubeflow/pipelines/567c04c51ff00a1ee525b3458425b17adbe3df61/components/XGBoost/Train/component.yaml'}
arguments:
training_data:
graphInput: {inputName: training_data}
label_column:
graphInput: {inputName: label_column}
num_iterations:
graphInput: {inputName: num_iterations}
objective:
graphInput: {inputName: objective}
Xgboost predict:
componentRef: {digest: ecdfaf32cff15b6abc3d0dd80365ce00577f1a19a058fbe201f515431cea1357,
url: 'https://raw.githubusercontent.com/kubeflow/pipelines/567c04c51ff00a1ee525b3458425b17adbe3df61/components/XGBoost/Predict/component.yaml'}
arguments:
data:
graphInput: {inputName: testing_data}
model:
taskOutput: {outputName: model, taskId: Xgboost train, type: XGBoostModel}
label_column:
graphInput: {inputName: label_column}
Pandas Transform DataFrame in CSV format:
componentRef: {digest: 58dc88349157bf128021708c316ce4eb60bc1de0a5a7dd3af45fabac3276d510,
url: 'https://raw.githubusercontent.com/kubeflow/pipelines/6162d55998b176b50267d351241100bb0ee715bc/components/pandas/Transform_DataFrame/in_CSV_format/component.yaml'}
arguments:
table:
graphInput: {inputName: testing_data}
transform_code: df = df[["tips"]]
Remove header:
componentRef: {digest: ba35ffea863855b956c3c50aefa0420ba3823949a6c059e6e3971cde960dc5a3,
url: 'https://raw.githubusercontent.com/kubeflow/pipelines/02c9638287468c849632cf9f7885b51de4c66f86/components/tables/Remove_header/component.yaml'}
arguments:
table:
taskOutput: {outputName: transformed_table, taskId: Pandas Transform DataFrame
in CSV format, type: CSV}
Calculate regression metrics from csv:
componentRef: {digest: e3ecbfeb18032820edfee4255e2fb6d15d15ed224e166519d5e528e12053a995,
url: 'https://raw.githubusercontent.com/kubeflow/pipelines/7da1ac9464b4b3e7d95919faa2f1107a9635b7e4/components/ml_metrics/Calculate_regression_metrics/from_CSV/component.yaml'}
arguments:
true_values:
taskOutput: {outputName: table, taskId: Remove header}
predicted_values:
taskOutput: {outputName: predictions, taskId: Xgboost predict, type: Text}
outputValues:
model:
taskOutput: {outputName: model, taskId: Xgboost train, type: XGBoostModel}
mean_absolute_error:
taskOutput: {outputName: mean_absolute_error, taskId: Calculate regression
metrics from csv, type: Float}
mean_squared_error:
taskOutput: {outputName: mean_squared_error, taskId: Calculate regression
metrics from csv, type: Float}
root_mean_squared_error:
taskOutput: {outputName: root_mean_squared_error, taskId: Calculate regression
metrics from csv, type: Float}
metrics:
taskOutput: {outputName: metrics, taskId: Calculate regression metrics from
csv, type: JsonObject}
@moklick I would second @Ark-kun's comment about there being general utility in having generic support for multi-handle nodes.
i would suggest however, that the base implementation be updated to handle multi-handles rather than introducing a separate node type. single handle nodes then become multi-handle nodes where handles()===1
Hey @Ark-kun and @mikeestee
what do you think about adding sourceIds
and handleIds
to the existing node options in order to support multi handle nodes?
Update: sourceIds and targetIds
@moklick
what do you think about adding
sourceIds
andhandleIds
to the existing node options in order to support multi handle nodes?
How would multiple sourceIds
be used in conjunction with multiple handleIds
? In other words, in an Edge, sourceId
points to the node
it connects from and the sourceHandleId
points to an unique handle
in this respective source node
. If the source node
has multiple sourceIds
, what would be the unique identifier for the node? Also would the relationship be 1-to-many or many-to-many between sourceIds
and handleIds
?
In my opinion, I think what @Ark-kun proposed is a good solution with no breaking changes. I would, however, suggest replacing
handles: {
top: { type: "target", ids: ["top_1", "top_2", "top_3"] },
bottom: { type: "source", ids: ["bottom_1", "bottom_2"] },
left: { type: "target", ids: ["left_1"] },
right: { type: "source", ids: ["right_1"] },
},
with
handles: {
source: {
"top_1":{
"TTB":{position:"top", offset:"25%"},
"LTR":{position:"left", offset:"25%"},
default:"TTB"
},
"top_2":{
"TTB":{position:"top", offset:"50%"},
"LTR":{position:"left", offset:"50%"},
default:"TTB"
},
"top_3":{
"TTB":{position:"top", offset:"75%"},
"LTR":{position:"left", offset:"75%"},
default:"TTB"
},
"bottom_1":{
"TTB":{position:"bottom", offset:"33%"},
"LTR":{position:"right", offset:"33%"},
default:"TTB"
},
"bottom_2":{
"TTB":{position:"bottom", offset:"66%"},
"LTR":{position:"right", offset:"66%"},
default:"TTB"
},
},
target: {
"left_1":{
"TTB":{position:"left", offset:"50%"},
"LTR":{position:"top", offset:"50%"},
default:"TTB"
},
"right_1":{
"TTB":{position:"right", offset:"50%"},
"LTR":{position:"down", offset:"50%"},
default:"TTB"
},
},
}
While it's more verbose, the placement of the handles can be configured more explicitly for each layout (TTB, LTR, and even RTL).
Hey @Ark-kun and @mikeestee
what do you think about adding
sourceIds
andhandleIds
to the existing node options in order to support multi handle nodes?
Did you mean source and target ids for the handles?
I like this approach.
Yes, that was a typo. Source and target ids
Hey @Ark-kun
sorry for the late response here .. I know this issue is fairly old, but now that I am working on the next major release I am thinking about adding the node options sourceHandles
and targetHandles
. We could leave targetPosition
and sourcePosition
and add a deprecation warning.
What do you think about this API:
const multihandleNode = {
id: '1',
data: { label: 'some text' },
sourceHandles: [{ position: Position.Right, id: 'source-1-a' }, { position: Position.Right, id: 'source-1-b' }],
targetHandles: [{ position: Position.Left, id: 'target-1-a' } }],
position: { x: 100, y: 100 }
}
The advantage is that we don't need to introduce a new node type but we could handle this inside the default node. More over we would encourage users to specify the handles in the node options. This is also helpful for custom node implementations.
I'm not sure whether the "source" and "target" concepts are flexible enough. Imagine having 10 types of handles with complex connectability rules (e.g. you can only connect output_integer to input_integer or input_any). But there are always only 4 sides.
Out proposals differ in what comes first in the description: In my proposal, the user describes each of the four sides. In your proposal the user describes source and target handles. I think my proposal is more compact. But that's not the most important property.
I just found what I consider an important design flaw in your proposal: In my proposal, the user clearly lists handles for each side in order, but in your proposal, the order of handles on each side is not well defined. Imagine a case where there are both sources and targets on some side. Then it's the order is unknown.
On the other hand, my initial design just does not support this. Each side is either source or target or something else. Perhaps the best design could be:
handles: {
top: [{id: "top_1", type: "source"}, {id: "top_2", type: "target"}],
left: [{id: "left_1", type: "target"}, {id: "left_2", type: "source"}],
}
What's the status on this feature? Would really like to see it merged!
@moklick confused... how did you achieve the exampe on the homepage then?
@moklick confused... how did you achieve the exampe on the homepage then?
Multi handles isnt impossible to do, it just requires you to write more custom code than just using the default node types. You can add as many handles as you want when you do that.
@joeyballentine @airtonix you can add as many handles if you like
https://codesandbox.io/s/sleepy-agnesi-9ktq34?file=/TextUpdaterNode.js
Thanks for the input here but It's not planned to add a multi handle node to the core.
This node type has easily configurable handles:
Example:![image](https://user-images.githubusercontent.com/1829149/122328063-cc92c180-cee3-11eb-99b6-dbd8943b2a33.png)
Fixes #1267