Closed kwokcb closed 1 year ago
Thanks for putting this together @kwokcb , I think this makes graph connection more clear.
1: Fallback value for connection
I know we consider an input with a value and a connection as ambiguous state, but would like to suggest that we allow this for "fallback" scenarios.
e.g.
<input name="in1" type="color3" connection="node1.out" />
could be authored as:
<input name="in1" type="color3" value="1,1,1" connection="node1.out" />
There can be a valid case where a renderer may choose not to process a connection for performance reasons or it cannot process the connection because the node cannot be resolved or the asset used in the node cannot be resolved. In these cases the render can use the value as a fallback. The author of the MaterialX can publish a suitable fallback value. Note that fallback value is not the same at the nodedef default.
2; Channel identifiers Instead of using a numbered index e.g. .albedo[0], could we use named channels e.g. .albedo.r I think this would make it more clear and is probably more aligned with USD usage.
Thanks for putting this together @kwokcb , I think this makes graph connection more clear. 1: Fallback value for connection I know we consider an input with a value and a connection as ambiguous state, but would like to suggest that we allow this for "fallback" scenarios. e.g.
<input name="in1" type="color3" connection="node1.out" />
could be authored as:<input name="in1" type="color3" value="1,1,1" connection="node1.out" />
There can be a valid case where a renderer may choose not to process a connection for performance reasons or it cannot process the connection because the node cannot be resolved or the asset used in the node cannot be resolved. In these cases the render can use the value as a fallback. The author of the MaterialX can publish a suitable fallback value. Note that fallback value is not the same at the nodedef default.
2; Channel identifiers Instead of using a numbered index e.g. .albedo[0], could we use named channels e.g. .albedo.r I think this would make it more clear and is probably more aligned with USD usage.
Both are useful, but probably beyond the scope of this issue so think we'd want to spawn them.
More upfront reasoning has been added as well examples using new syntax have comparison with old syntax.
Adding comments in slack thread here:
Nick Porcino The proposal makes sense to me, and I was able to successfully puzzle out answers to all the mental questions I formed while reading it. It took some thought to deduce the motivation though! The final sentence hints at it ~ "A lot of special case code should hopefully be removed" I was wondering if the opening paragraph could lead with motivation, and perhaps give a concrete example or two of the special cases that can be removed, or perhaps pathological constructions that can be avoided with the new structure? In the spirit of continuing to line up USD and MaterialX, I've got two other questions, which I think would be nice to see addressed in the proposal itself. I'm wondering if it might be possible to add any notes on possible repercussions to UsdShadeGraph interoperability, even if the impact is trivial and in that case to note that "it's transparent"? I'm assuming the impact on Hydra rendering would be nothing, in the case of a pass through, since Hydra will just ask MaterialX to do codegen as it has done historically
Doug Smythe With a proposed change this large to a core part of MaterialX documents, I think it's extremely important to demonstrate in the proposal what clear benefits it provides, and/or what real problems with the current connection mechanisms are solved by the proposal. While I've been in discussion about improving aspects and syntax in this proposal, I must admit I'm not personally convinced yet that it's a big enough improvement to warrant such a backwards-incompatible change in syntax, but I'm very open to a well-reasoned argument in favor of the change.
Doug Smythe As for ".albedo.r" rather than ".albedo[0]", I would say that the recently-introduced numeric channel indices have the advantages of 1) being consistent for both color and vector types, and 2) we've already switched from
strings and string channel names to / channel numbers for ease of parsing and implementability in shading languages that don't support string operations (e.g. "most of them"). Ashwin Bhat Regarding ".albedo.r" vs ".albedo[0]", is there an assumption that .albedo[0] should return the r channel or is it whatever the implementation chooses to use for channel 0? Bernard's proposal calls out an important point: It is not valid to have both a value and a connection specified on an input. It would be good to get some clarity on this area. Even in the current spec is the following an invalid input? input name="base_color" type="color3" value="0.5,0.3,0.2" nodegraph="NG_BrickPattern" output="base_color_output" /> The author's intent here is, base_color is the output of the node graph, but a renderer can choose to skip the nodegraph evaluation if needed and use the value as a suitable fallback. (edited)
Doug Smythe For the "It is not valid to have both a value and a connection specified on an input" point, it was always originally the intent that an input specifies a (single) connection or a value, but never both, and never more than one connection type (e.g. to both a node output and to an interfacename). I'm open to whatever clarification of this idea the community agrees to, but I think we all agree that an explicit clarification of what is allowed and what the order of precedence is if multiple input values/connections are allowed would be a very desirable thing to have. For the assumption that .albedo[0] should return the "r" channel, yes that is indeed the assumption but you make a very good point that the in-memory representation of a multi-channel value could be in some other order than "r,g,b,a" or "x,y,z,w". If we do stick with channel numbers, we should make it clear that any implementation that doesn't use "r,g,b,a" ordering will need to remap values so that e.g. .albedo[0] does in fact always mean the "r" or "x" channel.
@kwokcb There are some compelling ideas in this proposal, and I particularly like the use of a connection
attribute to represent connections between nodes. On the other hand, I believe some of the ideas here would have the unintended side effect of reducing developer clarity and performance, e.g. the proposed dot/bracket separators for outputs and channels, which would prevent developers from accessing or modifying these properties individually without parsing and recomposing the entire connection string.
My recommendation would be to propose these ideas incrementally, so that each of the recommended improvements can be considered on its own and discussed by the MaterialX TSC. We may find that a subset of these ideas are robust enough to proceed with, but they will need to meet a very high bar in terms of the benefits they would provide to developers and users, especially given the recent rise in the usage of MaterialX 1.38 in the industry.
Shader Graph Connectivity
1. Introduction
This is a proposal to update the way that connections are specified in MaterialX.
Motivating factors include:
Providing a single unified syntax and API entry point to specify and modify an arbitrary connection. Currently a user must take into account numerous (up to 6 different) attributes and an equivalent number of API entry points to determine what a single connection represents. Special casing is required for different subsets of attributes depending on what is upstream and downstream of a connection. This includes handling:
Formalizing and simplifying connection rules to support more robust connection validation and avoid invalid data. Currently for proper validation of all existing connections multiple attributes must be checked for consistency. To date it is still very easy to create data which is inconsistent and thus invalid as each attribute entry point can be independently set. As the validation process is expensive, it is either not part of the connection modification logic or not for all entry points and can thus be skipped. This results in storage and transmission of existing or new documents which are invalid. As there is no known logic to make things valid, this results in the requirement to hand edit files assuming the user has sufficient knowledge to know how to do so.
Simplification of traversal logic used for workflows such as graph editing or code generation by removing special case handling. The only traversal which is mostly complete is that used for code generation which must create a parallel graph with simplified connection storage and interfaces mirroring the proposed single specification point. There is no other published way to easily parse for connections. Note that it is still possible to specify a connection which is ambiguous when traversing through interface connections resulting in traversal failure and currently code generation failure.
Hiding of storage complexity and single point of interaction. Much of the syntax and interfaces expose the underlying implementation details. While it is possible to wrap up the existing complexity with new interfaces the existing interfaces and data attributes still exist allowing for the same integrity issues. With their deprecation, the hope is provide a single "future-proof " interface which is reflected using a single syntax and storage location.
(*) From a standards consistency point of view a single data syntax is more consistent with how USD stores and handles connections (See implementation details at end of document).
Examples
node
attribute)node
attribute needs to be removed and thenodegraph
attribute added. This was only recently fixed in a current patch release.node
,nodegraph
attribute need to be removed and aninterfacename
attribute added. A reference to a nodegraph input and a node input are indistinguishable from each other (as the syntax is underspecified ) and cannot be properly parsed, or the incorrect connection found.output
was specified for the node connection then it needs to be removed, or add if the new connection requires an explicitoutput
to be specified as it has multiple outputs. It is possible to specify anoutput
without a upstreamnode
ornodegraph
resulting in an invalid state.channel
specification which cannot exist on it's own but is possible to set. Again resulting in an invalid state.node
,nodegraph
,interfacename
,output
, andchannel
need to removed to avoid an ambiguous or invalid state.To understand the proposal basic background is provided first. A second purpose describe the existing connect variants For the most part each variant requires different syntax and special case handling.
For those who already understand how connectivity variants works, please skip to section 3 (Proposal).
2. Background
2.1 Shader Graphs Nodes and Nodegraphs
A
shader graph
is composed of various combinations of<nodes>
and<nodegraphs>
elements which can be connected together via child<inputs>
and<outputs>
.A
<node>
is atomic and is not represented by other nodes. Each node has a unique name. A valid name contains only alphanumeric characters and cannot contain a path separator:/
delimiters such as "." and braces.Both
<nodegraph>
's and documents are categorized as a graph elements meaning they can contain a graph of nodes.} class GraphElement { +string name +nodes[] } class NodeGraph { +inputs +outputs } GraphElement <|-- Document : implements GraphElement <|-- NodeGraph : implements
<input>
s or<output>
s.2.3
<output>
to<input>
ConnectivityBased on the definition of scope, the rules for connecting
<output>
s to<input>
s is as follows:<output>
of a node can be connected to the<input>
of another node or nodegraph if they are of the same type and in the same scope.<output>
of a nodegraph can be connected the<input>
of another node or nodegraph if the inputs are of the same type and in the same scope.<node>
or<nodegraph>
has more than one<output>
then the connect must be to one of those outputs, otherwise the<node>
or<nodegraph>
can be used to specify the connection syntactically but logically it is a connect between an<output>
and an<input>
<output>
s are defined as an attribute of an<input>
.<input>
<channel
> syntax.<channel
> syntax if it is on a nodegraph.Below shows an example of the possible valid pair-wise connections.
2.4 "Interface" Connectivity
The unique interface "signature" of a nodegraph is defined by it's inputs and outputs. Though it is possible have no interface inputs nor outputs this would be of no practical use as no data can flow through the graph.
Interface inputs and outputs can be connected to direct child nodes within the it's scope. We use the term "interior" to denote that a node is within scope.
A
<nodegraph>
<input>
may be connected to one or more interior node's<input>
within the same scope. ( In this and following examples, inputs are color coded. )An interior node's
<output>
s may be connected to one or more nodegraph outputs within the same scope. ( In this and following examples, outputs are color coded. )As nodegraphs can be nested within other nodegraphs, the same rules apply to connect to interior nodegraphs.
2.5 Invalid Configurations
3 Proposal
3.1 Existing Connection Syntax
Currently connections are specified on an input using addition attributes:
node
if connected to an upstream node outputnodegraph
if connected to an upstream nodegraph outputinterfacename
if connected to the input interface of a nodegraph.output
This must be specified if the upstream node has one or more outputs to avoid ambiguity.channel
Based on the current 1.39 proposal a numeric channel can be specified for an given upstream output.3.2 New "Connection" Syntax
The proposal is to replace to replace all existing connectivity attributes with a single attribute named
connection
.The assigned value will be a reference to upstream connection.
A path reference is always specified relative to the current scope. Thus, absolute path and parent references are disallowed. i.e. Paths starting with a
/
and is../
.Inputs and outputs of a node and nodegraph are prefixed by a
.
(dot)<node name>.<input name>
A path reference may be to an upstream node or nodegraph without explicitly specifying its output if and only if the node has only one output. Otherwise the reference is considered to be ambiguous and fail validation.
<node name>
vs<node name>.<out name>
Inputs which currently reference an interface input using
interfacename
will use a path notation of this form:.<input name>
Note that this means that nodegraph inputs can be connected to node inputs. This makes it orthogonal to how node outputs can be connected to nodegraph outputs. These type of connections are still considered to be invalid if specified outside a nodegraph.
The 1.39
channel
attribute (and the existingchannels
attribute are represented as part of the path notation using an array notation:[#]
.where
#
is the channel number 0..N. Where N+1 is the number of channels.<namespace>
can still be supported as part of a node name (as it currently is now). That is, it is still allowed to connected between nodes in different namespaces.3.3 Examples: Connections Using A Common Parent
The following are some examples of connections where the connections occur within the parent.
New syntax:
New syntax:
New syntax:
New syntax:
New syntax:
New syntax:
3.4 Example: Connections Between Nodes and Nodegraphs
This example shows:
Existing syntax: (Note the usage of 3 different attributes to denote a connect).
New syntax:
4. Integration and Implementation Notes
The primary messages are that:
(*) If integrations are directly parsing string attributes instead of going through an interafces, there isn't much that can be done and in general these will break regardless of what enhancement is made to attributes. e.g. Integrations use
getAttribute()
to poke into attribute values.4.1 Path Representation
All connections are currently stored as raw strings with no validation. It is possible to change this storage to be a
NodePath
which can encapsulate all the current string manipulation and add syntax validity checking.Note that there is the opportunity to consolidate some of the string handling logic with
FilePath
andGeomPath
to reduce duplicate logic.4.2 Connection Representation
The proposal is create a formal definition of a
NodeEdge
with associated of interfaces. Any connectivity rules can be enforced via these interfaces.These interfaces can be added to existing Port classes. Storage can be part of the Port or externalized into a cache if this is a better approach.
A base set of
PortElement
interfaces includes:An
Output
can extend this to include these interfaces:A
NodeEdge
can have these interfacesconnection
attribute.Any Node references can be obtained using the existing PortElement interfaces which return the parent of the Port.
The conversion from a
connection
attribute (NodePath
) is performed only as required to make all connections. Any invalid / non-existentconnection
attributes are not affected. An "invalid" NodeEdge can be returned if (missing source). TBD if this should be allowed.A
connection
attribute string can be extracted for Document save, or whenever desired.4.2 Traversal
The current traversal iterator (
GraphIterator
) would have the internal logic modified to use the new interface. The existingEdge
information which only provides downstream information can be replaced / enhanced to supply any information from aNodeEdge
.4.3 Upgrade / Legacy Interfaces
4.3.1 Attribute Storage
4.3.1 Connection / Attribute Synchronization
4.3.2 Existing Interfaces
interfacenam
,node
,nodegraph
,channels
would be deprecated.GraphIterator
GraphIterator
s the change should otherwise be transparent.4.4 Future Features
This scheme will allow for support of nested nodegraphs which would be require even more special case logic.
This is in large part due to moving from a Node/NodeGraph/Document connection model to a PortElement connection model. One thought to make connection interface common for Node/Document/NodeGraph would no longer be required.
Validation and unit tests would require some update.
4.5 Consistency Case 1: USD Integration
The new connection syntax is closer in alignment with USD connection syntax. It is worth noting MaterialX pathing remains relative and that channel extraction is part of the path syntax. The latter does not add any new complication to interop, and how channel specification is handled is covered by a another issue (outside this scope of this issue).
There no is affect on input value vs connection logic as defined in USD (as both are still not allowed in MaterialX).
For import to USD, as long as the query interfaces remain to provide connection information there should be no affect here. For workflows which require export from UsdShade networks, then it will be useful to provide a single black-box "connect" interface -- which would also future proof any such conversions from underlying syntax details. For backwards compatibility existing connection set APIs could still be used as noted and just marked as deprecated.
Simple example taken from USD schema page:
would translate to something like this.