Closed by @mottosso 10 years ago
Think it is a good direction, mostly because it will make it easy for people to graph their workflow without programming. And it also encourages much more composable code.
So you'll end up with the same validator for Maya as for other software.
This is not vital to the discussion, but this comment from @BigRoy still seems like magic to me:) Other than using a cross DCC language like FabricEngine, I can't see a validator being used cross DCC.
While on the topic of FabricEngine, this new direction is looking a lot like their visual programming: https://vimeo.com/103517340, but more specific to publishing. Just don't want us to reinvent the "new wheel":)
> This is not vital to the discussion, but this comment from @BigRoy still seems like magic to me:) Other than using a cross DCC language like FabricEngine, I can't see a validator being used cross DCC.
Well, consider what information is required by a validator.
Let's start with a simple example: validating a naming convention. For this, a validator only needs access to the names of nodes. Names are already grabbed from the scene during selection, and as such the validator isn't dependent on anything Maya-specific. This is actually already the case in the current validator for naming convention; as you can see, there are no references to anything related to Maya.
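As a minimal sketch (all names here are hypothetical, not the actual Publish API), such a host-agnostic naming validator could be as simple as:

```python
import re

# Hypothetical stand-in for Publish's instance object; the selector
# has already stored plain node names on it, so no host API is needed.
class Instance(object):
    def __init__(self, nodes):
        self.nodes = nodes  # plain strings, not Maya nodes

# Example convention, assumed for illustration: lowercase_with_underscores
PATTERN = re.compile(r'^[a-z][a-z0-9_]*$')

def validate_naming(instance):
    """Return the names breaking the convention; an empty list means valid."""
    return [name for name in instance.nodes if not PATTERN.match(name)]
```

Because the validator only ever touches plain strings, the exact same code runs regardless of which host produced the names.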
Taking this a step further, what information does a validator need to validate inverted normals? Well, it would need vertex indices and edge links. (Technically, vertex ordering and edges are solely responsible for the direction of each corresponding face, known as winding order.)
We could hand this information to each instance during selection, just like we are handing it names of nodes. Then, in the validator, instead of referencing the node and its normals information:
```python
if cmds.polyNormal(...) == 'inverted':
    ...
```
We access it through attributes provided via selection.
```python
if instance.config.get('normals') == 'inverted':
    ...
```
The same is true for any other validator, even for extractors.
Think about what information is required to extract an obj from a scene.
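As a hedged sketch, assuming the selector has serialised raw geometry onto the instance: vertex positions and per-face vertex indices are essentially all an obj extractor needs.

```python
# Sketch only: build the text of a minimal .obj file from raw data a
# selector could have serialised ahead of time (positions + face indices).
def extract_obj(vertices, faces):
    lines = ["v %f %f %f" % tuple(point) for point in vertices]
    # .obj face indices are 1-based
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines) + "\n"
```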
It isn't too far-fetched for a selector to also provide this information; however, it would of course introduce a potential performance bottleneck.
Stepping back just a bit; even though all validators and extractors could be fed information this way, I don't really see this as a viable alternative. However, it does introduce the possibility to provide basic building blocks as validators that can be chained together with custom validators to fulfill the requirements of each production.
Selectors could, for instance, provide for the following information per host:
That's information which is available within most hosts and there are quite a lot of validators that could get built upon this information alone. What we'd have to do, is to provide for a selector per host, which we'd do anyway, and fulfill an (optional) interface to provide for this information. Hosts that live up to these requirements are then compatible with the "universal plugins".
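A sketch of what that optional interface might look like (the helper names on `host_api` are made up for illustration): each host's selector flattens host data into plain structures, and "universal" plugins only ever see those.

```python
# Hypothetical selector interface: `host_api` is whatever the host
# exposes (e.g. a per-host wrapper around maya.cmds); the returned dict
# is all that downstream, host-agnostic plugins ever get to see.
def select(host_api):
    return {
        "nodes": host_api.node_names(),       # assumed per-host helper
        "attributes": host_api.attributes(),  # assumed per-host helper
    }

# A "universal plugin": it validates the dict, never the host itself.
def validate_no_spaces(data):
    return [name for name in data["nodes"] if " " in name]
```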
> While on the topic of FabricEngine
Hold your horses there. Fabric is cool and all, but we aren't solving anything already solved by Fabric, and piggybacking on it wouldn't get us any nearer our goal. I thought the only reason we're considering Coral and Depends is for their GUI facilities and their already-solved implementation of a scenegraph as reference for our own, as they're open source and available?
I'd suggest staying as minimal as absolutely possible until we've got a solid foothold on what we need and how we're using it. At the moment, our needs are far simpler than any of these other implementations.
> That's information which is available within most hosts and there are quite a lot of validators that could get built upon this information alone. What we'd have to do, is to provide for a selector per host, which we'd do anyway, and fulfill an (optional) interface to provide for this information. Hosts that live up to these requirements are then compatible with the "universal plugins".
Definitely seems like a nice idea. It just seems to make for much less composable code when it comes to the Selector. But then again, we can't entirely escape host-specific code, so it's just a matter of where to take that hit. Using the Selectors might be better, as otherwise the host-specific code would need to live in both Validators and Extractors. You would need to extend the Selectors whenever a new Validator required information that wasn't already provided.
> I'd suggest to staying as minimal as absolutely possible until we've got a solid foothold on what we need and how we're using it. At the moment, our needs are far simpler than any of these other implementations.
I agree:)
> But then again we can't entirely escape the hosts specific code, so it will just be a matter of where to take that hit.
Yeah, I think so too.
This is how I see it - the host is going to be involved no matter what, and it can either infect each step, like it is now:
```
                           ______
                          |      |
                          | Host |
                          |______|
                              |
    __________________________|__________________________
    |                  |                |               |
 ___v______      ______v_____      ______v_____      ___v_____
|          |    |            |    |            |    |         |
| Selector |--->| Validation |--->| Extraction |--->| Conform |
|__________|    |____________|    |____________|    |_________|
```
Or its interaction with Publish could start and end with Selection.
```
 ______
|      |
| Host |
|______|
    |
    |
    |
 ___v______       ____________       ____________       _________
|          |    |            |    |            |    |         |
| Selector |--->| Validation |--->| Extraction |--->| Conform |
|__________|    |____________|    |____________|    |_________|
```
The latter is of course an ideal and probably less practical, but I would at least aim for that.
@mottosso, those are some great, clean explanations you added there. I think it's up to the studio to decide what kind of implementations they want to make for their plug-ins (selectors, validators, etc.). But personally I wouldn't convert the DCC's meshes (or other 'complex' data) into an output from the Selector; it doesn't add any real benefit. It's trivial to check normals in Maya, but on your own point cloud you need to know the relevant math. Also, to implement your own check in your DCC you can often pick up from example code how the check should be performed.
I quickly wanted to mention that Depends seems to use a Python library called networkx to define the node graph and its functionality like traversing the graph. All we would have to do is extend/wrap the functionality that we need for our Dependency Graph. I think it would be a very nice library to use if we go for a node graph! What do you think?
I would also recommend going for a name like DAG over just Graph, because I think it's a familiar term for a lot of people.
> I quickly wanted to mention that Depends seems to use a Python library called networkx to define the node graph and its functionality like traversing the graph. .. I think it would be a very nice library to use if we go for a node graph!
Yeah, could do. The library seems robust and so does its documentation.
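As a rough sketch of how it could look for us (plugin names invented for illustration), wrapping networkx could be as little as:

```python
import networkx as nx  # third-party; the library Depends builds on

# The publishing pipeline as a directed acyclic graph, edges pointing
# from each plugin to the plugins depending on its output.
graph = nx.DiGraph()
graph.add_edge("SelectInstances", "ValidateNamingConvention")
graph.add_edge("ValidateNamingConvention", "ExtractAsMa")

# Traversal comes for free: a topological sort is a valid processing order.
order = list(nx.topological_sort(graph))
```

The point being that the graph data structure and its traversal wouldn't need to be written or maintained by us at all.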
For reference, this is what their implementation looks like:
> But personally I wouldn't convert the DCC's meshes (or other 'complex' data) to an output from the Selector, it doesn't add any real benefit. It's trivial to check normals in Maya, but on your own point cloud you need to know the relevant math.
Just spit-balling here, but by gathering vertex information and attributes in Selection, we're essentially talking about serialisation. We're serialising contents of a scene, and then making use of the serialised information in our plugins.
If so, then in the far far future, we could take it further and utilise existing methods of serialising a scene, with vertex information and such, like Alembic or USD. We could skip the whole "write to disk" and merely keep it in memory, like we are now, and use it strictly for our plugins. Never actually exposing the fact that Alembic is involved.
At that point, we could build truly software-agnostic plugins that would be usable by any host with an implemented Selector (a.k.a. Serialiser).
@mottosso and I are still investigating the route of a node-based graph; while doing so, I wanted to raise the following options:
Here's some pros/cons I could come up with:
This graph is characterised by always having a maximum of one input connection per node, while multiple outputs are allowed. Thus you can perform branching, though no merging of branches. The input and output types are always of the same type and act as a data container; think of it as the Context.
This graph is characterised by allowing for, and visualising, a greater number of connections. Instead of transferring the Context data container as a whole, each individual attribute connects to an input of the same type.
This graph is characterised by most nodes having only a single output and input, but with the ability to have multiple inputs and outputs (for merging and branching). This avoids the single-connection graph's limitation of not supporting merge nodes. The input and output types are always of the same type and act as a data container; think of it as the Context.
I realised that there is some room for ambiguity about the number of connections per node. For example, having a single output doesn't necessarily mean it can't be plugged into many inputs and thus not facilitate branching.
Let me explain.
This, similar to the SOP context within Houdini, or the majority of nodes in Nuke, only allows a single input and a single output, but the output can be plugged into multiple inputs on other nodes.
To us, this could mean plugging the output of `SelectObjectSet` into `ValidateNamingConvention` into `ExtractAsMa`. They would each take what they give: a context. This would certainly be a convenient and easily looked-upon layout.
In Python, it could look like this:
```python
def single_input(value):
    return value + 1
```
Similar to the above, but allowing multiple inputs. Merge is a good example of where this is useful.
```python
def merge(a, b):
    return a + b
```
Off the top of my head, I'm unable to see any of our nodes being mergeable, @BigRoy what are your thoughts on this?
Now we're getting complicated. Consider the equation `x + y + z = a`. It takes three inputs, `x`, `y` and `z`, and produces a single output, `a`. Then consider the function:
```python
def add(x, y, z):
    return x + y + z
```
Again, three inputs and one output. This is probably what we're most familiar with.
Multiple outputs on the other hand:
```python
def advanced_func(x):
    y, z = x + 1, x * 2  # two separate results from a single input
    return y, z
```
To be honest, I'm having trouble imagining we'd ever get into a position where this is necessary. Maya does it so it's certainly not unheard of. But it is rare.
Similar to the above, but probably more common and the complexity added by multiple inputs is slight.
To your points.
> Support for branching in the graph.
Branching would be a really great feature to have I think, and is possibly the thing separating nodal workflows, like Maya, from linear workflows, like After Effects. I think branching should be possible with any of these connectivity options.
> Order of processing is very clear (best option is likely depth-first)
Interesting choice of depth-first, I would actually go the other way and say breadth-first. Consider the following graph:
```
                   -- ValidateA --
                  /               \
SelectInstances -------> ValidateB -------> ExtractAsMa
                  \               /
                   -- ValidateC --
```
Depth-first would mean running `SelectInstances`, followed by `ValidateA`, followed by `ExtractAsMa`. I think we would expect all validations to complete before running extraction.
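To illustrate the point (a sketch over a hand-rolled adjacency dict, not Publish code): a breadth-first walk of the graph above visits every validator before the extractor.

```python
from collections import deque

# The example graph, as plain adjacency lists
edges = {
    "SelectInstances": ["ValidateA", "ValidateB", "ValidateC"],
    "ValidateA": ["ExtractAsMa"],
    "ValidateB": ["ExtractAsMa"],
    "ValidateC": ["ExtractAsMa"],
    "ExtractAsMa": [],
}

def breadth_first(start):
    """Visit nodes level by level; later levels wait on earlier ones."""
    order, queue, seen = [], deque([start]), {start}
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in edges[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order
```

`breadth_first("SelectInstances")` yields the selector first, then the three validators, and only then `ExtractAsMa`, matching the expectation that all validations complete before extraction.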
> The Context would get a deep copy per branch used to further operate with.
That's an interesting point. I imagined the context to remain the same shared object throughout, but deep copying is probably unavoidable. Consider the following graph:
```
                   -- ValidateA --
                  /               \
SelectInstances -------> ValidateB -------> ExtractAsMa
                  \
                   -- FilterSelection --> ValidateC --> ExtractAsObj
```
If `FilterSelection` alters the Context, say it removes a few instances, then it would have a side-effect on the context as it enters `ExtractAsMa`. Thus, it would need its own copy of the context.
Is it possible that each node will have to get their own individual deep-copy?
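A small sketch of that idea (a plain dict standing in for the Context, plugin names invented): each branch operates on its own deep copy, so a filter in one branch can't affect another.

```python
import copy

# A plain dict standing in for the Context
context = {"instances": ["char_GRP", "prop_GRP", "camera_CAM"]}

def filter_selection(ctx):
    """Hypothetical plugin: keeps only the _GRP instances in *its* copy."""
    ctx["instances"] = [i for i in ctx["instances"] if i.endswith("_GRP")]
    return ctx

# Each branch receives its own deep copy of the shared context
branch_to_extract_ma = copy.deepcopy(context)             # left untouched
branch_filtered = filter_selection(copy.deepcopy(context))
```

The original `context`, and the copy headed for `ExtractAsMa`, remain unaffected by the filtering in the sibling branch.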
> Separating 'data' in the Context becomes unclear by just looking at the graph. Example given: one (selector) node outputs the meshes (list of objects) and another (selector) node outputs the cameras (list of objects). Both are lists of objects, so how do we know, down in the graph, what list of objects within the Context a node operates on? We'll need to add dropdown menus (comboboxes) so we can select one of the created inputs that exist on the Context.
This may not necessarily be true. If two selectors follow each other, I think it would be reasonable to expect the output from the last node to reflect both operations, thus outputting both cameras and meshes. Did I get this right?
> If so, then in the far far future, we could take it further and utilise existing methods of serialising a scene, with vertex information and such, like Alembic or USD. We could skip the whole "write to disk" and merely keep it in memory, like we are now, and use it strictly for our plugins. Never actually exposing the fact that Alembic is involved.
> At that point, we could build truly software-agnostic plugins that would be usable by any host with an implemented Selector (a.k.a. Serialiser).
@mottosso: this is a nice idea, because these are standard and widely used formats.
@BigRoy: Nice write-up, and really helpful for understanding the basics. I do have a doubt though: do we have to stick to one type of node graph to define our system? It could depend on the type of node being used. E.g. selectors: a selector would parse a file, select the items and give the output to validators.
--(file)--->selectors---(filtered file)-->validators.
Correct me if I am wrong!
This is just me throwing stuff out there, but Gaffer might be an option as well for an interface; http://imageengine.github.io/gaffer/
From the discussion in #59 I got thinking about how Conformers would work with this. Conformers would need specific information from upstream, like files, user data etc. With the node-based workflow, would a Conformer needing two data attributes (filePath and userData) have two inputs, similar to a merge node in Nuke? The Conformer node wouldn't be able to work unless both inputs were supplied. I think this is the Multi-in, Single-out I'm referring to.
That is certainly one way of doing it.
The other way, which resembles what we spoke of in #59, is for each plugin to append data to the current "stream of information", that is, the Context. This is similar to how tools like Houdini work; all along the way, vertices pass through each SOP node. A SOP node can modify, remove or append information, such as vertices, vertex colors or velocity. The information can then be used by subsequent nodes in the chain.
The benefit of this workflow is less spaghetti wires between nodes and greater encapsulation of data; in Houdini, there is almost always a single connection between nodes, as opposed to Softimage or Maya where there is one connection per "channel" of information. The disadvantage however is that the information isn't as obvious, as it is in Maya.
In Houdini, this is remedied by having a really good inspector window for what data is present within each mesh, each face, each edge and each vertex.
It's a question of how much information should be represented by the graph, and how much should be represented by surrounding tools, such as the inspector window of Houdini for more complex data.
Forgot to add that what I'm referring to, the Houdini-style, is Single-in, Single-out.
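A sketch of that single-in, single-out style (all names and the file path are made up for illustration): each plugin appends to the one Context flowing through, and the Conformer simply reads what upstream plugins left behind.

```python
# Hypothetical plugins, Houdini-style: one stream in, one stream out.
def extract(context):
    context["filePath"] = "/tmp/publish/model.ma"  # invented path
    return context

def collect_user_data(context):
    context["userData"] = {"artist": "marcus"}  # invented data
    return context

def conform(context):
    # No extra wires needed; the data is already in the stream.
    return context["filePath"], context["userData"]

# The Context passes through each plugin in turn
stream = {}
for plugin in (extract, collect_user_data):
    stream = plugin(stream)
```

The trade-off mentioned above applies: fewer wires, but the data carried by the stream isn't visible in the graph itself, so an inspector-style tool would be needed.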
Here's some good reference material from node-based GUIs out in the world.
This is my favourite. https://www.youtube.com/watch?v=yuR1e1PjU8Y
Saving images and illustrations here: https://github.com/abstractfactory/pyblish/wiki/Flow-Graph
Branching off of #41 to focus the discussion on the alternative, node-based workflow.
Related:
The issue with using Coral as host for our processes is that most of the project is written in C++ and would require us to also provide build instructions for our users, which vary per platform and can get quite lengthy. And as Publish doesn't yet have a need for performance, most of its benefits would go unnoticed.
I've had a quick look at Depends yesterday and it might be a better fit, being pure Python and PySide. It isn't being developed with Windows in mind, but it ran just fine and I implemented a basic "recipe" and node in just an hour or two.
It does however (as far as I can tell) mainly concern stateless processes, in that each node represents a new process which takes arguments as input and produces results via stdout. For us, this would mean launching a new instance of our host per plugin, as each process is unaware of any other process (hence "stateless") and thus couldn't utilise an already-running process.