getavalon / core

The safe post-production pipeline - https://getavalon.github.io/2.0
MIT License

Workareas #162

Open mottosso opened 7 years ago

mottosso commented 7 years ago

Goal

Implement the Rhythm & Hues concept of "workareas".

https://vimeo.com/116364653

Motivation

Currently, every asset is encapsulated by a single top-level directory in which both development and public files reside. That's great, as it means there is never any duplication of an asset anywhere on disk and related material is easily found.

What R&H have done however is take this concept one step further.

They've managed to encapsulate not only the asset, but also the application dependencies of any task - such as rigging their tiger - into a single folder. This folder then contains only things related to rigging the tiger.

*(image)*

└── ProjectFolder
    └── workareas
       ├── 1000
       ├── 2000
       ├── PiPatel
       └── RichardParker
           ├── modeling
           ├── lookdev
           └── rigging
               ├── v01
               └── v02
                   ├── input
                   ├── output
                   ├── nuke
                   ├── houdini
                   └── maya

Things to note

  1. Workareas contain published content
  2. Workareas contain copies (symlinks) of loaded content
  3. Workareas are versioned, just like published assets

In practice this means that all paths used within e.g. Maya are local to the current working directory, and that shipping this directory to another computer is a mere copy/paste. References, textures and caches all point at the local working directory, hence no additional logic, conversion or dependency tracking is necessary.

It also means sending a shot or asset to a farm for rendering or additional processing is dead simple.
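Because everything resolves relative to the workarea root, "shipping" is literally a directory copy. A minimal sketch of that idea, using throwaway hypothetical paths (not Avalon or R&H code):

```python
import shutil
import tempfile
from pathlib import Path

def ship_workarea(workarea, destination):
    """Copy a workarea wholesale; following symlinks (symlinks=False)
    makes the copy self-contained even where links aren't supported."""
    return shutil.copytree(workarea, destination, symlinks=False)

# Demonstration with throwaway directories (names are illustrative)
src = Path(tempfile.mkdtemp()) / "RichardParker" / "rigging" / "v02"
(src / "maya").mkdir(parents=True)
(src / "maya" / "scene.ma").write_text("// references resolve relative to the workarea")

dst = Path(tempfile.mkdtemp()) / "v02"
ship_workarea(src, dst)
```

The scene file arrives with its local-relative references intact; no path conversion step is needed on the receiving end.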


Relationships

*(image)*

Here you can see that assets are actually stored inside the work folder, in the output/ directory. The same goes for assets being used within a work folder. It's all local.

*(image)*

Relationships are then tracked per workarea, as opposed to between individual assets, with the workarea being the top-level component of a project.

*(image)*

Space Preservation

What baffled me at first was how they accomplished this without a huge amount of data duplication. How do they avoid having multiple copies of the same version of their tiger model in each of the task folders that uses it? How do they keep track of updates to this model? This is the genius that enables this level of specificity, and it isn't complicated.

Note that (1) all assets and shots are stored together under workareas/, (2) tasks are stored directly under a given asset (as opposed to under a dedicated work/ and publish/ folder), (3) task areas are versioned and (4) inside each task there is an input/ and output/ directory. This is where things get interesting.

The output/ contains data produced within a given workarea, such as the rig produced in RichardParker/rigging. To avoid the aforementioned problem of data duplication, these outputs are symlinked into another workarea.

Windows Example

set output=%cd%\RichardParker\rigging\output\default\v001
set input=%cd%\1000\animation\input\rigging
mkdir %output% %input%
mklink /J %input%\default %output%
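The same linkage can be sketched cross-platform with Python's standard library; directory names are taken from the batch example above, with a temporary directory standing in for `%cd%`:

```python
import os
import tempfile

# Hypothetical project root standing in for %cd% above
root = tempfile.mkdtemp()
output = os.path.join(root, "RichardParker", "rigging", "output", "default", "v001")
input_ = os.path.join(root, "1000", "animation", "input", "rigging")

os.makedirs(output)
os.makedirs(input_)

# Counterpart of `mklink /J`; note that on Windows, creating a symlink
# may require elevated privileges, which is why a junction is used there.
os.symlink(output, os.path.join(input_, "default"), target_is_directory=True)
```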

Result

├───1000
│   └───animation
│       └───input
│           └───rigging
│               └───default
└───RichardParker
    └───rigging
        └───output
            └───default
                └───v001

Note that v001 is symlinked directly into the name of the subset, default. R&H doesn't allow multiple versions to be used within the same workarea.

These symlinks are used both to preserve disk space and to maintain a physical link between what data goes into a workarea and what comes out of it.

└── output
    └── rigDefault
       ├── v001
       ├── v002
       └── v003
           ├── rigDefault.abc
           ├── rigDefault.skel
           └── rigDefault.ma

The beauty of this system is that all data is now tracked. Anything going into any asset or shot is physically tracked via a filesystem mechanism, and workareas are the sole unit of work. All other benefits of our system remain, such as validating and guaranteeing a level of quality on output, and relieving the artist from working with paths directly.
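Since the link between workareas is physical, the dependencies of a workarea can be recovered straight from the filesystem by reading its input symlinks. A sketch, with the layout per the examples above (the helper name is hypothetical):

```python
import os
import tempfile

def input_dependencies(workarea):
    """Map each symlink under input/ back to the output it points at."""
    deps = {}
    input_dir = os.path.join(workarea, "input")
    for dirpath, dirnames, _ in os.walk(input_dir):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                deps[os.path.relpath(path, input_dir)] = os.readlink(path)
    return deps

# Demonstration with a throwaway workarea
workarea = tempfile.mkdtemp()
source = os.path.join(workarea, "output", "default", "v001")
os.makedirs(source)
link_parent = os.path.join(workarea, "input", "RichardParker", "rigging")
os.makedirs(link_parent)
os.symlink(source, os.path.join(link_parent, "default"), target_is_directory=True)
```

Calling `input_dependencies(workarea)` here yields a single entry, mapping the `RichardParker/rigging/default` subscription to the v001 output it was loaded from.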


Implementation

In order for this to be made possible, a few things need to happen.

  1. Creation, publishing and loading must be externalised in order to facilitate a workflow this different.
  2. The path to any asset involves more information, see below.

The object model remains unaffected, and the Loader operates on it, so loading and publishing assets will also remain unaffected.


Directory Layout

General

Workareas are versioned, with application folders intermixed with input and output folders.

└── ProjectFolder
    └── workareas
       ├── 1000
       ├── 2000
       ├── PiPatel
       └── RichardParker
           ├── modeling
           ├── lookdev
           └── rigging
               ├── v01
               └── v02
                   ├── input
                   ├── output
                   ├── nuke
                   ├── houdini
                   └── maya

Output

Include subset.

└── output
    └── default
       ├── v001
       ├── v002
       └── v003
           ├── default.abc
           ├── default.skel
           └── default.ma

Input

Version mapped directly to subset.

└── input
    └── RichardParker
        └── rigging
            └── default
                ├── default.skel
                └── default.ma


Paths

At the moment, all files are maintained via two compressed directory templates - one for work, and one for publish - that vary per project.

work = "{root}/{project}/f02_prod/{silo}/{asset}/work/{task}/{user}/{app}"
publish = "{root}/{project}/f02_prod/{silo}/{asset}/publish/{subset}/v{version:0>3}/{subset}.{representation}"
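Filling such a template is a plain `str.format` call; a sketch with made-up values (asset and subset names are illustrative):

```python
# The publish template as defined above
publish = ("{root}/{project}/f02_prod/{silo}/{asset}/publish/"
           "{subset}/v{version:0>3}/{subset}.{representation}")

# Hypothetical values; {version:0>3} zero-pads to three digits
path = publish.format(
    root="m:/f01_project",
    project="LifeOfPi",
    silo="assets",
    asset="RichardParker",
    subset="rigDefault",
    version=1,
    representation="ma",
)
print(path)
# m:/f01_project/LifeOfPi/f02_prod/assets/RichardParker/publish/rigDefault/v001/rigDefault.ma
```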

The current available members are documented in the main documentation and included here for completeness.

| Member | Type | Description |
| --- | --- | --- |
| `{app}` | `str` | The current application directory name, defined in the Executable API |
| `{task}` | `str` | Name of the current task |
| `{user}` | `str` | Currently logged on user (provided by `getpass.getuser()`) |
| `{root}` | `str` | Absolute path to root directory, e.g. `m:\f01_project` |
| `{project}` | `str` | Name of current project |
| `{silo}` | `str` | Name of silo, e.g. `assets` |
| `{asset}` | `str` | Name of asset, e.g. `Bruce` |
| `{subset}` | `str` | Name of subset, e.g. `modelDefault` |
| `{version}` | `int` | Number of version, e.g. `1` |
| `{representation}` | `str` | Name of representation, e.g. `ma` |

For an R&H directory structure, we need four additional members along with three additional path templates.

app = "{root}/{project}/workareas/{asset}/{task}/{app}"
input = "{root}/{project}/workareas/{asset}/{task}/input/{input_asset}/{input_task}/{input_subset}/{input_representation}"
output = "{root}/{project}/workareas/{asset}/{task}/output/{asset}/{subset}/{representation}"

| Member | Type | Description |
| --- | --- | --- |
| `{input_asset}` | `str` | The input Asset, not necessarily the current asset |
| `{input_task}` | `str` | The input Task |
| `{input_subset}` | `str` | The input Subset |
| `{input_representation}` | `str` | The input Representation |
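For illustration, the proposed input template filled with hypothetical values; note that the `{input_*}` members describe the loaded asset, not the current one:

```python
# The proposed input template as defined above
input_template = ("{root}/{project}/workareas/{asset}/{task}/input/"
                  "{input_asset}/{input_task}/{input_subset}/{input_representation}")

# Hypothetical values: shot 1000 loads RichardParker's rig
path = input_template.format(
    root="m:/f01_project",
    project="LifeOfPi",
    asset="1000",                 # the current shot
    task="animation",
    input_asset="RichardParker",  # the asset being loaded
    input_task="rigging",
    input_subset="default",
    input_representation="ma",
)
print(path)
# m:/f01_project/LifeOfPi/workareas/1000/animation/input/RichardParker/rigging/default/ma
```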


API

Another benefit to their layout is their workarea API.

| Function | Description |
| --- | --- |
| `setup()` | Build the base-level directory structure common to all workareas. |
| `version()` | Create a new version of the workarea, which typically involves saving the current state of the working files and asset subscriptions, as well as any required provisioning of the new version. |
| `copy()` | Create templates, or clone a production workarea for debugging. This also handles copying relative asset subscriptions, which makes it trivial to set up a workarea on one shot, copy it to another, and have it up and running immediately. This is extremely useful once a workarea is functional. (Looking at you @Stonegrund) |
| `backup()` | Compress a specified version of a workarea and move it to nearline storage. |
| `restore()` | Opposite of the above. |
| `import()` | Sterilise and prep the incoming asset subscriptions for use within the workarea. |
| `register()` | Scan the workarea for files exported from content creation software; when found, register them as assets created from the current version of the workarea. |
| `transfer()` | Transfer the contents of a workarea from one studio location to another. This typically includes asset subscriptions, which trigger additional syncing of those deliverables to ensure remote users have everything necessary to continue working. |
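As a thought experiment, `version()` could be sketched like this; the vNN naming follows the layout above, and everything here is an assumption rather than R&H's actual implementation:

```python
import os
import re
import shutil
import tempfile

def version(workarea):
    """Hypothetical version(): snapshot the latest vNN folder into the next."""
    versions = sorted(v for v in os.listdir(workarea) if re.match(r"v\d+$", v))
    if not versions:
        next_version = "v01"
        os.makedirs(os.path.join(workarea, next_version))
    else:
        latest = versions[-1]
        next_version = "v%02d" % (int(latest[1:]) + 1)
        # symlinks=True keeps asset subscriptions as links, not copies
        shutil.copytree(os.path.join(workarea, latest),
                        os.path.join(workarea, next_version),
                        symlinks=True)
    return next_version

# Demonstration with a throwaway workarea containing v01
workarea = tempfile.mkdtemp()
os.makedirs(os.path.join(workarea, "v01", "maya"))
new = version(workarea)  # "v02"
```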
mottosso commented 7 years ago

Some blue sky thoughts.

The R&H layout requires a few things that are hopefully possible using already-existing tools.

  1. input is symlinked into a workarea on load. This could potentially lead to a bloat of connections if the user loads more than necessary, or loads something by accident. Is this a problem? I'd imagine we could either have a clean-up process run alongside production, or that an abundance of connections isn't an issue to begin with. The one issue I can foresee is space constraints, but that is an optimisation.
  2. Workareas are at some point locked, such as when they have produced deliverables, like a render. R&H did however mention that they allow an artist to work on multiple versions at a time. I'm not sure yet how this fits together with the locking model.
  3. Outputs are made available via the loader, ideally drawn as they are at the moment with no change to the artist.

When making the switch, it's important that the current workflow is maintained. Not only in order to simplify the transition, but also in order to enable opening a previous version of the project for use with the then-current version of the pipeline.

In order to facilitate this completely alternative layout, a few things need to happen.

  1. We need to separate plugins and other workflow related resources from the main pipeline project. That way we should be able to dynamically pick a workflow independently of which version of the pipeline is active.
  2. The members of path templates must be customisable. At the moment they are heavily bound to what members become available from the user's choice in the launcher. Can we use the "type" key on each database member as a key for the template? How deeply have we coupled the object model to it, and how flexible is the object model? Is there anything other than the path templates that depends on the exact hierarchy of the object model? Yes, the loader is fixed to visualise particular types at particular columns. Is it possible to make this dynamic? Have children show as a result of browsing a parent? That way we may be able to gain some flexibility in the hierarchy.
io.insert_one({"type": "asset"})

Here we already run into issues, as the R&H path template requires keys like input_asset and output_asset, both of which are independent of the source asset and sensitive to context - in this case, whether the asset in question is being imported or exported. But I suppose that is fine, as those keys would be created by the loader plugins responsible for getting assets into a workarea and application. The loader is independent of the overall pipeline regardless, and could be responsible for defining its own path keys.
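One hedged sketch of the idea above, deriving template members from documents keyed on their "type" (the document shapes and names are illustrative, not the actual Avalon schema):

```python
# Hypothetical documents, each carrying a "type" as in io.insert_one above
docs = [
    {"type": "project", "name": "LifeOfPi"},
    {"type": "asset", "name": "RichardParker"},
    {"type": "task", "name": "rigging"},
]

# The "type" key doubles as the template member name
members = {doc["type"]: doc["name"] for doc in docs}

template = "{root}/{project}/workareas/{asset}/{task}"
path = template.format(root="m:/f01_project", **members)
print(path)
# m:/f01_project/LifeOfPi/workareas/RichardParker/rigging
```

Under this scheme, adding a new level to the hierarchy would only mean inserting a document with a new type and extending the template, rather than changing any pipeline code.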

This could work, so long as we ensure that anything tied to a particular workflow or directory layout is made externalisable. It'll enable greater use by others, similar to how Pyblish currently enables use by a wide variety of users due to the low coupling it requires.

In that way, we provide a number of path keys via the core pipeline library, such as what an asset, subset, version and representation are, and on top of this enable dynamic generation via loaders and publishing plugins that act independently of the pipeline.

It's interesting how the requirements of an out-of-the-box pipeline mimic internal requirements. To me this is an excellent case study of how the needs of production dictate an alternative directory structure, and how providing enough flexibility would facilitate it.

What is needed for this kind of externalisation to take place? I've already brushed up against it when looking into filling in the config repository. Plugins and loaders are both externalisable, but we ran into issues with default data. The data relevant to Maya required the use of lambdas in order to capture data only available on instance creation. Odds are we'll need to approach default data in some other way, likely via the use of Creators, which is also where we will establish the creation of default hierarchies and attributes in Maya. Yes, this is where I think we should go.

mottosso commented 7 years ago

More blue sky thoughts.