getavalon / core

The safe post-production pipeline - https://getavalon.github.io/2.0
MIT License
218 stars 48 forks source link

Unify Attribute names #406

Closed mkolar closed 5 years ago

mkolar commented 5 years ago

During a bigger cleanup operation we ran into multiple inconsistencies in avalon-core when it comes to data member names.

In particular some of the key ones.

Maya uses edit_in and edit_out to set frame range, however loader expects startFrame and endFrame

There are a few key attributes without which many functions are either useless or don't work at all. We'd like to suggest unifying these and making them required on assets and project. On top of that adding functions for their retrieval that will make sure that if the attribute is not filled on the asset, the default value from project will be returned instead.

Based on client feedback and past year working with avalon we suggest these to be the required ones

resolutionWidth
resolutionHeight                  

handleStart
handleEnd

frameStart
frameEnd

fps

edit_in and edit_out are problematic, because they overlap with editorial metadata terminology and it became very confusing as soon as we had requests to collect edit specific information like sourceIn/Out and clipIn/Out

We'd also like to split handles to handleStart and handleEnd this is also client request coming from live action VFX show, where this happens quite often. Making this backwards compatible is trivial so I suppose it won't be an issue.

The questions is whether key attributes like this should remain in data or should be promoted to asset directly

tokejepsen commented 5 years ago

The questions is whether key attributes like this should remain in data or should be promoted to asset directly

Is this also a question of whether its required or not?

BigRoy commented 5 years ago

Thanks! Those are great thoughts.

Clip in/out

edit_in and edit_out are problematic, because they overlap with editorial metadata terminology and it became very confusing as soon as we had requests to collect edit specific information like sourceIn/Out and clipIn/Out

Actually that code I've personally never used. Couldn't this data in Maya just not rely on frameStart and frameEnd too? Also, is anyone actually using that? Or why should this be some other piece of data separated from start frame and end frame?

Can we drop edit_in and edit_out completely? Sounds like they are deprecated and old remainders, unless someone else is using them.

startFrame vs frameStart naming convention

Personally, I'm more inclined to startFrame and endFrame but I can see how keeping frame as prefix is sensible next to the other options. Guess I'm just used to the old.

I'm fine with both. We'll just need to figure out how to keep backwards compatibility. Do we fall back to startFrame if frameStart is not found?

Optional data instead of required data

Also, all of this data I'd not make required but optional - make sure your code fails nicely. This then automatically maintains some sort of backwards compatibility without breaking things. Right?

Hierarchical retrieval of data like startFrame, etc.

On top of that adding functions for their retrieval that will make sure that if the attribute is not filled on the asset, the default value from project will be returned instead.

Note that this can be very slow. My recommendation would be to think about this the other way, when setting/creating/changing some value for resolution, start frame and end frame automatically assign it downstream to the assets too. That way, you wouldn't need to do multiple queries for each asset. Could that be a viable option?

Querying upwards...

A correct hierarchical traversal should go up through visualParent through to project. If you would only decide to do a Project query as fallback then you'd get to something like this:

  1. Get asset key/value 1b. To make sense to the user you should actually recursively (or in while loop) go through the visualParent to see if any parent in the hierarchy has the data.
  2. If not on asset, get project key/value.
  3. If not on project, return default value.

If you separate out functions for startFrame, endFrame, etc. for the retrieval of the data you'll end up doing lots of queries to the database. Worst case scenario to the asset, through a bunch of parents to the project to only return a default value. That could be over 3 queries per key/value per asset to get a single value.

Best case scenario you'd once query all data of asset + parents + project and in Python get the value from where it's found?

These kind of queries I believe are more optimal in MySQL database, yet for MongoDB I believe it's more sensible to embed the values in the documents - since that's what they end up being related to anyway.

MongoDB does allow aggregation but still you will suffer performance plus queries will get more complex.

Storing downwards...

When storing downwards it means you'll only have to do the traversals whenever you change the resolution or specific settings. This you'll much likely do much much less, plus it's very explicit on the asset. So, could it be an option @mkolar?

Potentially it could of course "ask" whether you'd like to apply downstream or not. If not, then your downstream assets would keep their old values, remaining unchanged. Then only newly created assets would get the new upstream defaults.

Split handles into handleStart and handleEnd

We'd also like to split handles to handleStart and handleEnd this is also client request coming from live action VFX show, where this happens quite often.

πŸ‘

Data or directly on asset?

The questions is whether key attributes like this should remain in data or should be promoted to asset directly.

I would vote for data. For me they are still metadata of the asset + also not required to be a functional asset.

Is this also a question of whether its required or not?

@tokejepsen not necessarily, you can have optional keys in the schema even outside of data. But I do like to see that things in data are definitely optional. So, technically it doesn't have to be. But in reality, yes I think it's a helpful differentiation for most cases.

Note that some elements require specific values in data for example versions. I think that's the only one. And note that families might move over to subset, see #246

tokejepsen commented 5 years ago

This might be for a different issue, but we should maybe validate the naming convention of the data members?

Currently we have snake case and camel case, which does add to the confusion for newcomers. "Was it edit_in or editIn, frame_start or frameStart".

BigRoy commented 5 years ago

It should be camelCase. I believe that's the same as what is recommend for data in Pyblish. However, I can't quickly find the recommendation... ;)

However, I wouldn't limit what "custom data" someone wants to put into the database, e.g. disallowing someone to store my_own_custom_key will only have people start asking questions and a lot of discussions. I think we just need to make it super clear what data is default data, which will always be camelCase, and what data that should be and what it is used for as I started in a little wiki page

tokejepsen commented 5 years ago

@BigRoy just a headsup in that case batch_id should be batchId?

BigRoy commented 5 years ago

just a headsup in that case batch_id should be batchId?

Changed in that wiki. :) Thanks.


@mkolar any thoughts on the above discussion? And @davidlatwe are you using edit_in and edit_out? Are you also adhering to camelCase for your data variables?

mkolar commented 5 years ago

Personally, I'm more inclined to startFrame and endFrame but I can see how keeping frame as prefix is sensible next to the other options. Guess I'm just used to the old.

We feel quite strongly that to make it consistent we should go with frameStart, frameEnd, handleStart, handleEnd, resolutionWidth, resolutionHeight and so on. They organisz together nicely and it's just a lot clearer

Also, all of this data I'd not make required but optional - make sure your code fails nicely.

I don't really have problems with that, but keep in mind that there are tools in vanilla avalon that work with this data and if you don't have it on the asset, they simply do nothing, which is really confusing to the user. We can of course validate on pype side, but if there is set resolution button in avalon gui, I'd expect it to always set resolution to what it needs to be, rather than possible silently faliling and not doing anything. That give user false sense of security.

My recommendation would be to think about this the other way, when setting/creating/changing some value for resolution, start frame and end frame automatically assign it downstream to the assets too.

this is what we have right now and it works fairly well. still I think that have a sensible fallback on the project level would make things much easier on management. Not a biggie though.

I would vote for data. For me they are still metadata of the asset + also not required to be a functional asset.

I'd argue that shot without fps and framerange is not a functional asset as you don't know what data you're supposed to be creating ;)

mkolar commented 5 years ago

we definitely vote for camelCase for recognized attributes. Here's list of the attributes we suggest should be recognised and reserved globaly in avalon

fps, frameStart, frameEnd, handleStart, handleEnd, resolutionWidth, resolutionHeight, pixelAspect

We tried cleaning all of avalon and pype to use these attributes on the hackaton this week and suprisingly it wasn't such a massive job and it's all working smoothly on our end now. Maya, nuke, loader all using the same attributes. We did have to write a script that converts all the previous attribute possiblities to their new counter parts though. Needed to be ran once on a DB to make it backwards compatible.

davidlatwe commented 5 years ago

are you using edit_in and edit_out? Are you also adhering to camelCase for your data variables?

Yes, I am using them but frameStart, frameEnd would work, too. And yes, camelCase definitely :D

BigRoy commented 5 years ago

fps, frameStart, frameEnd, handleStart, handleEnd, resolutionWidth, resolutionHeight, pixelAspect

All seem fine with me, as long as they are not required. πŸ‘ That's to maintain backwards compatibility plus allow simple setups of a project/asset (without this data) to continue.

I don't really have problems with that, but keep in mind that there are tools in vanilla avalon that work with this data and if you don't have it on the asset, they simply do nothing, which is really confusing to the user.

First off, these things should be documented way clearer. E.g. like Project/Asset data and Published Data but in an easy findable spot in the original Avalon documentation.

but if there is set resolution button in avalon gui, I'd expect it to always set resolution to what it needs to be, rather than possible silently faliling and not doing anything. That give user false sense of security.

Agreed, if they are there and they don't work then it should not fail silently and should describe what is wrong with the asset/project-data to allow someone to fix it. I'd still say they are not required, it should just be clear to the end user what's up.

this is what we have right now and it works fairly well. still I think that have a sensible fallback on the project level would make things much easier on management. Not a biggie though.

If I understand correctly you're currently using a setup that already propagates the data downstream. Again, it's a good recommendation to keep queries simple and optimized for solely the asset's settings (especially if you'd also want to traverse parent hierarchy of the assets). :)

To be able to set it like that should become part of core's features so it's trivial to do so.

I'd argue that shot without fps and framerange is not a functional asset as you don't know what data you're supposed to be creating ;)

Potentially, yes. But we also consider a single visual a "still" or even the asset (e.g. a model) is also an asset, exactly the same in the avalon database as a shot.

We have run almost 60 projects now with Avalon and I think in the first 75% we weren't even setting the FPS in any data in the project (because we got lucky with a consistent default FPS I suppose). Again, it's crucial data once you need it but I'd still say it's not required for an asset (or even project) to have values for it as long as you as a studio have some default you can trust.

We did have to write a script that converts all the previous attribute possiblities to their new counter parts though. Needed to be ran once on a DB to make it backwards compatible.

That doesn't sound like backwards compatible. but like requiring to update the database. ;) For me personally, I'm fine as I now know I need to run that as I update. However, it shouldn't be that way.

Technically the database should allow to remain as is and things should e.g become:

frame_start = data.get("frameStart", data.get("startFrame", None))
frame_end = data.get("frameEnd", data.get("endFrame", 0))
handle_start = data.get("handleStart", data.get("handles", 0))
handle_end= data.get("handleEnd", data.get("handles", 0))

In short,

mkolar commented 5 years ago

Yes, let's refactor the names to frameStart, frameEnd, handleStart, etc. No, keep them as optional data, not required. Yes, hierarchical data is sweet. Let's set values downstream as opposed to querying upstream.

I'm fine with all those points. One comment on hierarchical data though. We suggested only falling back to project if its not found on asset, because avalon in itself doesn't recognize hierarchy and it's questionable whether it should.

We use deep hierarchy for projects, but I also love the fact that if need be we could build a UI that would show all the assets in a completely different light. The key point I think is, that if we start supporting hierarchy with attributes, than that hierarchical structure must become core requirement of avalon. Currently this is done by visualParent which is optional data, but if we start traversing it for other things that just visual cue in the UI, shouldn't we make it required?

That would also make transition from using silos to purely visualParent much easier. (we have that code running and would just need cleanup) @iLLiCiTiT has already started preparing PRs

mkolar commented 5 years ago

Let's set values downstream as opposed to querying upstream.

How should we go about that. in pype we have the advantage of running event server that catches changes on parents and propagates them downwards. But avalon doesn't have anything like that.

Also this is not trivial to do architecture wise. For example. When you set FPS on the sequence, you have no way of knowing what children to propagate it to, because they all have values from before and you can't be sure which of those values have been set explicitly and which have been inherited. That's of course unless you somewhat tag the ones that have override on them.

Ftrack deals with this by doing the upstream querying. If asset doesn't have value on an attribute, look through the parents until you get a hit and use that. We ran into this issue when we were doing hierarchy attribute sync from ftrack to avalon, where it's impossible to tell what is inherited value and what is explicit

BigRoy commented 5 years ago

The visualParent I think can always still fall back to being None. In a way you'll unfortunately have to to remain completely backwards compatible with any ancient project. You can see it as a requirement however it really isn't since it would break the backwards compatibility for projects where it wasn't required, right?

So we must consider an asset that does not have visualParent to just be a top-level asset. And with the change to drop out silo then any asset without silo and visualParent specified becomes a top-level asset. (Otherwise the silo would still be a visual parent asset in the UIs.)

mkolar commented 5 years ago

The visualParent I think can always still fall back to being None

true

BigRoy commented 5 years ago

Also this is not trivial to do architecture wise. For example. When you set FPS on the sequence, you have no way of knowing what children to propagate it to, because they all have values from before and you can't be sure which of those values have been set explicitly and which have been inherited. That's of course unless you somewhat tag the ones that have override on them.

Good point. I'm not saying it is completely trivial for those situations. It would still be trivial when syncing from Ftrack to Avalon (since you can rely on Ftrack having the correct data I suppose).

And with Avalon. The idea is that once you start managing this data hierarchically with Avalon that the UI clearly shows what "children" would receive a change (e.g. by default maybe the ones that still had the same value as the parent) where one could just review whether the recursive change is as you'd want it and tick on/off which ones they actually want to update. I think this is resolved with good UX-design.

I've tried the upstream querying on one of our larger projects and it was just sluggish whenever I queried the values for all assets. But then I again, we might be able to optimize that too. Or only fall back to project as opposed to querying through all parents. Then you could get it once and use it for all assets making it just one additional query as opposed to 1 per asset.

Maybe it's best to see a prototype regarding either one of the two and discuss pros/cons and consider it a different issue from unifying the attribute names. :)

mkolar commented 5 years ago

Maybe it's best to see a prototype regarding either one of the two and discuss pros/cons and consider it a different issue from unifying the attribute names. :)

agreed