getavalon / core

The safe post-production pipeline - https://getavalon.github.io/2.0
MIT License

Optimize Maya host.ls() method #456

Closed BigRoy closed 5 years ago

BigRoy commented 5 years ago

Issue

In large scenes with many nodes the host.ls() query for host Maya is slower than it needs to be. I've seen cases where a single query took 3.7 seconds.

What's changed?

The _ls() method, which lists all relevant Avalon containers in Maya by node name, queried the scene inefficiently in large scenes. It previously did a full query over all nodes by their ".id" attribute, whereas it only needs to consider objectSets, since those are the container nodes in the Maya pipeline. It now uses the Maya Python API 2.0 to iterate over only the dependency nodes of type maya.api.OpenMaya.MFn.kSet.

Also, it did that full iteration twice, because it additionally had to look up the backwards-compatible id pyblish.mindbender.container. It now checks in a single pass whether the id value matches either of the two, instead of iterating the scene twice.
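To illustrate the single-pass idea, here is a hypothetical pure-Python sketch (not the actual Avalon code, which iterates Maya dependency nodes): instead of scanning the scene once per id value, we test membership against a set of all accepted ids during one iteration. The "pyblish.avalon.container" id value is an assumption for illustration.

```python
# Hypothetical sketch of the single-pass id check.
IDENTITIES = {
    "pyblish.avalon.container",      # assumed current container id
    "pyblish.mindbender.container",  # backwards-compatible id from this PR
}

def find_containers(nodes):
    """Yield nodes whose 'id' value matches any accepted container id."""
    for node in nodes:
        if node.get("id") in IDENTITIES:
            yield node

scene = [
    {"name": "modelMain_CON", "id": "pyblish.avalon.container"},
    {"name": "oldAsset_CON", "id": "pyblish.mindbender.container"},
    {"name": "pCube1"},
]
print([node["name"] for node in find_containers(scene)])
# → ['modelMain_CON', 'oldAsset_CON']
```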

Speed comparison

While developing the optimization I built several prototype replacement functions and compared timings in both small and large scenes. In all cases the new implementation was much faster.

In my large test scene the difference was huge (times in seconds):

Current: 3.75600004196
New 1: 0.605000019073
New 2: 0.0550000667572
New 3: 0.325999975204

That the new method is faster was also confirmed by @tokejepsen in a small test scene (times in seconds):

Current: 0.00500011444092
New 1: 0.0019998550415
New 2: 0.000999927520752
New 3: 0.000999927520752

The implementation in this PR is New 2 in those tests.

Assumption: Containers are objectSets

In Avalon's Maya pipeline, containers are created with the containerise function exposed in avalon-core, which creates the objectSet encapsulating the nodes that belong to a specific piece of loaded data. As such, this implementation assumes that every node we allow to be a container is in fact an objectSet, because that is what avalon-core provides, and it is what I've seen used (logically) throughout all available configs.

However, do note that if anyone ever relied on a random node having the .id attribute being detected as a container, with this change it will no longer be detected unless it is an objectSet.
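A hypothetical model of this behavioural change, using plain dicts rather than Maya nodes (the id value and node names are assumptions for illustration): only objectSet-typed nodes are considered, so a non-set node carrying the id attribute is no longer detected.

```python
# Hypothetical model: only objectSet nodes count as containers.
CONTAINER_ID = "pyblish.avalon.container"  # assumed id value

def ls_containers(nodes):
    """Yield names of nodes that are objectSets carrying the container id."""
    for node in nodes:
        if node["type"] == "objectSet" and node.get("id") == CONTAINER_ID:
            yield node["name"]

scene = [
    {"name": "assetA_CON", "type": "objectSet", "id": CONTAINER_ID},
    {"name": "rogueTransform", "type": "transform", "id": CONTAINER_ID},
]
print(list(ls_containers(scene)))  # the transform is skipped
# → ['assetA_CON']
```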

tokejepsen commented 5 years ago

I think assuming objectSets are containers is fair enough. If this breaks anyone's pipeline, it will encourage a good discussion about it.

BigRoy commented 5 years ago

@davidlatwe could you give this a test run on your end, including testing whether "nested" containers still work as expected for you? I've tested these cases, including with scrambled namespaces, as I knew that could be a use case on your end (which is why I kept the other optimization tests using cmds.ls for recursive querying). But since it's not an actual production case on our end, I was unable to test a production example of that. My expectation is that it should still work, but better safe than sorry.

mottosso commented 5 years ago

Neat.

BigRoy commented 5 years ago

Does this mean we can remove lsattr too? Reduce a few lines?

I believe so. I think lsattr and lsattrs in colorbleed.maya.lib are now unused. Will check on Monday.

Note that this ups the minimum required Maya version to 2016, where this iterator was introduced.

Good point. If ever needed we can add a fallback to cmds.ls (type="objectSet")

Did you consider cmds.ls(type="objectSet") too?

Sure did. But somehow it was still slower in heavier scenes, as if it still tries to filter out some garbage. Will do another check on Monday.

davidlatwe commented 5 years ago

Just tested it on my end with a big set-dressing scene: there were 1279 containers and the total node count was 96463. (Nothing broken :smile:)

Speed comparison

BigRoy commented 5 years ago

3.57499980927 sec

Interesting that it's still that slow @davidlatwe.

I wonder what the next bottleneck would be in that case. It must be the parsing of the container, which uses lib.read. If we allow parse_container to do that itself through the API, that could be optimized a lot too, especially because we know which attributes belong to the container itself and could read only those string values. This could potentially even bring your scene to under a second.

Or, is your slowdown because of a custom config call within ls()?

Should I look into optimizing this further?

davidlatwe commented 5 years ago

Sorry for the delay, had an extra day off due to the weather here :cloud_with_lightning_and_rain:

I wonder what the next bottleneck would be in that case. It must be the parsing of the container that uses lib.read.

I have tested that if I remove the parse_container call, the time drops to less than 1 sec. So yeah, +1 for letting parse_container do the reading itself via the Maya API for the optimization.

Or, is your slowdown because of a custom config call within ls() ?

No. I have changed the code here so that the custom call actually returns an empty dict now. But if parse_container changes to only read the default attributes defined in the schema, then I will need the custom call to get my additional attributes. :relaxed:

BigRoy commented 5 years ago

But if the parse_container changed to only read the default attributes defined in schema, then I will need the custom call to get my additional attributes.

Haha. So in your case it's beneficial that it includes additional data not defined in the schema. What's the verdict? Would we want parse_container to solely return what is defined as attributes in the schema? @mkolar @tokejepsen @mottosso ?

Option 1)

Leave it as is.

I'm thinking of maybe leaving it as is for now, since grabbing only the relevant attributes might be slow too with the backwards compatibility for container-1.0, which has many more required attributes (of which, funnily enough, schema isn't one).

Option 2)

First get the schema version, then get only those relevant attributes. 👍

We could maybe check whether the schema attribute is present and, if not, assume the container-1.0 schema. It would make the code quite verbose though:


def parse_container(node, validate=True):

    # pseudocode sketch
    from maya.api import OpenMaya as om
    import avalon.schema

    fn = om.MFnDependencyNode(node)

    container = dict()

    # Get container attributes. Note: in API 2.0 findPlug raises on a
    # missing attribute, so check hasAttribute first.
    if not fn.hasAttribute("schema"):
        # assume container-1.0
        container["schema"] = "container-1.0"
        attributes = [
            "id",
            "name",
            "author",
            "loader",
            "families",
            "time",
            "subset",
            "asset",
            "representation",
            "version",
            "silo",
            "path",
            "source"
        ]

    else:
        schema = fn.findPlug("schema", True).asString()
        if schema == "container-2.0":
            attributes = [
                "schema",
                "id",
                "name",
                "namespace",
                "loader",
                "representation"
            ]
        else:
            # Unknown container schema
            return None

    # Collect the values
    for key in attributes:
        if fn.hasAttribute(key):
            container[key] = fn.findPlug(key, True).asString()

    # `objectName` is special: it doesn't refer to an
    # attribute but to the node name.
    container["objectName"] = fn.name()

    if validate:
        avalon.schema.validate(container)

    return container

Option 3)

Read the schema version from container and directly get the required attributes from the actual schema. For example:

import avalon.schema

schema = "container-2.0"
attributes = avalon.schema._cache[schema + ".json"]["required"]
print attributes

Which if we add a get() function to schema.py could become:

import avalon.schema

schema = "container-2.0"
attributes = avalon.schema.get(schema)["required"]
print attributes
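A minimal pure-Python sketch of such a get() helper; the _cache below is a stand-in for avalon.schema's internal _cache, so the real keys and values may differ.

```python
# Stand-in for avalon.schema._cache (assumed structure: filename -> document).
_cache = {
    "container-2.0.json": {
        "required": ["schema", "id", "name",
                     "namespace", "loader", "representation"],
    },
}

def get(schema):
    """Return the cached schema document for a name like 'container-2.0'."""
    return _cache[schema + ".json"]

attributes = get("container-2.0")["required"]
print(attributes)
```

This keeps the `.json` suffix an implementation detail of the cache rather than something every caller has to know about.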
BigRoy commented 5 years ago

So I did a quick speed comparison with the above Option 3 implemented with this code:

def parse_container(container, validate=True):
    from maya.api import OpenMaya as om
    import avalon.schema

    sel = om.MSelectionList()
    sel.add(container)
    dep_node = sel.getDependNode(0)
    fn = om.MFnDependencyNode(dep_node)

    data = dict()

    # Get container schema version
    if fn.hasAttribute("schema"):
        plug = fn.findPlug("schema", True)
        schema = plug.asString()
    else:
        # Backwards compatibility
        schema = "container-1.0"

    # Get attributes from schema
    schema_doc = avalon.schema._cache.get(schema.rsplit(":", 1)[-1] + ".json",
                                          None)
    if not schema_doc:
        # Unknown schema version. Ignore..
        return {}

    attributes = schema_doc["required"]

    # Collect the values
    for key in attributes:

        if key == "objectName":
            # `objectName` is special: it doesn't refer to an
            # attribute but to the node name.
            data["objectName"] = fn.name()
            continue

        # findPlug raises on a missing attribute, so guard with hasAttribute
        if fn.hasAttribute(key):
            data[key] = fn.findPlug(key, True).asString()

    if validate:
        avalon.schema.validate(data)

    return data

I timed it in a scene with 12 containers, and did 100 iterations of ls() with:

import time
s = time.time()
for x in range(100):
    list(pip.ls())
e = time.time()
print e-s

The difference between old and new parse_container:

Old: 1.92000007629
New: 1.77500009537

For a small scene the speed difference seems negligible. @davidlatwe, could you try giving this a go with your heavy scene? With these current stats I must admit the more complex code here definitely isn't worth it.

It actually seems that with many containers most time is spent on schema.validate: in a very light scene, disabling it brings this down to 0.19s for 100 iterations on the scene with 12 containers.


Another speed comparison in a heavier scene with 24 containers and many nodes, running 5 iterations of ls():

**NEW**
New with validate False
0.28200006485

New with validate True
0.430000066757

**OLD**
Old with validate False
0.309000015259

Old with validate True
0.477999925613

Parsing the container through the API doesn't seem worth it for scenes with under a hundred containers. I wonder if it helps more in your scene with 1279 containers, @davidlatwe, but at this stage I think the more complex API-based code isn't justifiable.

davidlatwe commented 5 years ago

Thanks !

So I have tested it with my heavy scene, and:

Looks like disabling validation is the best option here !

BigRoy commented 5 years ago

Does this mean we can remove lsattr too? Reduce a few lines?

@mottosso Technically we could, however it's also exposed in the host's __init__.py and as such is technically part of the API. It could be in use by someone, so I'll leave this PR as is and won't remove it. We can leave that to a future cleanup PR.

BigRoy commented 5 years ago

This PR is ready to be merged. 👍

davidlatwe commented 5 years ago

Actually, could we add a validate=True kwarg to ls() so we can opt out of validation when parsing containers?

mottosso commented 5 years ago

What, there's validation somewhere? :O I can't spot it. What kind of validation is happening?

BigRoy commented 5 years ago

What, there's validation somewhere? :O I can't spot it. What kind of validation is happening?

It's only the schema validation, not a Pyblish validation. ;)

Actually, could we add validate=True kwarg to ls() so to be able to opt-out validation when parsing containers ?

Sure. Should I just make it ls(validate=True) ? Or does it need to be ls(**kwargs)?

davidlatwe commented 5 years ago

ls(validate=True) would be good :D

BigRoy commented 5 years ago

ls(validate=True) would be good :D

Exposed the validate toggle through ls() with https://github.com/getavalon/core/pull/456/commits/f588483c338928bfc21f4ead2649435a19030328
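As a rough illustration of the wiring, here is a hedged pure-Python sketch: parse_container and ls mirror the names from this PR, but their bodies are stand-ins, since the real implementation queries Maya objectSets and avalon.schema.

```python
# Hypothetical sketch of threading a validate flag from ls() down to
# parse_container(). The container names and validation are stand-ins.
def parse_container(container, validate=True):
    data = {"objectName": container, "schema": "container-2.0"}
    if validate:
        # stand-in for avalon.schema.validate(data)
        assert data.get("schema"), "container is missing its schema"
    return data

def ls(validate=True):
    for container in ("assetA_CON", "assetB_CON"):  # stand-in scene query
        yield parse_container(container, validate=validate)

print([c["objectName"] for c in ls(validate=False)])
# → ['assetA_CON', 'assetB_CON']
```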

mottosso commented 5 years ago

Of course it's not Pyblish validation?

Would prefer removing the validation altogether... oh, and you already made the change. Please heavily discuss any change to the API. API changes are forever, ever ever. In this case there must be a clear reason to add validate=True; what's the use case? What options have you tried, and why didn't those work? When does it happen? Is it a problem or a convenience? Discuss.

Don't change the API, unless you discuss discuss discuss.

tokejepsen commented 5 years ago

If you introduce an optional validation and it's enabled by default, does that mean we won't get the speed up benefit by using the GUI tools in Avalon?

BigRoy commented 5 years ago

Oh, and you already made the change.

Sure. Agreed. ;) Maybe I pushed ahead too rapidly with the commit. :) The change wasn't meant to be merged directly but to leave things in a state for the PR discussion to continue, though I guess it does create a bit of "noise" in the commits.

Would prefer removing the validation altogether

@mottosso This could be dangerous if by any chance someone accidentally created a set with the same .id where they didn't intend to. I've had it happen only once, and it was due to my own oversight. I guess we can also see that as one's own responsibility and remove the validation altogether. If one wants to validate a container manually, it's trivial to do so with avalon.schema.validate(container). It's quite a speedup for heavy scenes when disabled, so I'd be happy to remove it.

If you introduce an optional validation and it's enabled by default, does that mean we won't get the speed up benefit by using the GUI tools in Avalon?

@tokejepsen Correct. The UI would still use the slower version, which is mostly noticeable with a very high number of containers. The original optimization at the beginning of this PR is there regardless; this validation toggle is only a further optimization.


The reason for removing the validation altogether is currently purely an optimization for heavy scenes. I guess we can have a little "vote" on removing it? Is there anyone who cannot live with the validation being removed from ls()? @mkolar @tokejepsen @davidlatwe

davidlatwe commented 5 years ago

Maybe revert the previous commit (f588483), merge, and open another issue ?

An issue to discuss whether to validate container data entries by default, or to remove the validation entirely from parse_container. The outcome of that discussion should then be applied to all hosts in the same manner.

BigRoy commented 5 years ago

Maybe revert the previous commit (f588483), merge, and open another issue ?

I'll just revert the change either way, because the initial consensus seems to oppose the idea anyway. ;) Whether we use this PR or another to remove the validation altogether, I don't mind either way. Feel free to merge once I've reverted it, and open a clear issue for the "validations". :)

mottosso commented 5 years ago

I'll just revert the change either way

Thanks. It makes me very nervous that you're willing to make changes to the API so lightly. Any change to the API better have a darn good reason for happening. Anything else is for your own configs, not core.

;)

;)

davidlatwe commented 5 years ago

Merging this !