getavalon / core

The safe post-production pipeline - https://getavalon.github.io/2.0
MIT License
218 stars 49 forks source link

Optimisation: Improve host ls() speed by removing schema validation #457

Closed BigRoy closed 4 years ago

BigRoy commented 5 years ago

Issue

In #456 the performance checks brought forward that on very large scenes with many containers the schema validation of the container can degrade overall performance to an unwanted level for host.ls(). Another performance measurement can be found at the bottom of this comment.

The comparisons:

Scene with roughly 24 containers:
- 0.309000015259 seconds without validation
- 0.477999925613 seconds with validation

Scene with 1279 container (by @davidlatwe) <- this is solely `parse_container()`
- 2.621 seconds with validation
- 0.155 seconds without validation

Note how for many containers the speed difference is very noticeable.

This was tested with host Maya, but it's the same schema.validate check in other hosts.

Discussion

The schema.validate(container) is a check to ensure the container behaves as expected for a container as it is generated. Without it nothing ensures that whatever the function returns actually behaves exactly like a container.

Should we remove the validation altogether?

One could argue that Python tests should ensure the API is stable to return correct containers (e.g. tests that run on example scenes with the validation) as opposed to always needing to run this on every container on host.ls() in production. That way we can still ensure the API always ends up working as expected. This would then result in the same expected safety yet without the speed degradation in production.


I'm opening this issue because of some comments were made in #456 about whether to allow toggling the validation through an argument to host.ls() or whether to remove it alltogether so that UIs would also allow the speed improvement.

tokejepsen commented 5 years ago

Personally opening the scene manager was not been a problem for us, but we don't deal with complicated scenes atm, so I can't vote for this. If it's a problem for people with complicated scenes, then I would say go for it.

davidlatwe commented 5 years ago

I propose we add argument validate_container to ls and set default to False. So it would be :

def ls(validate_container=False):
    ...
    for container in sorted(container_names):
        data = parse_container(container, validate=validate_container)
        ...

Why ?

It's a really, really rare case that any container's required attribute/entry would be changed/deleted by accident.

So I think we should take the leap, since we now know the benefit from skipping schema validation on listing containers is way more than protecting them from all thumbs when you have lots of containers on hand and need to be parsed.

Besides, I think the schema validation is only a post check instead of a protection in this case.

tokejepsen commented 5 years ago

Why leave the validation option in there for rare cases? Would it not be better to provide the validation on its own instead?

BigRoy commented 5 years ago

Why leave the validation option in there for rare cases? Would it not be better to provide the validation on its own instead?

I agree. If we're moving that direction I'd vote to remove it alltogether from ls and just allow anyone to do the following:

from avalon import api, schema

host = api.registered_host()
for container in host.ls():
    schema.validate(container)

Because just validating the container is as trivial as schema.validate(container).

davidlatwe commented 5 years ago

I had a think and yes, that's better and keeping things simpler. :relaxed:

A little note for future change log : https://getavalon.github.io/core/#avalon.schema.validate