PixarAnimationStudios / OpenUSD

Universal Scene Description
http://www.openusd.org

UsdUtilsStitchLayers extremely slow when stitching large collections. #1998

Open LucaScheller opened 2 years ago

LucaScheller commented 2 years ago

Hey, we (at RiseFX) have run into the following issue a few times lately:

When collections have a large member count (it becomes noticeable with collections that have > 2000 members), the usdstitch function/cmdline tool becomes quite slow (we're talking over 100x slower, or even more depending on the collection size, and the problem adds up quickly when we have multiple collections).

If you execute the following:

usdstitch -o StitchingOutput.usd StitchingInput.usd

It takes over 30 seconds. The example file has 20 000 prims with a collection that contains all of them.

Would be great if someone could look into this to see why it is taking so long and if it could be optimized. Cheers, Luca

Attachment: StitchingInput.zip

sunyab commented 2 years ago

Filed as internal issue #USD-7575

spiffmon commented 2 years ago

Hi @LucaScheller, in general, that process would take some real work to optimize, basically end-running around the low-level APIs that know how to do the "merging" of targets robustly. There is a specific case we could detect and do better for, though: do all of the clips (or at least a longish sequential prefix of them) have the same collection(s)? As we merge layers together, we could check whether the result's and the next clip's target ops are identical and, if they are, skip trying to merge them, which is the big expensive part.

Also, big, fully explicit collections are, in general, not the way to go if they can be avoided. If you can instead leverage the include/exclude features so that fewer paths need to be authored, I think you'll get better performance in multiple places. We provide UsdUtilsComputeCollectionIncludesAndExcludes (and UsdUtilsAuthorCollection) to help you generate more "optimal" collections from big lists of explicit targets on a stage.
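To illustrate why include/exclude authoring can shrink a collection so much, here is a minimal pure-Python sketch. It is a toy model, not the algorithm behind UsdUtilsComputeCollectionIncludesAndExcludes (which applies its own heuristics), and the names `compact_collection` and `expand` are hypothetical. It assumes paths are plain strings, that `all_paths` contains every ancestor of every path, and that a path belongs to the collection when its nearest listed ancestor-or-self is an include rather than an exclude.

```python
def compact_collection(all_paths, members):
    """Toy model: return (includes, excludes) that reproduce `members`.

    Assumes `all_paths` is a complete hierarchy (every ancestor is listed)
    and that the nearest listed ancestor-or-self decides membership.
    """
    members = set(members)
    includes, excludes = [], []
    state = {}  # path -> whether that path is a member
    # Visit shallow paths first so each parent's state is known before its children.
    for path in sorted(all_paths, key=lambda p: p.count("/")):
        parent = path.rsplit("/", 1)[0]
        inherited = state.get(parent, False)
        state[path] = path in members
        # Only author a path where membership flips relative to its parent.
        if state[path] != inherited:
            (includes if state[path] else excludes).append(path)
    return includes, excludes


def expand(all_paths, includes, excludes):
    """Resolve the toy collection back into an explicit member set."""
    includes, excludes = set(includes), set(excludes)
    result = set()
    for path in all_paths:
        probe = path
        while probe:
            if probe in excludes:  # an exclude wins at the same path
                break
            if probe in includes:
                result.add(path)
                break
            probe = probe.rsplit("/", 1)[0]  # walk up to the parent
    return result


if __name__ == "__main__":
    paths = ["/root"] + [f"/root/prim_{i}" for i in range(20000)]
    inc, exc = compact_collection(paths, paths)
    print(len(paths), "explicit targets become", len(inc), "include(s)")
```

In this toy model, a collection that lists a root and all 20 000 of its children explicitly collapses to a single include, which is the kind of reduction the include/exclude features are meant to allow.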

Cheers, --spiff

LucaScheller commented 2 years ago

Hi @spiffmon, thanks for your explanation and sorry for this late response.

The clips have the same collections 99% of the time, yes, so the "is the next clip's target ops identical" check would probably fix it most of the time.

Thanks for the tips about how to reduce the collection sizes. We've started requiring artists to limit collection sizes, which seems to have reduced the occurrences of the issue.

A question about collections/relationships: the issue also seems to happen with normal relationship properties. We were wondering what exactly is being merged when stitching. Does it simply do a "combine unique values of list a + list b", or does it do more complex path checking? In Python, a set().union is quite fast, so I'm guessing there are more complicated Sdf.Path checks happening under the hood? If the "is the next clip's target ops identical" check were implemented, would it check for identical relationships on each :includes/:excludes relationship separately, or would it still perform the check against the "resolved" collection?

Have a great week! Luca

spiffmon commented 2 years ago

Hi @LucaScheller, the "collection merging" is not doing anything more sophisticated than merging the two relationships, layer-wise. But the work that goes into merging two listOps is non-trivial... which is not to say it couldn't be improved, but it hasn't been a hot spot in our pipeline, as we tend to run into other dominant problems when we have really populous relationships.

To your closing question, yes, the "is identical" check would compare the "merged up to this point" ListOp against that of the next layer being merged, which possibly limits its usefulness as an optimization.
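The proposed check can be sketched in a few lines of Python. This is a toy model only: the real listOp merge is more involved than the ordered union used here as a stand-in, and `stitch_targets` is a hypothetical name, not part of USD. It shows both the fast path and the limitation mentioned above, since the comparison is always against the result merged so far.

```python
def stitch_targets(layers):
    """Toy stitch: merge per-layer target lists, skipping identical ones.

    `layers` is a list of target lists (plain strings here).
    Returns (merged_targets, number_of_expensive_merges).
    """
    merged = list(layers[0]) if layers else []
    merges = 0
    for targets in layers[1:]:
        # Cheap check: if the next layer matches the result so far, skip the merge.
        if targets == merged:
            continue
        merges += 1
        # Stand-in for the expensive merge: an ordered union of the two lists.
        seen = set(merged)
        for target in targets:
            if target not in seen:
                seen.add(target)
                merged.append(target)
    return merged, merges


if __name__ == "__main__":
    base = [f"/root/prim_{i}" for i in range(5)]
    # Identical layers never trigger a merge.
    print(stitch_targets([base, base, base])[1])
    # One differing layer changes the merged result, so later layers
    # that match the original no longer hit the fast path.
    print(stitch_targets([base, base + ["/root/extra"], base])[1])
```

In the second call, the one layer that adds a target makes the merged result diverge from every other layer, so the layers after it fall back to the slow path even though they are identical to each other.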

sunyab commented 1 year ago

Just happened to poke at this today -- there seems to be some pathological behavior in the crate file format's processing of relationship targets. If I convert the StitchingInput.usd file to a .usda file, usdstitch completes in ~1 second vs. 25 seconds with the original .usd file.

So as a temporary workaround, you could convert .usd files to .usda, run usdstitch, and then convert back to .usd.
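A rough sketch of that round trip, assuming the `usdcat` and `usdstitch` command-line tools are on the PATH and that `usdcat` picks the output format from the file extension. The helper names, the scratch directory and the intermediate file names are hypothetical, and layers that use relative asset paths may need extra care when copied into a scratch location.

```python
import subprocess
from pathlib import Path


def workaround_commands(inputs, output, scratch_dir):
    """Build the command lines for the .usd -> .usda -> stitch -> .usd round trip."""
    scratch = Path(scratch_dir)
    # Index the intermediate names so inputs with the same stem do not collide.
    text_inputs = [
        scratch / f"{i:04d}_{Path(src).stem}.usda" for i, src in enumerate(inputs)
    ]
    stitched = scratch / "stitched.usda"
    # 1. Convert every input to a text .usda layer.
    commands = [
        ["usdcat", "-o", str(dst), str(src)] for src, dst in zip(inputs, text_inputs)
    ]
    # 2. Stitch the text layers into a text result.
    commands.append(["usdstitch", "-o", str(stitched)] + [str(p) for p in text_inputs])
    # 3. Convert the stitched result back to the requested output file.
    commands.append(["usdcat", "-o", str(output), str(stitched)])
    return commands


def run(commands):
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    for argv in workaround_commands(["StitchingInput.usd"], "StitchingOutput.usd", "scratch"):
        print(" ".join(argv))
```

Building the command lists separately from running them makes it easy to print and inspect the steps before touching any files.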