Optimize inspect(), weld(), and join() for scenes with many primitives sharing vertex streams

donmccurdy commented 8 months ago

For models that heavily reuse the same vertex attributes, parsing and welding are slower than they ideally would be. The attached sample, from a recent three.js discussion, has 10,000 mesh primitives, all sharing the same vertex position and normal attributes, but with different indices. It's not an optimized file (the primitives should be merged), but still, I think that glTF Transform can and should process the file more quickly than it currently does.

When running a weld, I think the issue is that each primitive is handled independently. With inspect, I'm not sure where the cost lies.

oval.gltf.zip

donmccurdy commented 8 months ago

Essentially the same issue is coming up for join() in #1317, which I'll merge into this issue. The functions either need to be smarter about working only with the specific vertices indexed by a given primitive (not its entire, possibly shared, vertex stream), or need to be refactored to operate on some concept of a vertex stream. I suspect the first is more maintainable.
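To illustrate the first option, here is a minimal sketch of compacting one primitive's view of a shared vertex stream, so that later work scales with the vertices the primitive actually indexes. The `compactStream` helper and its signature are hypothetical, written against plain typed arrays; it is not the glTF Transform API.

```typescript
// Hypothetical helper for illustration only; not part of glTF Transform.
interface CompactResult {
  indices: Uint32Array; // indices rewritten to point into the compact attribute
  attribute: Float32Array; // only the elements referenced by the indices
}

function compactStream(
  srcIndices: Uint32Array,
  srcAttribute: Float32Array,
  elementSize: number,
): CompactResult {
  // Map from source vertex index to compact vertex index. Using a Map keeps
  // the cost proportional to the indexed vertices, not the whole stream.
  const remap = new Map<number, number>();
  const dstIndices = new Uint32Array(srcIndices.length);

  for (let i = 0; i < srcIndices.length; i++) {
    const srcIndex = srcIndices[i];
    let dstIndex = remap.get(srcIndex);
    if (dstIndex === undefined) {
      dstIndex = remap.size; // next free slot, in first-use order
      remap.set(srcIndex, dstIndex);
    }
    dstIndices[i] = dstIndex;
  }

  // Copy only the referenced elements into the compact attribute.
  const dstAttribute = new Float32Array(remap.size * elementSize);
  remap.forEach((dstIndex, srcIndex) => {
    for (let j = 0; j < elementSize; j++) {
      dstAttribute[dstIndex * elementSize + j] = srcAttribute[srcIndex * elementSize + j];
    }
  });

  return { indices: dstIndices, attribute: dstAttribute };
}
```

With 10,000 primitives each indexing a small slice of one large stream, every later step (clone, transform, weld) would then touch only that slice.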

Another example file, available at:

https://www.dropbox.com/scl/fi/q1uisxtqoe7hj4d5gzm1r/world.zip?rlkey=1gt9q5t4ns9v8xwwx8gy6geo2&dl=0

donmccurdy commented 6 months ago

Concluding notes:

dequantize ✅

OK to iterate vertex attributes, because the attributes are processed only once, never cloned.

reorder ✅

Calls remapAttribute, iterates over the source attribute. Appears not to write any more than necessary, either way. Improved by switching from remapAttribute to compactAttribute. Further improved by reducing excess iteration over primitives sharing the same vertex stream.
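One way to avoid repeated iteration over a shared vertex stream is to group primitives by the attribute they reference, so that per-stream work happens once per stream rather than once per primitive. This is only a sketch of the idea under that assumption; the `Prim` type and `groupByStream` function are hypothetical and do not reflect how reorder() is implemented.

```typescript
// Hypothetical types for illustration only; not part of glTF Transform.
interface Prim {
  positions: Float32Array; // may be the same array instance across primitives
  indices: Uint32Array;
}

// Group primitives by the identity of their position array, so any work
// that depends only on the stream can run once per group.
function groupByStream(prims: Prim[]): Map<Float32Array, Prim[]> {
  const groups = new Map<Float32Array, Prim[]>();
  for (const prim of prims) {
    const group = groups.get(prim.positions);
    if (group) {
      group.push(prim);
    } else {
      groups.set(prim.positions, [prim]);
    }
  }
  return groups;
}
```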

https://github.com/donmccurdy/glTF-Transform/pull/1358

unweld ✅

Did a full clone of the vertex attributes before overwriting the array with setElement. A shallow clone is enough, and the loop can be a bit tighter. Fixed both.

https://github.com/donmccurdy/glTF-Transform/pull/1344

weld ✅

Use of remapPrimitive+remapAttribute resulted in iteration over the entire vertex attribute without need, even if only a subset was copied. Replaced with compactPrimitive / compactAttribute. Vastly faster on oval.gltf.
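As a rough illustration of why iterating only the indexed vertices helps, the sketch below welds exactly-equal positions while visiting each index once and never scanning the full stream. It is a simplified stand-in (string keys, exact equality, positions only), not the actual weld() implementation, and `weldIndexed` is a hypothetical name.

```typescript
// Hypothetical helper for illustration only; not part of glTF Transform.
function weldIndexed(
  srcIndices: Uint32Array,
  srcPositions: Float32Array,
): { indices: Uint32Array; positions: Float32Array } {
  const seen = new Map<string, number>(); // position key -> welded vertex index
  const dstIndices = new Uint32Array(srcIndices.length);
  const dstPositions: number[] = [];

  // Visit only the vertices this primitive references.
  for (let i = 0; i < srcIndices.length; i++) {
    const offset = srcIndices[i] * 3;
    const x = srcPositions[offset];
    const y = srcPositions[offset + 1];
    const z = srcPositions[offset + 2];
    const key = `${x},${y},${z}`;

    let dstIndex = seen.get(key);
    if (dstIndex === undefined) {
      dstIndex = seen.size;
      seen.set(key, dstIndex);
      dstPositions.push(x, y, z);
    }
    dstIndices[i] = dstIndex;
  }

  return { indices: dstIndices, positions: new Float32Array(dstPositions) };
}
```

The cost here depends on the primitive's index count, so a small primitive stays cheap even when it references a very large shared attribute.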

simplify ✅

Calls dequantizeAttributeArray suboptimally, and passes the entire vertex buffer into simplification when it can only really use the indexed positions.

https://github.com/donmccurdy/glTF-Transform/pull/1381

quantize ✅

Clones and quantizes each attribute in isolation, without regard for indices. Then dedups to clean up at the end. Fixed, mostly with compactPrimitive.

oval.gltf:

before: 156.85s
after: 8.74s

join ✅

Compact primitives to isolate vertex streams and remove unused vertices on a per-primitive level. Then deep cloning primitives and transformPrimitive are cheaper. Then in joinPrimitives, fix call to remapAttribute to reduce allocations.

Remove skipIndices argument to transformPrimitive, now that we can guarantee vertex streams are isolated.
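Once each primitive owns an isolated, compact vertex stream, joining reduces to concatenating the streams and offsetting indices. The sketch below shows that step for positions only; `Stream` and `joinStreams` are hypothetical names, and the real joinPrimitives handles more (multiple attributes, materials, modes) than this does.

```typescript
// Hypothetical types for illustration only; not part of glTF Transform.
interface Stream {
  indices: Uint32Array;
  positions: Float32Array; // 3 floats per vertex, already compacted
}

function joinStreams(streams: Stream[]): Stream {
  // Allocate the outputs once, sized from the compacted inputs.
  let indexCount = 0;
  let floatCount = 0;
  for (const stream of streams) {
    indexCount += stream.indices.length;
    floatCount += stream.positions.length;
  }
  const indices = new Uint32Array(indexCount);
  const positions = new Float32Array(floatCount);

  let indexOffset = 0;
  let vertexOffset = 0;
  for (const stream of streams) {
    positions.set(stream.positions, vertexOffset * 3);
    // Shift indices so they point into the joined vertex stream.
    for (let i = 0; i < stream.indices.length; i++) {
      indices[indexOffset + i] = stream.indices[i] + vertexOffset;
    }
    indexOffset += stream.indices.length;
    vertexOffset += stream.positions.length / 3;
  }

  return { indices, positions };
}
```

Because the inputs contain no unused vertices, the joined stream is no larger than the sum of what the primitives actually reference.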

oval.gltf + join:

before: 74.39s
after: 2.34s (remaining time is I/O-constrained)

lovecraftian.glb + opt, join step only:

before: 219ms
after: 115ms
