donmccurdy / glTF-Transform

glTF 2.0 SDK for JavaScript and TypeScript, on Web and Node.js.
https://gltf-transform.dev
MIT License

`partition` command is slow for large models #1372

Closed mixtur closed 2 months ago

mixtur commented 2 months ago

Describe the bug: When applied to a model with many meshes and nodes, the `partition` command takes forever to complete.

To Reproduce: Steps to reproduce the behavior:

  1. Run the gen-model.js script from gen-model.zip in Node.js; it will generate the model (a sketch of such a generator follows these steps).
  2. Run node --max-old-space-size=4096 <path/to/gltf-transform/cli> partition model.gltf partitioned/model.gltf --meshes
  3. Pretty much nothing appears to happen for 10 minutes or so.

Without `--max-old-space-size=4096`, Node.js dies.
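
For reference, a minimal sketch of what a generator along the lines of gen-model.js might look like using the glTF-Transform document API. This is an illustration only; the actual script in the zip may differ:

import { Document, NodeIO } from '@gltf-transform/core';

// Illustration only: build MESHES_COUNT single-triangle meshes, one node each.
const MESHES_COUNT = 31322;
const document = new Document();
const buffer = document.createBuffer();
const scene = document.createScene();

for (let i = 0; i < MESHES_COUNT; i++) {
  const position = document.createAccessor()
    .setType('VEC3')
    .setArray(new Float32Array([0, 0, 0, 1, 0, 0, 0, 1, 0]))
    .setBuffer(buffer);
  const prim = document.createPrimitive().setAttribute('POSITION', position);
  const mesh = document.createMesh(`mesh_${i}`).addPrimitive(prim);
  scene.addChild(document.createNode(`node_${i}`).setMesh(mesh));
}

await new NodeIO().write('model.gltf', document);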

Expected behavior: Reasonable execution time, and ideally no need for the `--max-old-space-size` option.

Versions:

Additional context: You can experiment with the MESHES_COUNT constant in the script. The original value, 31322, is what we have in our actual model. Here are some additional problems I found:

donmccurdy commented 2 months ago

PRs:

mixtur commented 2 months ago

Tried the alpha. It does work faster now, thank you!

It still requires --max-old-space-size, though. I hope that can also be fixed eventually.
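
As an aside, if the installed CLI binary is used instead of invoking the script through node directly, the same V8 flag can be passed via NODE_OPTIONS, which Node.js accepts for flags like this one:

NODE_OPTIONS=--max-old-space-size=4096 gltf-transform partition model.gltf partitioned/model.gltf --meshes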

donmccurdy commented 2 months ago

On my laptop (Apple M1, 2021) I was able to partition an input scene with 60K meshes without issues. Raising it to 120K meshes, I do eventually hit what I assume is the same OOM error...

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

... but if your stack trace looks the same as mine, the OOM occurs after the glTF file has been generated in memory. So it's not the processing that hits the OOM here, but trying to write 120K files to disk simultaneously. It seems like the NodeIO class should do some batching; in the meantime, a workaround might be to use io.writeJSON(document) instead, then write the bin files in the resources list gradually rather than all at once.

Batches would be best, but something like this should get the job done:

import { writeFile } from 'node:fs/promises';

// Serialize the document in memory, then write the .bin resources one at a time.
const { json, resources } = await io.writeJSON(document);

await writeFile('scene.gltf', JSON.stringify(json));

for (const uri in resources) {
  await writeFile(uri, Buffer.from(resources[uri]));
}
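
And a rough sketch of the batched variant (my own illustration, not library behavior), assuming the resources map from the snippet above:

const BATCH_SIZE = 64; // hypothetical batch size; tune for the filesystem
const entries = Object.entries(resources);

for (let i = 0; i < entries.length; i += BATCH_SIZE) {
  // Write at most BATCH_SIZE files concurrently before starting the next group.
  await Promise.all(
    entries.slice(i, i + BATCH_SIZE).map(([uri, data]) => writeFile(uri, Buffer.from(data)))
  );
}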

That said ... do you actually want 31K files on disk? Or is #1362 still what you really need here?

donmccurdy commented 2 months ago

In any case! This appears to solve the issue, tested up to 120K meshes.

mixtur commented 2 months ago

https://github.com/donmccurdy/glTF-Transform/issues/1362 is indeed what we really need.

31K files may actually be a problem for some people. For example, if you try to open a folder with that many entries in Windows Explorer, you may as well go make yourself a coffee while it opens. Or, if you serve that folder as the publicPath in webpack on Linux, it can easily run out of fs watchers.

But for us, since it is just a short-lived temporary artifact, this is fine.

I don't actually have a meaningful JS stack trace when it fails.

I am running version 4.0.0-alpha.15 on a Linux laptop (16GB RAM, Ryzen 9 7945HX), and without --max-old-space-size the error looks like this for the generated file:

<--- Last few GCs --->

[9429:0x7281330]     8736 ms: Mark-Compact (reduce) 2047.2 (2082.8) -> 2046.6 (2081.3) MB, 346.19 / 0.00 ms  (+ 81.7 ms in 18 steps since start of marking, biggest step 5.3 ms, walltime since start of marking 438 ms) (average mu = 0.216, current mu = 0.14

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

 1: 0xca5430 node::Abort() [node]
 2: 0xb7807d  [node]
 3: 0xeca0b0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 4: 0xeca397 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 5: 0x10dc0e5  [node]
 6: 0x10f3f68 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 7: 0x10ca081 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 8: 0x10cb215 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 9: 0x10a8866 v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
10: 0x15035f6 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
11: 0x7718d7ed9ef6 
Aborted (core dumped)

And for our actual file, it looks like this:


<--- Last few GCs --->
art of marking 453 ms) (average mu = 0.277, current mu = 0.21
[8653:0x713a560]    16411 ms: Mark-Compact (reduce) 2046.7 (2080.2) -> 2045.2 (2079.1) MB, 374.45 / 0.01 ms  (+ 4.1 ms in 1 steps since start of marking, biggest step 4.1 ms, walltime since start of marking 387 ms) (average mu = 0.227, current mu = 0.164)
[8653:0x713a560]    16924 ms: Mark-Compact (reduce) 2045.2 (2079.1) -> 2045.2 (2079.1) MB, 513.68 / 0.00 ms  (average mu = 0.113, current mu = 0.000) allocation failure; GC in old space requested

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

 1: 0xca5430 node::Abort() [node]
 2: 0xb7807d  [node]
 3: 0xeca0b0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 4: 0xeca397 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 5: 0x10dc0e5  [node]
 6: 0x10dc674 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
 7: 0x10f3564 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [node]
 8: 0x10f3d7c v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 9: 0x10f53f2 v8::internal::Heap::CollectAllAvailableGarbage(v8::internal::GarbageCollectionReason) [node]
10: 0x10cb23f v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
11: 0x10a7936 v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
12: 0x10990ac v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawArray(int, v8::internal::AllocationType) [node]
13: 0x1099214 v8::internal::FactoryBase<v8::internal::Factory>::NewFixedArrayWithFiller(v8::internal::Handle<v8::internal::Map>, int, v8::internal::Handle<v8::internal::Oddball>, v8::internal::AllocationType) [node]
14: 0x13c717a v8::internal::OrderedHashTable<v8::internal::OrderedHashSet, 1>::Allocate(v8::internal::Isolate*, int, v8::internal::AllocationType) [node]
15: 0x13c7220 v8::internal::OrderedHashTable<v8::internal::OrderedHashSet, 1>::Rehash(v8::internal::Isolate*, v8::internal::Handle<v8::internal::OrderedHashSet>, int) [node]
16: 0x14f8743 v8::internal::Runtime_SetGrow(int, unsigned long*, v8::internal::Isolate*) [node]
17: 0x7ae347ed9ef6 
Aborted (core dumped)

mixtur commented 2 months ago

> That said ... do you actually want 31K files on disk? Or is https://github.com/donmccurdy/glTF-Transform/issues/1362 still what you really need here?

To clarify https://github.com/donmccurdy/glTF-Transform/issues/1362 a little more: it is not that there will be a file per primitive in OUR format. Chunks in our format include multiple primitives; we just want to be able to decide which primitive goes into which chunk, and right now `partition` is the only way to achieve that.
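
For reference, a sketch of the programmatic equivalent of the CLI command, selecting specific meshes to split out (the mesh names here are hypothetical):

import { NodeIO } from '@gltf-transform/core';
import { partition } from '@gltf-transform/functions';

const io = new NodeIO();
const document = await io.read('model.gltf');

// partition accepts `meshes: true` (all meshes) or a list of mesh names.
await document.transform(partition({ meshes: ['chunk_0', 'chunk_1'] }));

await io.write('partitioned/model.gltf', document);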