warrenm / GLTFKit2

A glTF 2.0 asset loader and exporter for Objective-C and Swift.
MIT License
150 stars 26 forks source link

GLTFAssetReader convertBuffers crashes during asset loading #50

Open lenkawell opened 6 months ago

lenkawell commented 6 months ago

We're getting regular crashes from our app users. Unfortunately, because Apple supplies such minimal crash reports, we don't have a lot to go on, but the backtraces all look the same and point to line 430 in GLTFAssetReader. I've looked at the code and I'm not sure if it could use some additional validity checks or something like that. I can provide more .crash files if you'd like, but they all appear nearly identical.

Last Exception Backtrace: 0 CoreFoundation 0x1a68fc870 __exceptionPreprocess + 164 (NSException.m:249) 1 libobjc.A.dylib 0x19ec6fc00 objc_exception_throw + 60 (objc-exception.mm:356) 2 Foundation 0x1a582b374 NSAllocateMemoryPages + 184 (NSPlatform.m:1865) 3 Foundation 0x1a57ddddc -[_NSPlaceholderData initWithBytes:length:copy:deallocator:] + 132 (NSData.m:241) 4 Foundation 0x1a57de7f4 +[NSData(NSData) dataWithBytes:length:] + 40 (NSData.m:773) 5 GLTFKit2 0x1035a4230 -[GLTFAssetReader convertBuffers] + 144 (GLTFAssetReader.m:430) 6 GLTFKit2 0x1035a8f68 -[GLTFAssetReader convertAsset:] + 972 (GLTFAssetReader.m:1255) 7 GLTFKit2 0x1035a39e8 -[GLTFAssetReader syncLoadAssetWithURL:data:options:handler:] + 860 (GLTFAssetReader.m:412) 8 GLTFKit2 0x1035a3464 __52+[GLTFAssetReader loadAssetWithURL:options:handler:]_block_invoke + 48 (GLTFAssetReader.m:345) 9 libdispatch.dylib 0x1ae7cb6a8 _dispatch_call_block_and_release + 32 (init.c:1530) 10 libdispatch.dylib 0x1ae7cd300 _dispatch_client_callout + 20 (object.m:561) 11 libdispatch.dylib 0x1ae7d07b8 _dispatch_continuation_pop + 600 (queue.c:306) 12 libdispatch.dylib 0x1ae7cfdd4 _dispatch_async_redirect_invoke + 584 (queue.c:830) 13 libdispatch.dylib 0x1ae7debe4 _dispatch_root_queue_drain + 392 (queue.c:7051) 14 libdispatch.dylib 0x1ae7df3ec _dispatch_worker_thread2 + 156 (queue.c:7119) 15 libsystem_pthread.dylib 0x210201928 _pthread_wqthread + 228 (pthread.c:2642) 16 libsystem_pthread.dylib 0x210201a04 start_wqthread + 8 (:-1)

2024-01-1121-10-05.6334-0600-ec7d7cd927f2a400a6eb081bd78ccd3246c0505b.txt

warrenm commented 6 months ago

Strange. I would expect most assets to pass through that code path at least once, and obviously it works most of the time. We handle the case where the buffer's data pointer is NULL, and lengths of zero are legal.

The docs say that NSAllocateMemoryPages throws if the allocation fails, but that shouldn't happen for sane buffer lengths (say, up to several hundred MiB on most devices). We might be hitting a case of heap corruption or some other situation that is causing us to spuriously ask for a preposterous amount of memory, but that's just a guess.

Not much to go on here until we have a pathological asset in hand, sorry.

lenkawell commented 6 months ago

Random but fun historical note found during further debugging: the Mach vm kernel code is almost 40 years old. (I see that some of my old VAX/VMS kernel code from 1979 is still kicking around also. Yikes!)

lenkawell commented 5 months ago

Unfortunately we're seeing these crashes every day from users of our app and there doesn't seem to be any way to find out what model is being loaded. It's possible that the app is running out of memory but I don't see any info in the Apple crashes to indicate that's actually the case (didn't iOS crashes used to show memory used?).

So, how about this suggestion: add a @try/@catch around most of the code in GLTFAsssetRead.convertAsset and return NO if an exception occurs. Our app code will handle the load failure and show a placeholder model instead of the real thing, but at least the app shouldn't crash. I tried this out and it seems to at least not break anything that I can see.

I'll also mention that our app has quite a bit of low-memory handling user-visible messages appear, etc. So if the user is actually running out of memory, they should see it if the app doesn't crash before the message is displayed. We also present info about how to resolve the problem (e.g., reduce scene complexity, etc.) but if the app crashes first they may not get to read it.

Thanks!

warrenm commented 5 months ago

Propagating the exception in response to a catastrophic failure to allocate is the correct behavior on the part of the framework. If you want to wrap your calls into the framework with a try/catch block, I think that will achieve the same effect as adding it farther down the call stack, no?

lenkawell commented 5 months ago

Yes, I can add my own app-specific @try/@catch handler but will it work with the asynchronous GLTFAsset.load method? I'm no expert on exception handling and threading, so I'm not sure if my exception handler will be invoked in this instance.

I noticed that GLTFRealityKitLoader.convert has a somewhat similar do/catch for Swift exceptions, though it invokes fatalError when it catches an exception.

lenkawell commented 5 months ago

I just added an @try/@catch around my GLTFAsset.load method and then added an @throw to GLTFAsssetRead.convertAsset and my @catch block was not invoked. I presume it's due to the dispatch_async in loadAssetWithURL. Is there an alternative approach you can suggest? It appears that the synchronous GLTFAsset.assetWithURL is actually async with semaphores, so I don't think that'll do the job either.

warrenm commented 5 months ago

I'm persuaded. It feels icky to add a catch-all exception handler when what's going here is likely due to an issue that can be handled more elegantly at a lower level, but although we ordinarily consider allocation failure a unrecoverable situation, catching the exception and propagating an error seems manifestly better than crashing outright.

This is now implemented in 395eca2 and will be included in an upcoming release. Any feedback more than welcome.

lenkawell commented 5 months ago

Thank you very much! I've put your changes into our latest app build which is due for release soon. So far in testing, it seems to be working fine. I'll let you know when we have some real-world data from our users.