swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

[SR-6455] Crash on load of the application when @available used for a class derived from a generic class #49005

Open swift-ci opened 6 years ago

swift-ci commented 6 years ago
Previous ID SR-6455
Radar None
Original Reporter grigorye (JIRA User)
Type Bug

Environment Any iOS that supports Swift 4, iOS 9, iOS 9.1, iOS 9.3, Xcode 9.1, Swift 4. Not bound to Debug/Release/optimization. Not reproducible when compiled with Xcode 9.2 beta 2.
Additional Detail from JIRA

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Compiler, Standard Library |
|Labels | Bug |
|Assignee | None |
|Priority | Medium |

md5: 6b6a56e44bd0077ea34abd59447f1876

Issue Description:

import MapKit

class X<C> : UITableViewController {
}

@available (iOS 9.3, *) // MKLocalSearchCompletion is available only on iOS 9.3
class Y: X<MKLocalSearchCompletion> {
}

Compiling the above code fragment into an iOS application (deployment target 9.0) makes the application crash on every launch under iOS versions < 9.3. It doesn't crash if you replace UITableViewController with NSObject. It crashes in every simulator and on all pre-9.3 devices that I have access to. It works as expected under iOS 9.3 and later.

Attached is a sample project reproducing the problem (created from the default master-detail template, with only the very beginning of AppDelegate.swift modified to include the above code fragment).

It's not bound to any particular iOS version but to the use of @available on a class derived from a generic class whose type argument is available only from the OS version specified in @available. If you specialize the base class with a type that is available on all iOS versions (as in "X<UIView>"), it works as expected. But @available is used precisely because the specialization needs that very type, which exists only on the newer iOS versions.

For example, you can reproduce it on iOS 9.3 if you use "@available(iOS 10, *)" like in the following code fragment:

import Photos

class X<C> : UITableViewController {
}

@available (iOS 10, *) // PHLivePhotoEditingContext is available only on iOS 10
class Y: X<PHLivePhotoEditingContext> {
}

So it's clearly a bug in the code generation/stdlib, not in iOS.

The stack trace is below:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0290d482 libswiftCore.dylib`swift::_swift_buildDemanglingForMetadata(swift::TargetMetadata<swift::InProcess> const*, swift::Demangle::Demangler&) + 18
    frame #1: 0x0290dcee libswiftCore.dylib`swift::_swift_buildDemanglingForMetadata(swift::TargetMetadata<swift::InProcess> const*, swift::Demangle::Demangler&) + 2174
    frame #2: 0x02915a5e libswiftCore.dylib`swift_initClassMetadata_UniversalStrategy + 158
    frame #3: 0x000a7bf1 Pre93`___lldb_unnamed_symbol6$$Pre93 + 145
    frame #4: 0x029147b0 libswiftCore.dylib`swift_getGenericMetadata + 1008
    frame #5: 0x000a59f7 Pre93`type metadata accessor for X at AppDelegate.swift:0
    frame #6: 0x000a7d5b Pre93`type metadata accessor for X<MKLocalSearchCompletion> at AppDelegate.swift:0
    frame #7: 0x000a7ce5 Pre93`___lldb_unnamed_symbol8$$Pre93 + 21
    frame #8: 0x03dff9cd libdispatch.dylib`_dispatch_client_callout + 14
    frame #9: 0x03de9280 libdispatch.dylib`dispatch_once_f + 157
    frame #10: 0x02920c62 libswiftCore.dylib`swift_once + 34
    frame #11: 0x000a5fbc Pre93`type metadata accessor for Y at AppDelegate.swift:0
    frame #12: 0x000a7f2b Pre93`___lldb_unnamed_symbol10$$Pre93 + 11
    frame #13: 0x0014a91f dyld_sim`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 291
    frame #14: 0x0014aa78 dyld_sim`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 64
    frame #15: 0x00146a21 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 335
    frame #16: 0x00145d74 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 104
    frame #17: 0x00145e03 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 79
    frame #18: 0x0013bca8 dyld_sim`dyld::initializeMainExecutable() + 208
    frame #19: 0x0013f2ef dyld_sim`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 3586
    frame #20: 0x0013b221 dyld_sim`start_sim + 125
    frame #21: 0x000c86d6 dyld`dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*, unsigned long*) + 2336
    frame #22: 0x000c69c7 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 247
    frame #23: 0x000c2200 dyld`dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) + 383
    frame #24: 0x000c2047 dyld`_dyld_start + 71
(lldb)
dukhnyak commented 10 months ago

I have a very similar crash, always reproducible with the recent Xcode 15.1. It happens with my application as soon as I try to instrument it with xctrace:

xctrace record --template 'Allocations' --attach 24031

Please see the stack trace below, and let me know if I can provide any other useful information.

(lldb) bt
* thread #108, stop reason = ESR_EC_DABORT_EL0 (fault address: 0x10)
  * frame #0: 0x0000000192d186f8 libswiftCore.dylib`_swift_buildDemanglingForMetadata + 436
    frame #1: 0x0000000192d18a08 libswiftCore.dylib`_swift_buildDemanglingForMetadata + 1220
    frame #2: 0x0000000192d18a08 libswiftCore.dylib`_swift_buildDemanglingForMetadata + 1220
    frame #3: 0x0000000192d44f74 libswiftCore.dylib`copyGenericClassObjCName(swift::TargetClassMetadata<swift::InProcess, swift::TargetAnyClassMetadataObjCInterop<swift::InProcess>>*) + 236
    frame #4: 0x0000000182d3ec00 libobjc.A.dylib`objc_class::installMangledNameForLazilyNamedClass() + 84
    frame #5: 0x0000000182d303e8 libobjc.A.dylib`objc_class::demangledName(bool) + 100
    frame #6: 0x000000010d1c2854 liboainject.dylib`___lldb_unnamed_symbol146 + 48
    frame #7: 0x0000000182d33ec0 libobjc.A.dylib`_objc_addWillInitializeClassFunc + 388
    frame #8: 0x000000010d1c27ec liboainject.dylib`___lldb_unnamed_symbol145 + 84
    frame #9: 0x0000000182f34910 libdispatch.dylib`_dispatch_client_callout + 20
    frame #10: 0x0000000182f3614c libdispatch.dylib`_dispatch_once_callout + 32
    frame #11: 0x000000010d1c0d68 liboainject.dylib`___lldb_unnamed_symbol121 + 176
    frame #12: 0x000000010d1c0818 liboainject.dylib`___lldb_unnamed_symbol119 + 188
    frame #13: 0x0000000182f34910 libdispatch.dylib`_dispatch_client_callout + 20
    frame #14: 0x0000000182f3614c libdispatch.dylib`_dispatch_once_callout + 32
    frame #15: 0x000000010d1c0758 liboainject.dylib`___lldb_unnamed_symbol118 + 336
    frame #16: 0x000000010d1bf274 liboainject.dylib`___lldb_unnamed_symbol107 + 444
    frame #17: 0x000000010d1bf538 liboainject.dylib`_OAAttachAndInitialize + 240
    frame #18: 0x000000010d1abf24
    frame #19: 0x00000001830e6034 libsystem_pthread.dylib`_pthread_start + 136
(lldb) 
mikeash commented 9 months ago

That looks like an unavailable class is incorrectly being used, but it doesn't quite tell us what it is or why it's used. Something eventually trips over a NULL pointer, which is probably trying to point to something not present.

Are you able to provide a full project, or a built app, that reproduces this crash? If not, you may be able to get some more info about the troublesome type by putting a breakpoint on _swift_buildDemanglingForMetadata. The metadata it's working on will be in x0, and you should be able to use a debugger command like image lookup -va $x0 to get some information about that address, if it's not a dynamically allocated metadata. The last call or two before it crashes should be something pretty close to the problem.

dukhnyak commented 9 months ago

Many thanks for the reply. I can't really provide the full project, and I have tried but failed to produce a minimised reproducible test. I will try your suggestion with the breakpoint and come back shortly!

dukhnyak commented 9 months ago

Thank you again. Unfortunately, it seems I can't get information about the address for at least the last 4 calls before the crash.

Process 35657 stopped
* thread #103, stop reason = breakpoint 1.1
    frame #0: 0x000000019d208544 libswiftCore.dylib`_swift_buildDemanglingForMetadata
libswiftCore.dylib`:
->  0x19d208544 <+0>:  pacibsp 
    0x19d208548 <+4>:  stp    d9, d8, [sp, #-0x70]!
    0x19d20854c <+8>:  stp    x28, x27, [sp, #0x10]
    0x19d208550 <+12>: stp    x26, x25, [sp, #0x20]
Target 0: (my-process) stopped.
(lldb) image lookup -va $x0
(lldb) register read x0
      x0 = 0x00000001400f1538
(lldb) cont
Process 35657 resuming
Process 35657 stopped
* thread #103, stop reason = breakpoint 1.1
    frame #0: 0x000000019d208544 libswiftCore.dylib`_swift_buildDemanglingForMetadata
libswiftCore.dylib`:
->  0x19d208544 <+0>:  pacibsp 
    0x19d208548 <+4>:  stp    d9, d8, [sp, #-0x70]!
    0x19d20854c <+8>:  stp    x28, x27, [sp, #0x10]
    0x19d208550 <+12>: stp    x26, x25, [sp, #0x20]
Target 0: (my-process) stopped.
(lldb) image lookup -va $x0
(lldb) cont
Process 35657 resuming
Process 35657 stopped
* thread #103, stop reason = breakpoint 1.1
    frame #0: 0x000000019d208544 libswiftCore.dylib`_swift_buildDemanglingForMetadata
libswiftCore.dylib`:
->  0x19d208544 <+0>:  pacibsp 
    0x19d208548 <+4>:  stp    d9, d8, [sp, #-0x70]!
    0x19d20854c <+8>:  stp    x28, x27, [sp, #0x10]
    0x19d208550 <+12>: stp    x26, x25, [sp, #0x20]
Target 0: (my-process) stopped.
(lldb) image lookup -va $x0
(lldb) cont
Process 35657 resuming
Process 35657 stopped
* thread #103, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x000000019d2086f8 libswiftCore.dylib`_swift_buildDemanglingForMetadata + 436
libswiftCore.dylib`:
->  0x19d2086f8 <+436>: ldrh   w8, [x0, #0x10]
    0x19d2086fc <+440>: cmp    w8, #0xdd
    0x19d208700 <+444>: b.ne   0x19d2085dc               ; <+152>
    0x19d208704 <+448>: mov    x0, x20
Target 0: (my-process) stopped.
(lldb) image lookup -va $x0
(lldb)

Before the last 4 calls I saw quite a lot of proper information about the address, like

(lldb) image lookup -va $x0
      Address: my-process[0x0000000100f92d90] (my-process.__DATA_CONST.__const + 392360)
      Summary: my-process`type metadata for NIOCore.NIODeadline
       Module: file = "/opt/homebrew/var/nomad/alloc/b30c3991-204d-46df-713a-fd93eb4d640e/my-process LOCAL/local/my-process", arch = "arm64"
       Symbol: id = {0x000461d1}, range = [0x0000000101e26d90-0x0000000101e26da8), name="type metadata for NIOCore.NIODeadline", mangled="$s7NIOCore11NIODeadlineVN"

and sometimes no information, similar to the last 4 calls. I assume that means "dynamically allocated metadata", which can come from my own shared libraries.

By the way, if I don't load/use one of my shared libraries, I can't reproduce this crash. Ironically, I really need this shared library, as I want to instrument it...

Also, CPU instrumentation works just fine.

May I ask for additional advice on what to look into and what information I can provide? Or is a minimised test the only thing that can help here?

mikeash commented 9 months ago

If image lookup -va doesn't give any information, it's because the pointer isn't within any loaded dylib. Most likely the metadata is allocated on the heap. Normally I'd suggest making a runtime call to get the type name, but that's what we're crashing in, so that's no good.

What we can do is get the type descriptor pointer, which will be in a dylib and will hopefully have a symbol we can read. The position of the descriptor within the metadata varies depending on what kind of metadata it is, but what we can do is dump a bunch of memory and then look for a descriptor in the output. You can dump ten pointers' worth of memory, interpreted as addresses, like so:

(lldb) x/10a $x0

In the output, lldb will tell you if any of the listed addresses correspond to a symbol. It looks something like this:

0x2461dfff0: 0x000000023aa27d04 DylibNameHere`$sSwiftMangledSymbolName

The mangled name of a nominal type descriptor will end in Mn so look for one like that. If you see one, you can run it through swift demangle in the terminal to get a human readable name for it. If this is a generic type then you'll be missing the generic type parameters, but it will at least tell you what the main generic type is, which might give you a lead.
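Picking descriptor candidates out of a longer dump can be scripted. The following is only a minimal sketch, assuming the lldb output has been pasted into a string: `descriptorCandidates(in:)` is a hypothetical helper, not part of lldb or the Swift toolchain, and the second sample line is made up for illustration. It keeps symbols whose mangled name starts with `$s` and ends in `Mn`, plus symbols lldb has already demangled as "nominal type descriptor for ...".

```swift
// Hypothetical helper (not part of lldb or the Swift toolchain): scans pasted
// `x/10a` output and returns the symbols that look like nominal type descriptors.
func descriptorCandidates(in lldbOutput: String) -> [String] {
    var candidates: [String] = []
    for line in lldbOutput.split(separator: "\n") {
        // lldb prints "image`symbol", so the symbol starts after the backtick.
        guard let tick = line.firstIndex(of: "`") else { continue }
        let symbol = String(line[line.index(after: tick)...])

        // A mangled name contains no spaces, so the first token is the whole
        // name; this also drops a trailing " + 4"-style offset.
        let token = symbol.split(separator: " ").first.map { String($0) } ?? ""

        if token.hasPrefix("$s") && token.hasSuffix("Mn") {
            // Mangled form: Swift prefix "$s", descriptor suffix "Mn".
            candidates.append(token)
        } else if symbol.hasPrefix("nominal type descriptor for ") {
            // Demangled form, when lldb has already demangled the symbol.
            candidates.append(symbol)
        }
    }
    return candidates
}

// Illustrative input: the second line is made up for this sketch, not real output.
let sampleOutput = """
0x10981c338: 0x00000001e2516c58 libswiftCore.dylib`type metadata for Swift.UInt64
0x10981c340: 0x0000000100f92d90 my-process`$s7NIOCore11NIODeadlineVMn
0x10981c348: 0x0000000000000004
"""

print(descriptorCandidates(in: sampleOutput))
```

Any mangled name it returns can then be passed to swift demangle as described above.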

dukhnyak commented 9 months ago

Wow, I think we caught something with it; at least it is the module that I have issues with. I assume it is the last one before the crash, and it seems it has already been demangled.

Target 0: (ordo-system-core) stopped.
(lldb) image lookup -va $x0
(lldb) x/10a $x0
0x10981c310: 0x0000000000000301
0x10981c318: 0x0000000000000002
0x10981c320: 0x0000000000000000
0x10981c328: 0x000000010981c280
0x10981c330: 0x0000000000000000
0x10981c338: 0x00000001e2516c58 libswiftCore.dylib`type metadata for Swift.UInt64
0x10981c340: 0x0000000000000028
0x10981c348: 0x0000000000000004
0x10981c350: 0x0000000000000001
0x10981c358: 0x0000000000000000
(lldb) cont
Process 4148 resuming
Process 4148 stopped
* thread #104, stop reason = breakpoint 1.1
    frame #0: 0x0000000199ee4544 libswiftCore.dylib`_swift_buildDemanglingForMetadata
libswiftCore.dylib`:
->  0x199ee4544 <+0>:  pacibsp 
    0x199ee4548 <+4>:  stp    d9, d8, [sp, #-0x70]!
    0x199ee454c <+8>:  stp    x28, x27, [sp, #0x10]
    0x199ee4550 <+12>: stp    x26, x25, [sp, #0x20]
Target 0: (ordo-system-core) stopped.
(lldb) image lookup -va $x0
(lldb) x/10a $x0
0x10981c280: 0x0000000000000307
0x10981c288: 0x000000010394b104 ordo-system-core`uniquable existential shape for any OrdoPublic.AugmentationUpdate<Self.Identifier == τ_0_0, Self.Update == τ_0_1> any <null node pointer> + 4
0x10981c290: 0x00000001e2516c58 libswiftCore.dylib`type metadata for Swift.UInt64
0x10981c298: 0x0000000103c71328 ordo-system-core`type metadata for OrdoPublic.TransactionUpdate
0x10981c2a0: 0x0000000000000004
0x10981c2a8: 0x0000000000000000
0x10981c2b0: 0x0000000199f190a4 libswiftCore.dylib`swift::OpaqueValue* tuple_initializeBufferWithCopyOfBuffer<false, false>(swift::TargetValueBuffer<swift::InProcess>*, swift::TargetValueBuffer<swift::InProcess>*, swift::TargetMetadata<swift::InProcess> const*)
0x10981c2b8: 0x0000000199f1910c libswiftCore.dylib`void tuple_destroy<false, false>(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*)
0x10981c2c0: 0x0000000199f191a0 libswiftCore.dylib`swift::OpaqueValue* tuple_initializeWithCopy<false, false>(swift::OpaqueValue*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*)
0x10981c2c8: 0x0000000199f19250 libswiftCore.dylib`swift::OpaqueValue* tuple_assignWithCopy<false, false>(swift::OpaqueValue*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*)
(lldb) cont
Process 4148 resuming
Process 4148 stopped
* thread #104, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x0000000199ee46f8 libswiftCore.dylib`_swift_buildDemanglingForMetadata + 436
libswiftCore.dylib`:
->  0x199ee46f8 <+436>: ldrh   w8, [x0, #0x10]
    0x199ee46fc <+440>: cmp    w8, #0xdd
    0x199ee4700 <+444>: b.ne   0x199ee45dc               ; <+152>
    0x199ee4704 <+448>: mov    x0, x20
Target 0: (ordo-system-core) stopped.

Am I right that the suspicious one is the OrdoPublic.AugmentationUpdate protocol? And could it be due to the following warning, which appears in STDERR when we load the dynamic library?

objc[4148]: Class OrdoPublic.TransactionAugmentationPluginFactory is implemented in both /opt/homebrew/var/nomad/alloc/7a70f04a-7882-f6e4-52f0-12baeb74ba1c/ordo-system-core LOCAL/local/ordo-system-core (0x103cdb8b0) and /opt/homebrew/var/nomad/alloc/7a70f04a-7882-f6e4

Thank you so much for help!

mikeash commented 9 months ago

Nice! That warning could definitely be connected. Stuff gets very confused when there are two copies of a class with the same name, since some things look up types by name, some things use pointers, and they expect them to agree. One specific issue that can happen is that you can end up with the copy of a type from one module, but the copy of a protocol from another, and then protocol conformance checks fail because the type conforms to the protocol in its module, not the other one. If you have a generic type with a protocol conformance requirement on the types, then you'll end up failing the type lookup, resulting in a NULL and a crash.
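The mismatch can be modelled in plain Swift. This is a loose sketch under stated assumptions, not how the runtime or the Objective-C runtime actually stores types: `NameRegistry` and all type and protocol names below are hypothetical, and the two "copies" are given different Swift names only so that the example compiles as a single file. It shows a lookup by name and a direct reference disagreeing about which type a name means, and a conformance check failing as a result.

```swift
// Stand-ins for one class (and the protocol it conforms to) loaded from two
// images. In the real situation both copies share a single name; they are named
// apart here only so the sketch compiles. All names are hypothetical.
protocol PluginInExecutable {}
protocol PluginInDylib {}

final class FactoryInExecutable: PluginInExecutable {}
final class FactoryInDylib: PluginInDylib {}

// A name-keyed table, standing in for any lookup that finds types by name.
struct NameRegistry {
    var typesByName: [String: Any.Type] = [:]

    mutating func register(_ type: Any.Type, as name: String) {
        // A second registration under the same name replaces the first.
        typesByName[name] = type
    }

    func lookup(_ name: String) -> Any.Type? {
        return typesByName[name]
    }
}

let sharedName = "OrdoPublic.TransactionAugmentationPluginFactory"
var registry = NameRegistry()
registry.register(FactoryInExecutable.self, as: sharedName)
registry.register(FactoryInDylib.self, as: sharedName)

// Code holding a direct reference sees the executable's copy...
let byPointer: Any.Type = FactoryInExecutable.self
// ...while the lookup by name returns the other copy.
let byName = registry.lookup(sharedName)

// The two routes disagree about which type the name refers to.
let routesAgree = byName.map { ObjectIdentifier($0) == ObjectIdentifier(byPointer) } ?? false

// The conformance check against the executable's protocol fails for the other
// copy, so the cast yields nil, loosely analogous to the NULL described above.
let conforming = byName as? PluginInExecutable.Type

print("routes agree:", routesAgree)
print("conforming copy found:", conforming != nil)
```

In this model the failure is a harmless nil; in the real runtime the code that receives the missing result may not expect it, which is where the crash comes from.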

You can try setting the environment variable SWIFT_DEBUG_FAILED_TYPE_LOOKUP=YES and it will print some information about the failure if that's actually happening. Either way, definitely address that warning and get it so you only have one copy of all your classes (and protocols and anything else) loaded into your process, and hopefully that will take care of the issue.
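When the process is started by a launcher or wrapper script, it can be worth confirming that the variable actually reached it. The snippet below is a small sketch for that purpose only; `failedTypeLookupSetting(in:)` is a hypothetical helper, and it merely reports what the process sees, it does not enable or change the runtime's diagnostics.

```swift
import Foundation

// Hypothetical helper: reports what the given environment holds for the variable.
func failedTypeLookupSetting(in environment: [String: String]) -> String {
    let key = "SWIFT_DEBUG_FAILED_TYPE_LOOKUP"
    guard let value = environment[key] else {
        return "\(key) is not set in this process"
    }
    return "\(key)=\(value)"
}

// Log the setting once, early at startup.
print(failedTypeLookupSetting(in: ProcessInfo.processInfo.environment))
```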

dukhnyak commented 9 months ago

Thank you again, will definitely try SWIFT_DEBUG_FAILED_TYPE_LOOKUP=YES and let you know!

We are in the process of addressing this warning; it's just taking a bit of time to restructure the code and builds, as usual...

mikeash commented 9 months ago

I know how that goes, there's always so much to do. Definitely prioritize this warning, it's probably more troublesome than it sounds. I'd really like to turn it into a fatal error someday, but it's hard to do that without breaking the world.