swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

[SR-381] API to lookup a Type given a top-level class name #42998

Closed lhoward closed 8 years ago

lhoward commented 8 years ago
Previous ID SR-381
Radar None
Original Reporter @lhoward
Type New Feature
Status Closed
Resolution Done
Additional Detail from JIRA

| | |
|------------------|-----------------|
|Votes | 1 |
|Component/s | Standard Library |
|Labels | New Feature, AffectsABI, Runtime |
|Assignee | @jckarter |
|Priority | Medium |

md5: 13f1c2ae7174aab8983f1dde3a976c53

blocks:

relates to:

Issue Description:

This may be useful to implement NSStringFromClass()-compatible behaviour which is necessary in order to implement NSKeyedArchiver. I am using the attached workaround at the moment but obviously it is very fragile.

phausler commented 8 years ago

I think that the mapping of Foundation classes is totally reasonable, provided that is the limit of the mapping. Asking for user-level code to be modified for that mapping is where I start to have reservations about compatibility/portability of code.

So far the pull request does not have Foundation specific info here besides that it is claimed as an SPI.

As for the commentary on `@objc` syntax, I think we can safely treat that as a separate issue. I tend to agree that if you buy into something that is not implemented on Linux, it is reasonable to expect that you need to make affordances for that compatibility (so 100% in agreement with you there).

So I think we are roughly on the same page? That my example is something we need to match and this is a good start to it?

lhoward commented 8 years ago

Consider a third-party application with classes that inherit from NSObject. On platforms using Corelibs Foundation, these will be native Swift classes. True, the Objective-C classes have been replaced by Swift classes, but is it congruent with Corelibs Foundation's portability objectives to require the developer to care?

lhoward commented 8 years ago

Anyway, I'm not entirely clear where I should go from here. Obviously (even if it pains me a little!) putting a static mapping into Foundation is easy.

But what to do about third-party consumers of the Foundation archiving API? Do we say they're completely unsupported (for now)? Do we support it only with mangled names and, if so, will stdlib take an API for looking up a type by mangled name (and the converse)?

belkadan commented 8 years ago

I still don't understand the confusion. If class Foo is a subclass of a Foundation class, that doesn't affect how Foo's name is encoded at all, only what classes are used as potential fallbacks. (And it's not even really correct to use every superclass as a fallback.)

The way I see it, only the parts of the API that deal with names (*className*) have to worry about any of this anyway, and then yes, of course you have to use the mangled name, because that's how runtime names work in Swift. (That would almost be a reason not to special-case top-level classes, but I guess it's too late for that.)

belkadan commented 8 years ago

I guess that does mean that any decision we make about top-level non-generic non-private classes does have to be publicly documented.

lhoward commented 8 years ago

I am just going to rip out the dynamic lookup code from NSKeyedArchiver/NSKeyedUnarchiver, that seems like the simplest solution.

lhoward commented 8 years ago

Re: the superclass hierarchy has no bearing on the encoding of Foo's name, noted – I was confused because I (incorrectly) thought you were proposing on Linux that the mangled name was always used.

lhoward commented 8 years ago

I am still unclear as to

phausler commented 8 years ago

Here is an outline of what Foundation needs to do the correct thing:

// in module Foo

class Bar : NSObject {
    class Baz : NSObject { }
}

class AnotherClass { }

@objc(Squirrel)
class Белка: NSObject { }

print("\(NSStringFromClass(Bar.self))")
print("\(NSStringFromClass(Bar.Baz.self))")
print("\(NSStringFromClass(AnotherClass.self))")
print("\(NSStringFromClass(NSTask.self))")
print("\(NSStringFromClass(Белка.self))")

The reasonable case
NSStringFromClass(Bar.self) -> "Foo.Bar"

If we get this working, great; if not, we can deal with it later. We could even forbid this type for an initial implementation and it would still be a win in my book. Or we could call the Darwin behavior a bug. So this honestly is up in the air.
NSStringFromClass(Bar.Baz.self) -> "_TtCC3Foo3Bar3Baz"

Also relatively reasonable
NSStringFromClass(AnotherClass.self) -> "Foo.AnotherClass"

The specialized case for Foundation classes (part of the implementation of NSStringFromClass et al.).
Foundation does this from inside NSStringFromClass and NSClassFromString, and NOT from the stdlib.
NSStringFromClass(NSTask.self) -> "NSTask"

I don't think it is a goal to have archives encoded with Objective-C names be decodable in Swift by using the @objc syntax: this is not portable, since there is no Objective-C support in Swift on Linux.
NSStringFromClass(Белка.self) -> "Squirrel"

If we got these behaviors, NSStringFromClass, NSKeyedArchiver, etc. would behave exactly as they do with the Objective-C implementations. Since String(reflecting: T.self) already emits enough information for the class-to-name direction, the only missing piece is getting a class from a given name, and really only for classes that adopt NSCoding (to solve the upper-level issue).

So, to be clear, what I think needs to be done in the runtime is to provide a way to get a type given a fully qualified name:
_typeByName("Foo.Bar") -> Bar.self
_typeByName("Foo.AnotherClass") -> AnotherClass.self
_typeByName("NSTask") -> nil
_typeByName("Foundation.NSTask") -> NSTask.self
_typeByName("_TtCC3Foo3Bar3Baz") -> Bar.Baz.self // gravy if we can get it but not a needed item to make it work
_typeByName("Squirrel") -> Белка.self // Not needed in linux because there is no Белка class since the @objc won't compile there
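As a rough illustration of the lookup contract outlined above, here is a minimal sketch that stands in for the runtime with an explicit registry. `TypeRegistry` and its methods are hypothetical names, not stdlib API; a real `_typeByName` would consult runtime metadata rather than a dictionary. Names are keyed by `String(reflecting:)`, so the module prefix is whatever module the code happens to be compiled as.

```swift
// Hypothetical stand-in for a runtime type lookup: an explicit registry
// keyed by the fully qualified name that String(reflecting:) produces.
final class TypeRegistry {
    private var typesByName: [String: Any.Type] = [:]

    // Record a type under its fully qualified name, e.g. "Foo.Bar".
    func register(_ type: Any.Type) {
        typesByName[String(reflecting: type)] = type
    }

    // Return the registered type for a fully qualified name, or nil.
    func typeByName(_ name: String) -> Any.Type? {
        return typesByName[name]
    }
}

class Bar {}
class AnotherClass {}

let registry = TypeRegistry()
registry.register(Bar.self)
registry.register(AnotherClass.self)

// The qualified name resolves; the bare, unqualified name does not.
let qualifiedName = String(reflecting: Bar.self)
for name in [qualifiedName, "Bar"] {
    if let type = registry.typeByName(name) {
        print("\(name) -> \(type)")
    } else {
        print("\(name) -> nil")
    }
}
```

The unqualified miss mirrors the `_typeByName("NSTask") -> nil` row: in this outline, stripping or adding the Foundation module prefix is left to Foundation rather than to the lookup itself.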

lhoward commented 8 years ago

Right, which is what _typeByName() as submitted in the PR does (not for the gravy case; it could also search mangled names, but perhaps that's better in a separate API). And note it was marked as Foundation SPI as in "good enough until we have a better solution".

lhoward commented 8 years ago

@phausler I created the lhoward/nscoding-static branch with a Q&D static mapping, turning NSStringFromClass()/NSClassFromString() into a static lookup. I'd prefer a proper solution but I'd really like some guidance on what the interfaces between Foundation and stdlib should look like to avoid wasting effort.

lhoward commented 8 years ago

Also to be clear, I was never proposing the NSStringFromClass()/NSClassFromString() behaviour be embedded in stdlib (and perhaps this added to some of the confusion).

What I was asking is:

If the format is a mangled (_Tt) name, and you agree that Foundation should continue to encode top-level classes with the unmangled (Foo.Bar) name, how do you propose one look up a Type given such an unmangled top-level name?

lhoward commented 8 years ago

Here is a proposal. A stdlib interface (marked Foundation SPI) that allows the following:

protocol SomeProtocol {}
class SomeClass {
    class Nested : SomeProtocol {}
}

let name = String(reflecting: SomeClass.Nested.self)
let persistentName = _persistentTypeName(SomeClass.Nested.self)

print("_typeName(SomeClass.Nested) == \(name)")
print("_persistentTypeName(SomeClass.Nested) == \(persistentName)")

if let type = _typeByName(name) {
    print("_typeByName(\(name)) == \(String(reflecting: type))")
}

if let type = _typeByPersistentName(persistentName) {
    print("_typeByPersistentName(\(persistentName)) == \(String(reflecting: type))")
}

which prints:

% ./typeName
_typeName(SomeClass.Nested) == typeName.SomeClass.Nested
_persistentTypeName(SomeClass.Nested) == _TtCC8typeName9SomeClass6Nested
_typeByName(typeName.SomeClass.Nested) == typeName.SomeClass.Nested
_typeByPersistentName(_TtCC8typeName9SomeClass6Nested) == typeName.SomeClass.Nested

This – that is, separate interfaces for both mangled and non-mangled type names – makes implementing the existing NSClassFromString()/NSStringFromClass() behaviour straightforward (see below). I'm trying to understand enough about the de/re-mangler to support generic classes.

If this interface isn't right, then I'd certainly welcome some ideas on how you see Foundation interfacing with the runtime in order to implement name to class (and class to name) lookup for archiving.

Edit: "canonical", "mangled", etc. are possible alternatives to "persistent".

lhoward commented 8 years ago

And here is how NSStringFromClass/NSClassFromString might be implemented:

func NSStringFromClass(aClass: AnyClass) -> String {
    let aClassName = String(reflecting: aClass).bridge()
    let components = aClassName.componentsSeparatedByString(".")

    if components.count == 2 {
        if components[0] == _SwiftFoundationModuleName {
            return components[1]
        } else {
            return String(aClassName)
        }
    } else {
        return _canonicalTypeName(aClass)
    }
}

func NSClassFromString(aClassName: String) -> AnyClass? {
    var aClass : Any.Type? = nil

    if aClassName.hasPrefix("_Tt") {
        aClass = _typeByCanonicalName(aClassName)
    } else if aClassName.characters.indexOf(".") == nil {
        aClass = _typeByName(_SwiftFoundationModuleName + "." + aClassName)
    } else {
        aClass = _typeByName(aClassName)
    }

    return aClass as? AnyClass
}

phausler commented 8 years ago

Nice! That looks like exactly what we need; I wish we didn't have to have a differing constant for the module name of Foundation but that is a whole different ball of wax.
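The three branches of the NSStringFromClass sketch above can be restated as a pure string classification, written here in current Swift syntax. This is only an illustration: `ArchivedNameRule` and `archivedNameRule(forQualifiedName:foundationModule:)` are hypothetical names, and the default module name "Foundation" is an assumption standing in for the `_SwiftFoundationModuleName` constant.

```swift
// Hypothetical classification of a fully qualified class name, mirroring
// the three branches of the NSStringFromClass sketch in the thread.
enum ArchivedNameRule: Equatable {
    case foundationClass(String)   // "Foundation.NSTask" is archived as "NSTask"
    case topLevelClass(String)     // "Foo.Bar" is archived unchanged
    case needsMangledName          // anything that is not "Module.Class"
}

func archivedNameRule(forQualifiedName name: String,
                      foundationModule: String = "Foundation") -> ArchivedNameRule {
    // Keep empty components so that ".Bar" or "Foo." is rejected below.
    let components = name.split(separator: ".", omittingEmptySubsequences: false)
    guard components.count == 2,
          !components[0].isEmpty, !components[1].isEmpty else {
        return .needsMangledName
    }
    if components[0] == foundationModule {
        return .foundationClass(String(components[1]))
    }
    return .topLevelClass(name)
}

print(archivedNameRule(forQualifiedName: "Foundation.NSTask"))
print(archivedNameRule(forQualifiedName: "Foo.Bar"))
print(archivedNameRule(forQualifiedName: "Foo.Bar.Baz"))
```

Keeping the rule separate from the actual type lookup makes it easy to test against the expected Darwin encodings without a runtime in the loop.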

lhoward commented 8 years ago

Thanks @phausler. I've a patch on the way that will only support classes (at least for the name to type function), but I think it will be relatively straightforward (at least compared to _typeByName) to add support for generics if I can get some tips on the demangling API.

lhoward commented 8 years ago

I've updated the pull request here:

https://github.com/apple/swift/pull/834/files

In _metadataForMangledName(), I can see it should be possible to demangle the name into the generic and specialised components and funnel this into swift_getGenericMetadata(), but I haven't quite figured out how to put it all together yet. Tips welcome. 🙂

lhoward commented 8 years ago

https://github.com/lhoward/swift-corelibs-foundation/blob/lhoward/nscoding/TestFoundation/TestNSKeyedArchiver.swift

has an example of archiving nested classes, which now works.

lhoward commented 8 years ago

OK, I also have lookups of generic types working 🙂

_canonicalTypeName(GenericStruct<Nested, String>) == _TtGV8typeName13GenericStructCCS_9SomeClass6NestedSS_
_typeByCanonicalName(_TtGV8typeName13GenericStructCCS_9SomeClass6NestedSS_) == GenericStruct<Nested, String>

Spoiler alert: generic type metadata lookup does use dladdr() so it requires public (on Linux also relocatable) symbols (which arguably makes the whole effort a bit pointless as we already rejected the approach of using dlsym() generally). (If the generic metadata included a name this could be avoided.)

Doing _typeBy[Canonical]Name for all types is going to require some careful thought, but I'm of the "something is better than nothing" school of thought for getting Foundation archiving bootstrapped.

Adding a cache would be good, it's a pretty expensive lookup if there are a lot of conformance tables.
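To make the caching idea concrete, here is a minimal sketch of a memoizing wrapper around an expensive name-to-type lookup. `CachedTypeLookup` is a hypothetical name, the injected closure stands in for the conformance-table scan, and `NSLock` is used only for brevity; none of this reflects how the runtime cache was actually written.

```swift
import Foundation

// Hypothetical memoizing wrapper around an expensive name-to-type lookup.
final class CachedTypeLookup {
    private let slowLookup: (String) -> Any.Type?
    private var cache: [String: Any.Type] = [:]
    private let lock = NSLock()
    private(set) var slowLookupCount = 0   // how often the slow path ran

    init(slowLookup: @escaping (String) -> Any.Type?) {
        self.slowLookup = slowLookup
    }

    func type(named name: String) -> Any.Type? {
        lock.lock()
        defer { lock.unlock() }
        if let hit = cache[name] { return hit }
        slowLookupCount += 1
        guard let found = slowLookup(name) else { return nil }
        cache[name] = found                // only successful lookups are cached
        return found
    }
}

// Stand-in for scanning every conformance table in the process.
func scanAllMetadata(_ name: String) -> Any.Type? {
    if name == "Swift.Int" { return Int.self }
    return nil
}

let lookup = CachedTypeLookup(slowLookup: scanAllMetadata)
_ = lookup.type(named: "Swift.Int")
_ = lookup.type(named: "Swift.Int")
print(lookup.slowLookupCount)   // 1: the second call is served from the cache
```

Misses are deliberately not cached in this sketch, since a type might become available later (for example after a library is loaded); whether that matters is a design decision for the real implementation.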

belkadan commented 8 years ago

Nested classes cannot be looked up by unmangled name, because both of these have the same unmangled name:

struct Outer {
  class Inner : NSObject, NSCoding { /*…*/ }
}
class Outer {
  class Inner : NSObject, NSCoding { /*…*/ }
}

It is very unlikely we can use fully-qualified names for anything but top-level non-private non-generic classes. We can attempt to continue jumping through hoops but they get progressively harder for much less benefit and more backwards-deployment concerns.
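The ambiguity can be illustrated with a small sketch that builds both spellings of a nested type's name. It follows the pattern visible in the mangled names quoted in this thread ("C" for a class, "V" for a struct, length-prefixed identifiers) and assumes ASCII identifiers with no substitutions; `Nominal`, `dottedName`, and `legacyMangledName` are hypothetical helpers, not a real mangler.

```swift
// Kind markers as they appear in the mangled names quoted in this thread.
enum NominalKind: String {
    case classType = "C"
    case structType = "V"
}

struct Nominal {
    let kind: NominalKind
    let name: String
}

// "Module.Outer.Inner": the kind of each enclosing type is lost.
func dottedName(module: String, path: [Nominal]) -> String {
    return ([module] + path.map { $0.name }).joined(separator: ".")
}

// Old-style "_Tt" name: kind markers innermost-first, then length-prefixed
// module and type names outermost-first.
func legacyMangledName(module: String, path: [Nominal]) -> String {
    let kinds = path.reversed().map { $0.kind.rawValue }.joined()
    let names = ([module] + path.map { $0.name })
        .map { "\($0.utf8.count)\($0)" }
        .joined()
    return "_Tt" + kinds + names
}

let inner = Nominal(kind: .classType, name: "Inner")
let inStruct = [Nominal(kind: .structType, name: "Outer"), inner]
let inClass = [Nominal(kind: .classType, name: "Outer"), inner]

print(dottedName(module: "Foo", path: inStruct))         // Foo.Outer.Inner
print(dottedName(module: "Foo", path: inClass))          // Foo.Outer.Inner
print(legacyMangledName(module: "Foo", path: inStruct))  // _TtCV3Foo5Outer5Inner
print(legacyMangledName(module: "Foo", path: inClass))   // _TtCC3Foo5Outer5Inner
```

Both dotted names are identical, while the mangled names differ in the marker for the enclosing type, which is why only the mangled form can identify the nested class unambiguously.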


I'm pretty sure we'll want some kind of typeForName in the stdlib. I'm waffling on whether the "Foo.Bar" support belongs in the stdlib or not. We use it on Apple platforms for things like associated files, too (you can name your interface file "MyApp.MainController.xib" and then not have to manually specify the file name inside the controller class), so wanting a "pretty yet dynamically resolvable" name isn't limited to archiving.


A thought: as far as registration goes, if we only supported secure archiving on Linux we'd be set, because you always have to pass the classes you'll allow for a particular decoding session.

lhoward commented 8 years ago

The current implementation of NSKeyedArchiver always encodes nested classes using the mangled name.

Edit: thanks for pointing out the issue with different containing types though. I hadn't thought of that.

phausler commented 8 years ago

per the caching; we could probably cache the accesses in Foundation for now if that makes it easier.

lhoward commented 8 years ago

Per my comment from a few days ago, guidance on what APIs do and do not belong in stdlib is helpful. Recall Foundation only needs to be able to look up a class given a name.

Edit: If it's undesirable to have stdlib expose an API that returns the first type matching a possibly non-unique name, then we can compose the canonical (mangled) class name in Foundation and look that up instead. Let me know.

Recall: Foundation needs to deal with unmangled top-level class names for archive interoperability with Darwin.

lhoward commented 8 years ago

Doing the caching in the stdlib implementation is fine, I just haven't done it yet.

lhoward commented 8 years ago

In NSStringFromClass(), what's the safest way to check if a class is a generic – is it safe to look for "<" in the unmangled string representation or can that character be escaped?

lhoward commented 8 years ago

Not sure I understand the comment about registration. Are you suggesting we only need a function that provides the mangled name given a type? Or are you suggesting that Foundation only needs to support encoding, not decoding?

Providing classes for a decoding session only applies to secure coding and also does not require that the caller explicitly register a class name to class mapping.

Only providing encoding support runs counter to the goal of Foundation having feature parity on both platforms.

belkadan commented 8 years ago

I really think you're doing this backwards. Assume mangled names, pattern-match the one safe case, and turn that back into the fully-qualified name. That's what libobjc does (search for copySwiftV1DemangledName).

I don't remember how to get a mangled name for a type but I think that's in the stdlib somewhere. Or at least a bare runtime entry point.
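The "pattern-match the one safe case" direction might look roughly like the following sketch, which recognises only `_TtC<len>Module<len>Class` and returns nil for everything else. It assumes ASCII identifiers and ignores module abbreviations and private-type discriminators; the function names are hypothetical, and this is not a transcription of libobjc's copySwiftV1DemangledName.

```swift
// Read one length-prefixed identifier such as "3Foo" from the front of
// `rest`, consuming it. Assumes ASCII identifiers.
func readIdentifier(_ rest: inout Substring) -> String? {
    let digits = rest.prefix(while: { $0.isASCII && $0.isNumber })
    guard let length = Int(digits), length > 0,
          rest.count - digits.count >= length else {
        return nil
    }
    rest = rest.dropFirst(digits.count)
    let identifier = rest.prefix(length)
    rest = rest.dropFirst(length)
    return String(identifier)
}

// Turn "_TtC<len>Module<len>Class" into "Module.Class". Anything else
// (nested, generic, non-class) yields nil and keeps its mangled name.
func demangledTopLevelClassName(_ mangled: String) -> String? {
    let prefix = "_TtC"
    guard mangled.hasPrefix(prefix) else { return nil }
    var rest = mangled.dropFirst(prefix.count)
    guard let module = readIdentifier(&rest),
          let name = readIdentifier(&rest),
          rest.isEmpty else {
        return nil
    }
    return "\(module).\(name)"
}

print(demangledTopLevelClassName("_TtC3Foo3Bar") ?? "nil")       // Foo.Bar
print(demangledTopLevelClassName("_TtCC3Foo3Bar3Baz") ?? "nil")  // nil
```

With this shape, the mangled name is the default everywhere and the readable "Module.Class" form is a narrow, explicitly recognised exception.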

jckarter commented 8 years ago

The runtime should handle the core type lookup logic in any case. My concern with using mangled names is that mangled names are bad API for reflective purposes, since they're unreadable and nontrivial to compose. We're already careful about changing the format produced by String(reflecting:) because of compatibility concerns, and the format should already include a token to represent local scopes (and can be fixed to do so if it doesn't). You can compose by string interpolation fairly easily, which is important if you want "Module.Foo<(typeB)>" or something like it. Substitutions (and soon, compression) make this composition expensive for mangled names. Maybe we can design some sort of abstract type grammar that the typeName/typeByName entry points trade in, but that has its own complications.
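As a small illustration of the composition argument, the sketch below builds a generic type's readable name by string interpolation and compares it with what `String(reflecting:)` produces today. `genericTypeName(_:arguments:)` is a hypothetical helper, not a stdlib function.

```swift
// Compose a fully qualified generic type name by plain string interpolation.
func genericTypeName(_ base: String, arguments: [Any.Type]) -> String {
    let list = arguments.map { String(reflecting: $0) }.joined(separator: ", ")
    return "\(base)<\(list)>"
}

let composed = genericTypeName("Swift.Array", arguments: [Int.self])
print(composed)                                     // Swift.Array<Swift.Int>
print(composed == String(reflecting: [Int].self))   // true
```

Doing the same with mangled names would require tracking substitutions such as the `S_` back-references visible in the generic names quoted elsewhere in this thread, which is the cost being pointed out here.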

lhoward commented 8 years ago

In reply to @belkadan, there is _swift_buildDemanglingForMetadata() in stdlib which the PR for this bug report exposes as _canonicalTypeName().

OK, what you say makes sense. I'll remove _typeByName() and put the transformation for the one safe case into Foundation.

lhoward commented 8 years ago

@jckarter two issues with unmangled names:

lhoward commented 8 years ago

How about this then:

private func buildCanonicalNameForClass(aClassName: String) -> String? {
    var name : String

    if aClassName.hasPrefix("_Tt") {
        return aClassName
    }

    var components = aClassName.bridge().componentsSeparatedByString(".")
    if components.count == 1 {
        components = [ _SwiftFoundationModuleName, aClassName ]
    } else if components.count != 2 {
        return nil
    }

    if components[0].isEmpty || components[1].isEmpty {
        return nil
    }

    name = "_TtC"

    if components[0] == "Swift" {
        name += "Ss"
    } else {
        name += String(components[0].length) + components[0]
    }

    name += String(components[1].length) + components[1]

    return name
}

internal func NSClassFromString(aClassName: String) -> AnyClass? {
    guard let canonicalName = buildCanonicalNameForClass(aClassName) else {
        return nil
    }

    return _typeByCanonicalName(canonicalName) as? AnyClass
}

belkadan commented 8 years ago

> Providing classes for a decoding session only applies to secure coding

If we only supported secure coding for Linux (at least for now), this would be good enough to build a list of all possible class names you could encounter in the archive. I'm not saying this is an ideal solution, but it gets Foundation unblocked while we (Foundation and Swift runtime) design and figure out _typeForName—which, AFAIK, we do not have an implementation plan for yet. (Using dlsym is not something we want to ship with.)
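A per-session allow-list of this kind might look like the following sketch, in which the caller supplies the classes it is willing to decode and names are resolved only against that set. `AllowedClasses` is a hypothetical type, names are derived with `String(reflecting:)` rather than with Foundation's actual archive encoding, and it covers only classes the caller names up front.

```swift
// Hypothetical per-session allow-list: the caller supplies the classes it
// is willing to decode, so names resolve without any global type lookup.
struct AllowedClasses {
    private let classesByName: [String: AnyClass]

    init(_ classes: [AnyClass]) {
        var table: [String: AnyClass] = [:]
        for cls in classes {
            table[String(reflecting: cls)] = cls
        }
        classesByName = table
    }

    // Resolve an archived class name, or nil if it was not allowed.
    func resolve(_ archivedName: String) -> AnyClass? {
        return classesByName[archivedName]
    }
}

final class Person {}
final class Address {}

let session = AllowedClasses([Person.self, Address.self])
let archivedName = String(reflecting: Person.self)

print(session.resolve(archivedName) != nil)     // an allowed class resolves
print(session.resolve("Evil.Payload") != nil)   // an unknown name is rejected
```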


@jckarter and I talked this morning for quite a while about mangled vs. demangled names, and didn't quite manage to convince each other either way. Things that came up:

lhoward commented 8 years ago

Just to address the first comment:

See https://github.com/apple/swift/pull/834/files

lhoward commented 8 years ago

Thanks for the analysis above, it all makes sense.

The only thing I would add is: libobjc/Foundation on Darwin uses a particular encoding of class names today (unmangled for top-level classes, mangled otherwise). Corelibs Foundation has an explicit goal of compatibility with Foundation on Darwin, as I understand it. So whilst I understand the desire for a long-term solution perhaps we need something that matches libobjc's behaviour in the interim, with whatever division of responsibility between stdlib and Corelibs Foundation deemed appropriate.

belkadan commented 8 years ago

Yep, sorry, hadn't kept up. Thanks for the new implementation!

> "All possible class names" is unbounded.

…aargh, I forgot that the root object does not itself have a set of expected classes.

lhoward commented 8 years ago

I added a cache but it uses StringMap which pulls in libLLVMSupport (and libswiftCore.dylib seems to be linked with -all_load). Not ideal: we should probably use a different implementation or somehow pull in StringMap.cpp directly.

jckarter commented 8 years ago

We shouldn't link the runtime against any LLVM libraries. Can you use DenseMap instead, or the ConcurrentMap that's implemented in the runtime already?

lhoward commented 8 years ago

Fixed to use ConcurrentMap.

lhoward commented 8 years ago

Until such time as we have a stable and unique de-mangled name format:

public // SPI(Foundation)
func _topLevelClassByName(name: String) -> AnyClass?

Edit: we could just call this typeByName and return Any but I think that is misleading, even though that's the eventual API we want.

I have updated the nscoding branch of Foundation to use this. I am still cleaning up the pull request for stdlib with a view to requesting it to be integrated after the 2.2 branch.

lhoward commented 8 years ago

NSKeyedArchiver/NSKeyedUnarchiver Linux/OS X tests pass with this change.

lhoward commented 8 years ago

One more thing: in the interests of minimizing developer surprise, we should find a way for this API to work where the class has no direct protocol conformance. If emitting dummy conformance records for every type is problematic, we could special case NSCoding. (In the Swift 3 timeframe perhaps a different approach to runtime metadata will exist, but for the interim.)

Foundation is using an explicit dummy protocol for codable classes that inherit NSCoding, but expecting developers to know the limitations of NSClassFromString might be unreasonable.

jckarter commented 8 years ago

We can add that later. Developer surprise isn't a problem for a work-in-progress.

lhoward commented 8 years ago

Here is a patch to include type metadata in a new section, in the case where it is not already included in the protocol conformance table.

https://github.com/apple/swift/pull/834/files

I also (per an earlier comment from @jckarter) reverted the API consumed by Foundation to _typeByName() -> Any.Type?. On reflection (no pun intended), maybe it's better to commit to a final signature even though the functionality will be explicitly limited for now.

lhoward commented 8 years ago

Thank you to @jckarter for merging this. Confirmed that Foundation builds and passes its tests with the current Swift compiler/runtime from master. @phausler, let me know what I can do to help you get the lhoward/nscoding branch merged.

lhoward commented 8 years ago

Merged to master.

lhoward commented 8 years ago

Foundation NSKeyedArchiver/NSKeyedUnarchiver tests pass on Linux with swift master (thanks to @gribozavr for fixing the build regression I accidentally introduced!).

swift-ci commented 8 years ago

Comment by admin (JIRA)

(updating fields only)