Closed lhoward closed 8 years ago
I think that the mapping of Foundation classes is totally reasonable, provided that is the limit of the mapping. Asking for user-level code to be modified for that mapping is where I start to have reservations about compatibility/portability of code.
So far the pull request does not have Foundation specific info here besides that it is claimed as an SPI.
Regarding the commentary on `@objc` syntax: I think we can safely treat that as a separate issue, and I tend to agree that if you buy into something that is not implemented on Linux, then it is reasonable to expect that you need to make affordances for that compatibility. (So 100% in agreement with you there.)
So I think we are roughly on the same page? That my example is something we need to match and this is a good start to it?
Consider a third-party application with classes that inherit from NSObject. On platforms using Corelibs Foundation, these will be native Swift classes. True, the Objective-C classes have been replaced by Swift classes, but is it congruent with Corelibs Foundation's portability objectives to require the developer to care?
Anyway, I'm not entirely clear where I should go from here. Obviously (even if it pains me a little!) putting a static mapping into Foundation is easy.
But what to do about third-party consumers of the Foundation archiving API? Do we say they're completely unsupported (for now)? Do we support it only with mangled names and if so, will stdlib take an API for looking up a type by mangled name (and converse)?
I still don't understand the confusion. If class Foo is a subclass of a Foundation class, that doesn't affect how Foo's name is encoded at all, only what classes are used as potential fallbacks. (And it's not even really correct to use every superclass as a fallback.)
The way I see it, only the parts of the API that deal with names (*className*) have to worry about any of this anyway, and then yes, of course you have to use the mangled name, because that's how runtime names work in Swift. (That would almost be a reason not to special-case top-level classes, but I guess it's too late for that.)
I guess that does mean that any decision we make about top-level non-generic non-private classes does have to be publicly documented.
I am just going to rip out the dynamic lookup code from NSKeyedArchiver/NSKeyedUnarchiver, that seems like the simplest solution.
Re: the superclass hierarchy has no bearing on the encoding of Foo's name, noted – I was confused because I (incorrectly) thought you were proposing on Linux that the mangled name was always used.
I am still unclear as to:
- whether you think stdlib should have an API for looking up a Type given a mangled name in the near term
- if Foundation encoding of top-level classes is to use the unmangled name, who is responsible for looking up a Type given a top-level unmangled name
Here is an outline of what Foundation needs to do the correct thing:
// in module Foo
class Bar : NSObject {
    class Baz : NSObject { }
}
class AnotherClass { }
@objc(Squirrel)
class Белка: NSObject { }
print("\(NSStringFromClass(Bar.self))")
print("\(NSStringFromClass(Bar.Baz.self))")
print("\(NSStringFromClass(AnotherClass.self))")
print("\(NSStringFromClass(NSTask.self))")
print("\(NSStringFromClass(Белка.self))")
The reasonable case
NSStringFromClass(Bar.self) -> "Foo.Bar"
If we get this working, great; if not, we can deal with it later. We could even forbid this type for an initial implementation and it would still be a win in my book. Or we could declare the Darwin behavior a bug. So this honestly is up in the air.
NSStringFromClass(Bar.Baz.self) -> "_TtCC3Foo3Bar3Baz"
Also relatively reasonable
NSStringFromClass(AnotherClass.self) -> "Foo.AnotherClass"
The specialized case for Foundation classes (part of the implementation of NSStringFromClass et al.).
Foundation does this from inside NSStringFromClass and NSClassFromString, and NOT from the stdlib.
NSStringFromClass(NSTask.self) -> "NSTask"
I don't think it is a goal to have archives encoded with Objective-C names be decodable in Swift via the @objc syntax, since that is not portable: there is no Objective-C support in Swift on Linux.
NSStringFromClass(Белка.self) -> "Squirrel"
If we got these behaviors, NSStringFromClass, NSKeyedArchiver, etc. would behave exactly as they do with the Objective-C implementations. Since String(reflecting: T.self) already emits enough information for the type-to-name direction, the only missing piece is getting a class from a given name, and really only for classes that adopt NSCoding (to solve the upper-level issue).
So, just to be clear about what I think should be done here in the runtime: we need a way to get a type given a fully qualified name:
_typeByName("Foo.Bar") -> Bar.self
_typeByName("Foo.AnotherClass") -> AnotherClass.self
_typeByName("NSTask") -> nil
_typeByName("Foundation.NSTask") -> NSTask.self
_typeByName("_TtCC3Foo3Bar3Baz") -> Bar.Baz.self // gravy if we can get it but not a needed item to make it work
_typeByName("Squirrel") -> Белка.self // Not needed in linux because there is no Белка class since the @objc won't compile there
Right, which is what _typeByName() as submitted in the PR does (not for the gravy case; it could also search mangled names, but perhaps that's better in a separate API). And note it was marked as Foundation SPI, as in "good enough until we have a better solution".
@phausler I created the lhoward/nscoding-static branch with a Q&D static mapping, turning NSStringFromClass()/NSClassFromString() into a static lookup. I'd prefer a proper solution but I'd really like some guidance on what the interfaces between Foundation and stdlib should look like to avoid wasting effort.
Also to be clear, I was never proposing the NSStringFromClass()/NSClassFromString() behaviour be embedded in stdlib (and perhaps this added to some of the confusion).
What I was asking is:
1. Will stdlib take a patch to implement a name to Type function in the near term?
2. If so, what would the format of the name be?
3. If the format is a mangled (_Tt) name, and you agree that Foundation should continue to encode top-level classes with the unmangled (Foo.Bar) name, how do you propose one look up a Type given such an unmangled top-level name?
   - Require static mappings in all cases
   - Provide a separate API to transform a top-level class name into a mangled name (if so, where should this live – Foundation or stdlib or somewhere else)
   - Provide a separate name to Type function that takes a top-level class name (which is ostensibly what is in the PR)
Here is a proposal. A stdlib interface (marked Foundation SPI) that allows the following:
protocol SomeProtocol {}
class SomeClass {
    class Nested : SomeProtocol {}
}
let name = String(reflecting: SomeClass.Nested.self)
let persistentName = _persistentTypeName(SomeClass.Nested.self)
print("_typeName(SomeClass.Nested) == \(name)")
print("_persistentTypeName(SomeClass.Nested) == \(persistentName)")
if let type = _typeByName(name) {
    print("_typeByName(\(name)) == \(String(reflecting: type))")
}
if let type = _typeByPersistentName(persistentName) {
    print("_typeByPersistentName(\(persistentName)) == \(String(reflecting: type))")
}
which prints:
% ./typeName
_typeName(SomeClass.Nested) == typeName.SomeClass.Nested
_persistentTypeName(SomeClass.Nested) == _TtCC8typeName9SomeClass6Nested
_typeByName(typeName.SomeClass.Nested) == typeName.SomeClass.Nested
_typeByPersistentName(_TtCC8typeName9SomeClass6Nested) == typeName.SomeClass.Nested
This – that is, separate interfaces for both mangled and non-mangled type names – makes implementing the existing NSClassFromString()/NSStringFromClass() behaviour straightforward (see below). I'm trying to understand enough about the de/re-mangler to support generic classes.
If this interface isn't right, then I'd certainly welcome some ideas on how you see Foundation interfacing with the runtime in order to implement name to class (and class to name) lookup for archiving.
Edit: "canonical", "mangled", etc. are possible alternatives to "persistent".
And here is how NSStringFromClass/NSClassFromString might be implemented:
func NSStringFromClass(aClass: AnyClass) -> String {
    let aClassName = String(reflecting: aClass).bridge()
    let components = aClassName.componentsSeparatedByString(".")
    if components.count == 2 {
        if components[0] == _SwiftFoundationModuleName {
            return components[1]
        } else {
            return String(aClassName)
        }
    } else {
        return _canonicalTypeName(aClass)
    }
}
func NSClassFromString(aClassName: String) -> AnyClass? {
    var aClass : Any.Type? = nil
    if aClassName.hasPrefix("_Tt") {
        aClass = _typeByCanonicalName(aClassName)
    } else if aClassName.characters.indexOf(".") == nil {
        aClass = _typeByName(_SwiftFoundationModuleName + "." + aClassName)
    } else {
        aClass = _typeByName(aClassName)
    }
    return aClass as? AnyClass
}
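The name-selection rule in NSStringFromClass above depends only on two strings, so it can be tried out in isolation. What follows is a minimal sketch in current Swift syntax, not the code from the pull request: `archiveClassName` and its parameters are hypothetical names, and the default module name "Foundation" is an assumption standing in for _SwiftFoundationModuleName.

```swift
/// Picks the name to write into an archive for a class.
/// Hypothetical helper: both names are passed in as plain strings so that
/// the rule can be exercised without any runtime SPI.
func archiveClassName(qualifiedName: String,
                      mangledName: String,
                      foundationModule: String = "Foundation") -> String {
    // "Module.Class" has exactly two components; nested or generic types have more.
    let components = qualifiedName.split(separator: ".", omittingEmptySubsequences: false)
    guard components.count == 2 else {
        return mangledName              // anything but a top-level class: mangled name
    }
    if components[0] == foundationModule {
        return String(components[1])    // Foundation classes drop the module prefix
    }
    return qualifiedName                // other top-level classes: "Module.Class"
}

print(archiveClassName(qualifiedName: "Foundation.NSTask",
                       mangledName: "_TtC10Foundation6NSTask"))  // NSTask
print(archiveClassName(qualifiedName: "Foo.Bar",
                       mangledName: "_TtC3Foo3Bar"))             // Foo.Bar
print(archiveClassName(qualifiedName: "Foo.Bar.Baz",
                       mangledName: "_TtCC3Foo3Bar3Baz"))        // _TtCC3Foo3Bar3Baz
```

These three results line up with the NSTask, Bar and Bar.Baz cases listed earlier in the thread.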
Nice! That looks like exactly what we need; I wish we didn't have to have a differing constant for the module name of Foundation but that is a whole different ball of wax.
Thanks @phausler. I've a patch on the way that will only support classes (at least for the name to type function), but I think it will be relatively straightforward (at least compared to _typeByName) to add support for generics if I can get some tips on the demangling API.
I've updated the pull request here:
https://github.com/apple/swift/pull/834/files
In _metadataForMangledName(), I can see it should be possible to demangle the name into the generic and specialised components and funnel this into swift_getGenericMetadata(), but I haven't quite figured out how to put it all together yet. Tips welcome. 🙂
has an example of archiving nested classes, which now works.
OK, I also have lookups of generic types working 🙂
_canonicalTypeName(GenericStruct<Nested, String>) == _TtGV8typeName13GenericStructCCS_9SomeClass6NestedSS_
_typeByCanonicalName(_TtGV8typeName13GenericStructCCS_9SomeClass6NestedSS_) == GenericStruct<Nested, String>
Spoiler alert: generic type metadata lookup does use dladdr(), so it requires public (on Linux also relocatable) symbols (which arguably makes the whole effort a bit pointless as we already rejected the approach of using dlsym() generally). (If the generic metadata included a name this could be avoided.)
Doing _typeBy[Canonical]Name for all types is going to require some careful thought, but I'm of the "something is better than nothing" school of thought for getting Foundation archiving bootstrapped.
Adding a cache would be good, it's a pretty expensive lookup if there are a lot of conformance tables.
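As a rough illustration of the caching idea, here is a sketch of a memoizing wrapper around an expensive name-to-type lookup. It is written at the Swift level purely for illustration (the lookup in the pull request lives in the runtime); `TypeLookupCache` and the `slowLookup` closure are hypothetical, and the closure stands in for the conformance-table scan.

```swift
import Foundation

/// Memoizes an expensive name-to-type lookup (hypothetical wrapper).
/// Negative results are cached too, so unknown names are scanned for only once.
final class TypeLookupCache {
    private var storage: [String: Any.Type?] = [:]
    private let lock = NSLock()
    private let slowLookup: (String) -> Any.Type?

    init(slowLookup: @escaping (String) -> Any.Type?) {
        self.slowLookup = slowLookup
    }

    func type(named name: String) -> Any.Type? {
        lock.lock()
        defer { lock.unlock() }
        // The outer optional says "is there a cache entry"; the inner one is the result.
        if let cached = storage[name] {
            return cached
        }
        // For simplicity the slow lookup runs while the lock is held.
        let result = slowLookup(name)
        storage.updateValue(result, forKey: name)
        return result
    }
}
```

Holding the lock across the slow lookup keeps the sketch short; a real implementation would more likely use a concurrent map so that readers do not serialize.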
Nested classes cannot be looked up by unmangled name, because both of these have the same unmangled name:
struct Outer {
    class Inner : NSObject, NSCoding { /*…*/ }
}
class Outer {
    class Inner : NSObject, NSCoding { /*…*/ }
}
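To make the collision concrete, the sketch below models both name formats for a nested type. It is a simplified model of the old-style mangling, inferred from names quoted in this thread such as _TtCC3Foo3Bar3Baz; it ignores substitutions, generics and non-ASCII identifiers, and `Kind`, `Nominal`, `qualifiedName` and `mangledName` are hypothetical helpers rather than runtime API.

```swift
/// Kind markers as they appear in the mangled names quoted in this thread.
enum Kind: String {
    case classType = "C"
    case structType = "V"
}

/// One level of a nesting path, outermost first.
struct Nominal {
    let kind: Kind
    let name: String
}

/// The unmangled form: kinds are dropped, only names survive.
func qualifiedName(module: String, path: [Nominal]) -> String {
    return ([module] + path.map { $0.name }).joined(separator: ".")
}

/// Simplified old-style mangling: kind markers innermost first,
/// then length-prefixed module and type names outermost first.
func mangledName(module: String, path: [Nominal]) -> String {
    let kinds = path.reversed().map { $0.kind.rawValue }.joined()
    let identifiers = [module] + path.map { $0.name }
    let encoded = identifiers.map { "\($0.utf8.count)\($0)" }.joined()
    return "_Tt" + kinds + encoded
}

let inStruct = [Nominal(kind: .structType, name: "Outer"),
                Nominal(kind: .classType, name: "Inner")]
let inClass = [Nominal(kind: .classType, name: "Outer"),
               Nominal(kind: .classType, name: "Inner")]

print(qualifiedName(module: "Foo", path: inStruct))  // Foo.Outer.Inner
print(qualifiedName(module: "Foo", path: inClass))   // Foo.Outer.Inner
print(mangledName(module: "Foo", path: inStruct))    // _TtCV3Foo5Outer5Inner
print(mangledName(module: "Foo", path: inClass))     // _TtCC3Foo5Outer5Inner
```

Under this model the qualified names are identical while the mangled names differ only in the kind marker of the containing type, which is why the mangled form is needed to tell the two apart.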
It is very unlikely we can use fully-qualified names for anything but top-level non-private non-generic classes. We can attempt to continue jumping through hoops but they get progressively harder for much less benefit and more backwards-deployment concerns.
I'm pretty sure we'll want some kind of typeForName in the stdlib. I'm waffling on whether the "Foo.Bar" support belongs in the stdlib or not. We use it on Apple platforms for things like associated files, too (you can name your interface file "MyApp.MainController.xib" and then not have to manually specify the file name inside the controller class), so wanting a "pretty yet dynamically resolvable" name isn't limited to archiving.
A thought: as far as registration goes, if we only supported secure archiving on Linux we'd be set, because you always have to pass the classes you'll allow for a particular decoding session.
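A sketch of what that could look like: if every decoding session is handed its allowed classes up front, a name-to-class table can be built from that list without any runtime lookup. `makeAllowedClassTable` is a hypothetical helper, and keying the table by `String(reflecting:)` is an assumption made for illustration, not the archive format discussed above.

```swift
/// Builds a name-to-class table from the classes allowed in one decoding session.
/// Hypothetical helper; keys use String(reflecting:) purely for illustration.
func makeAllowedClassTable(_ allowed: [AnyClass]) -> [String: AnyClass] {
    var table: [String: AnyClass] = [:]
    for cls in allowed {
        table[String(reflecting: cls)] = cls
    }
    return table
}

final class Document {}
final class Page {}

let allowedTable = makeAllowedClassTable([Document.self, Page.self])
let documentName = String(reflecting: Document.self)
if let found = allowedTable[documentName] {
    print("\(documentName) resolves to \(found)")
}
```

The table only covers classes the caller thought to list, so it cannot resolve a name that was never passed in.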
The current implementation of NSKeyedArchiver always encodes nested classes using the mangled name.
Edit: thanks for pointing out the issue with different containing types though. I hadn't thought of that.
Regarding caching: we could probably cache the accesses in Foundation for now if that makes it easier.
Per my comment from a few days ago, guidance on what APIs do and do not belong in stdlib is helpful. Recall Foundation only needs to be able to look up a class given a name.
Edit: If it's undesirable to have stdlib expose an API that returns the first type matching a possibly non-unique name, then we can compose the canonical (mangled) class name in Foundation and look that up instead. Let me know.
Recall: Foundation needs to deal with unmangled top-level class names for archive interoperability with Darwin.
Doing the caching in the stdlib implementation is fine, I just haven't done it yet.
In NSStringFromClass(), what's the safest way to check if a class is generic – is it safe to look for "<" in the unmangled string representation, or can that character be escaped?
Not sure I understand the comment about registration. Are you suggesting we only need a function that provides the mangled name given a type? Or are you suggesting that Foundation only needs to support encoding, not decoding?
Providing classes for a decoding session only applies to secure coding and also does not require that the caller explicitly register a class name to class mapping.
Only providing encoding support runs counter to the goal of Foundation having feature parity on both platforms.
I really think you're doing this backwards. Assume mangled names, pattern-match the one safe case, and turn that back into the fully-qualified name. That's what libobjc does (search for copySwiftV1DemangledName).
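A minimal sketch of that "pattern-match the one safe case" direction, assuming the old-style form `_TtC<len><module><len><class>` seen elsewhere in this thread. `demangledTopLevelClassName` is a hypothetical name, and the sketch deliberately rejects everything else (nested types, generics, substitutions such as the Swift module shorthand); it is not a port of the libobjc function.

```swift
/// Turns "_TtC3Foo3Bar" into "Foo.Bar"; returns nil for anything that is not
/// a plain top-level class name. Hypothetical helper, old-style mangling only.
func demangledTopLevelClassName(_ mangled: String) -> String? {
    let bytes = Array(mangled.utf8)
    let prefix = Array("_TtC".utf8)
    guard bytes.starts(with: prefix) else { return nil }
    var index = prefix.count
    let zero = UInt8(ascii: "0")
    let nine = UInt8(ascii: "9")

    // Reads one length-prefixed identifier such as "3Foo" and advances `index`.
    func readIdentifier() -> String? {
        var length = 0
        var sawDigit = false
        while index < bytes.count, bytes[index] >= zero, bytes[index] <= nine {
            length = length * 10 + Int(bytes[index] - zero)
            if length > 1_000_000 { return nil }  // guard against absurd lengths
            sawDigit = true
            index += 1
        }
        guard sawDigit, length > 0, index + length <= bytes.count else { return nil }
        defer { index += length }
        return String(decoding: bytes[index..<(index + length)], as: UTF8.self)
    }

    guard let module = readIdentifier(),
          let name = readIdentifier(),
          index == bytes.count else {   // trailing bytes mean it was not the safe case
        return nil
    }
    return "\(module).\(name)"
}

print(demangledTopLevelClassName("_TtC3Foo3Bar") ?? "nil")       // Foo.Bar
print(demangledTopLevelClassName("_TtCC3Foo3Bar3Baz") ?? "nil")  // nil
```

Anything that fails the pattern would simply keep its mangled name, which matches the nested-class case listed earlier.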
I don't remember how to get a mangled name for a type but I think that's in the stdlib somewhere. Or at least a bare runtime entry point.
The runtime should handle the core type lookup logic in any case. My concern with using mangled names is that they are bad API for reflective purposes, since they're unreadable and nontrivial to compose. We're already careful about changing the format produced by String(reflecting:) because of compatibility concerns, and the format should already include a token to represent local scopes (and can be fixed to do so if it doesn't). You can compose by string interpolation fairly easily, which is important if you want "Module.Foo<(typeB)>" or something like it. Substitutions (and soon, compression) make this composition expensive for mangled names. Maybe we can design some sort of abstract type grammar that the typeName/typeByName entry points trade in, but that has its own complications.
In reply to @belkadan, there is _swift_buildDemanglingForMetadata() in stdlib which the PR for this bug report exposes as _canonicalTypeName().
OK, what you say makes sense. I'll remove _typeByName() and put the transformation for the one safe case into Foundation.
@jckarter two issues with unmangled names:
- Can we guarantee the name produced by String(reflecting:) is always unique? See @belkadan's comment about nested classes with different containing types
- For Foundation, are we OK with breaking compatibility with existing archives that encode the mangled name?
How about this then:
private func buildCanonicalNameForClass(aClassName: String) -> String? {
    var name : String
    if aClassName.hasPrefix("_Tt") {
        return aClassName
    }
    var components = aClassName.bridge().componentsSeparatedByString(".")
    if components.count == 1 {
        components = [ _SwiftFoundationModuleName, aClassName ]
    } else if components.count != 2 {
        return nil
    }
    if components[0].isEmpty || components[1].isEmpty {
        return nil
    }
    name = "_TtC"
    if components[0] == "Swift" {
        name += "Ss"
    } else {
        name += String(components[0].length) + components[0]
    }
    name += String(components[1].length) + components[1]
    return name
}
internal func NSClassFromString(aClassName: String) -> AnyClass? {
    guard let canonicalName = buildCanonicalNameForClass(aClassName) else {
        return nil
    }
    return _typeByCanonicalName(canonicalName) as? AnyClass
}
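For readers on a current toolchain, here is a sketch of the same name-building step in present-day Swift syntax, with a few sample inputs. `canonicalClassName` is a hypothetical name, the default module "Foundation" stands in for _SwiftFoundationModuleName, and identifier lengths are taken as UTF-8 byte counts, which is an assumption that only holds for ASCII names.

```swift
/// Builds an old-style mangled name for a top-level class from an archived name.
/// Hypothetical port of the function above; ASCII identifiers assumed.
func canonicalClassName(for archivedName: String,
                        foundationModule: String = "Foundation") -> String? {
    // Already mangled: pass through unchanged.
    if archivedName.hasPrefix("_Tt") {
        return archivedName
    }
    var parts = archivedName
        .split(separator: ".", omittingEmptySubsequences: false)
        .map { String($0) }
    // A bare name such as "NSTask" is taken to be a Foundation class.
    if parts.count == 1 {
        parts.insert(foundationModule, at: 0)
    }
    guard parts.count == 2, !parts[0].isEmpty, !parts[1].isEmpty else {
        return nil
    }
    // The Swift module uses the short form "Ss", as in the function above.
    let module = parts[0] == "Swift" ? "Ss" : "\(parts[0].utf8.count)\(parts[0])"
    return "_TtC" + module + "\(parts[1].utf8.count)\(parts[1])"
}

print(canonicalClassName(for: "NSTask") ?? "nil")       // _TtC10Foundation6NSTask
print(canonicalClassName(for: "Foo.Bar") ?? "nil")      // _TtC3Foo3Bar
print(canonicalClassName(for: "Foo.Bar.Baz") ?? "nil")  // nil
```

Names with more than two components are rejected rather than guessed at, since an unmangled nested name does not say what kind of type contains it.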
> Providing classes for a decoding session only applies to secure coding
If we only supported secure coding for Linux (at least for now), this would be good enough to build a list of all possible class names you could encounter in the archive. I'm not saying this is an ideal solution, but it gets Foundation unblocked while we (Foundation and Swift runtime) design and figure out _typeForName, which, AFAIK, we do not have an implementation plan for yet. (Using dlsym is not something we want to ship with.)
@jckarter and I talked this morning for quite a while about mangled vs. demangled names, and didn't quite manage to convince each other either way. Things that came up:
Just to address the first comment:
I don't really see the point of one-way archiving. "All possible class names" is unbounded for third-party applications. A static class mapping for Foundation only is the lesser evil here (which I am happy to leave to Apple to implement).
The current implementation does not use dlsym(); it uses protocol conformance tables, which is sufficient for Foundation (as long as the class directly conforms to a protocol, which it will by virtue of NSCoding unless it is a subclass of a codable class). This approach was suggested by @jckarter. (For classes with inherited conformance, explicitly conforming to a dummy protocol is a workaround. The implementation for generic classes does depend on dladdr(), granted.)
Thanks for the analysis above, it all makes sense.
The only thing I would add is: libobjc/Foundation on Darwin uses a particular encoding of class names today (unmangled for top-level classes, mangled otherwise). Corelibs Foundation has an explicit goal of compatibility with Foundation on Darwin, as I understand it. So whilst I understand the desire for a long-term solution perhaps we need something that matches libobjc's behaviour in the interim, with whatever division of responsibility between stdlib and Corelibs Foundation deemed appropriate.
Yep, sorry, hadn't kept up. Thanks for the new implementation!
> "All possible class names" is unbounded.
…aargh, I forgot that the root object does not itself have a set of expected classes.
I added a cache but it uses StringMap which pulls in libLLVMSupport (and libswiftCore.dylib seems to be linked with -all_load). Not ideal: we should probably use a different implementation or somehow pull in StringMap.cpp directly.
We shouldn't link the runtime against any LLVM libraries. Can you use DenseMap instead, or the ConcurrentMap that's implemented in the runtime already?
Fixed to use ConcurrentMap.
Until such time as we have a stable and unique de-mangled name format:
public // SPI(Foundation)
func _topLevelClassByName(name: String) -> AnyClass?
Edit: we could just call this typeByName and return Any but I think that is misleading, even though that's the eventual API we want.
I have updated the nscoding branch of Foundation to use this. I am still cleaning up the pull request for stdlib with a view to requesting it to be integrated after the 2.2 branch.
NSKeyedArchiver/NSKeyedUnarchiver Linux/OS X tests pass with this change.
One more thing: in the interests of minimizing developer surprise, we should find a way for this API to work where the class has no direct protocol conformance. If emitting dummy conformance records for every type is problematic, we could special case NSCoding. (In the Swift 3 timeframe perhaps a different approach to runtime metadata will exist, but for the interim.)
Foundation is using an explicit dummy protocol for codable classes that inherit NSCoding, but expecting developers to know the limitations of NSClassFromString might be unreasonable.
We can add that later. Developer surprise isn't a problem for a work-in-progress.
Here is a patch to include type metadata in a new section, in the case where it is not already included in the protocol conformance table.
https://github.com/apple/swift/pull/834/files
I also (per an earlier comment from @jckarter) reverted the API consumed by Foundation to _typeByName() -> Any.Type?. On reflection (no pun intended), maybe it's better to commit to a final signature even though the functionality will be explicitly limited for now.
Thank you to @jckarter for merging this. Confirmed Foundation builds and tests out with the current Swift compiler/runtime from master. @phausler, let me know what I can do to help you get the lhoward/nscoding branch merged.
Merged to master.
Foundation NSKeyedArchiver/NSKeyedUnarchiver tests pass on Linux with swift master (thanks to @gribozavr for fixing the build regression I accidentally introduced!).
Additional Detail from JIRA
| | |
|------------------|-----------------|
|Votes | 1 |
|Component/s | Standard Library |
|Labels | New Feature, AffectsABI, Runtime |
|Assignee | @jckarter |
|Priority | Medium |

md5: 13f1c2ae7174aab8983f1dde3a976c53

blocks:
relates to:
Issue Description:
This may be useful to implement NSStringFromClass()-compatible behaviour which is necessary in order to implement NSKeyedArchiver. I am using the attached workaround at the moment but obviously it is very fragile.