Open NachoSoto opened 7 years ago
Can you describe in more detail why you to be able to translate strings to enum cases outside of the already provided text-format/JSON encoding/decoding processes?
Yup. In my particular example, I have this enum
:
enum NotificationCategory {
PUSH_NOTIFICATION_CATEGORY_1 = 0;
PUSH_NOTIFICATION_CATEGORY_2 = 1;
// ... etc
}
Then, from my backend, I'm sending notifications to Apple's APNS with that category ("PUSH_NOTIFICATION_CATEGORY_1"
, etc). In the client I can then easily get the category from the push notification body, and deserialize it that way.
I could of course send the rawValue
s instead, and change my app .plist
to numbers, but that would obscure the meaning of these categories, so it would be a lot better if I could continue sending these raw String
s.
My point of view is going to come as that of an API purist. 😄
If we consider the main purpose of protocol buffers (fast small binary encoding with type-safe codegen in your favorite language), then the enum case name that you have there is intended for two main purposes: the basis of the generated symbol used in Swift source code, and the string used to represent the value in text-format- and JSON-encoded representations of a message containing it. Relying on the enum value name as a string in isolation, if you can't transfer a full message payload in text format or JSON that can parsed through the public APIs, is kind of skirting the edge of the intended use cases.
One of the goals of protobuf is to produce idiomatic code for its languages. Swift enums traditionally don't provide a string/value mapping of their own and don't require it to be useful. The relationship between the proto name and the Swift name isn't terribly relevant to pure Swift code; it's external protobuf metadata.
One could also argue that sending the numbers is better bandwidth-wise than the names, but I can sympathize with the desire to have more readable names in files like plists that aren't generated from .proto files. For particular enum where you want a more meaningful representation, you can provide your own string mapping, but you also have to keep that manually in sync if you update your protos.
So, in our efforts to slim down the public API to what makes sense for Swift users, we reduced the enum API to only the cases and their raw values. This also serves another purpose—as a longer term goal we've been hoping to refactor the way generated fields/enum values expose their names so that we could provide an option to remove those strings completely from the code for users who only needed binary encoding/decoding (#18). There are big wins there for the mobile use cases with respect to binary size.
The way that other languages would grant you access to this metadata is through descriptors (more or less reflection for protos). For example, the Objective-C type that's generated for a proto is just a standard C enum
, so the name mapping isn't available there; the runtime provides GPBEnumDescriptor
instead which has the API for that. But we intentionally chose not to emit descriptors because of the bloat they would add to clients†.
So... all this philosophical stuff because of enum strings. 😄 My thinking is, if we want to support this, the right way to do it would be to emit those descriptors—perhaps as a command-line option so users have to opt-in to the bloat—and people who need reflective capabilities can use those. This isn't on the 1.0 milestone track, and we haven't discussed it too much—we certainly haven't designed it for Swift yet.
@thomasvl, you and I have had a lot of conversations about descriptors and you have more experience in this area. What are your thoughts?
(† Of course, it would be interesting to see how good swiftc
and the linker do here. If we generated descriptors internal
and you only referenced one of them, would the rest get dead-stripped in the final executable?)
First up, Descriptor
s in the C++ api, and are just the proto objects, they actually are something built out of the proto data so they can be more of a graph of objects to show relationships. The ObjC were along those lines. This means they aren't cheap, they have the data cost as well as a runtime hit to build them up when access.
Descriptors also have a major down side to some customers, the bundle everything into the binary needed to know how to decodes the Apps wire formats. Our visitor pattern likely comes close to including the same data, but it isn't as easy to data mine as waking static data of a binary to find descriptor sets set.
To the original question, I'm still missing something. So you've got a push notification in Apple's format, and you want to transform it into a proto message? I'm assuming there is more than just an single enum value here. So it sounds like you likely have to do custom code anyways, so while the enum api might have been convent, it sounds like it was more chance than a real complete solution. So if custom code is already there, doing something to manage your name mapping doesn't seem like that big of an issue. Certainly comparing to potentially adding a public api to every enum to bloats the code for all users?
This means they [descriptors] aren't cheap, they have the data cost as well as a runtime hit to build them up when access.
I've heard some complaints about the run-time data size of using descriptors, especially from iOS developers who have to deal with sometimes severe limits on data space (iOS can can demand-page code, but does not swap data space, so data space is actually much more valuable than code).
@NachoSoto: We have pieces of what you need internally -- the JSON and Text encoding/decoding engines do need this text information. The tricky part is working out the right API. In particular, we're trying to ensure the right behavior if the Enum does not support ProtoNameProviding
.
@NachoSoto: We have pieces of what you need internally -- the JSON and Text encoding/decoding engines do need this text information
Right, that's what my thinking was: it shouldn't add any extra size, right? IMO this is just a matter of picking the right API, like you said. I do understand your comments about discouraging using this data, and it's really not a big deal for me to switch this use case to using Int
s instead. But I imagine I won't be the only one that will need this in Swift, since it is available in other languages.
Any update on this?
When you say it is available in other languages, can you point to examples? For C++, I think you've have to use full on TextFormat and can't do it on a per enum case without going to the Descriptor. Exposing something as complete as the Descriptor is what we are really holding off from for a 1.0 release.
While I can appreciate your need for something like this to improve the legibility of some other parts of your code base where you refer to enums outside of a protobuf encoding context, I'm still inclined to hold my initial reasoning here:
But, as @thomasvl mentioned, we're holding off on descriptors for 1.0 because they add bloat that many people don't need (and they're an all-or-nothing proposition), and I would very much like to not scatter bits and pieces of descriptor-like functionality as public API throughout the library to satisfy individual use cases that pop up. If we want to support these transformations, we should do it the right way.
In the meantime, I would recommend providing your own string-to-enum transformation or using the numeric values.
@NachoSoto: If I understand correctly, you're basically asking for a readable text serialization mechanism so you can put textual enum values into some other serialization mechanism (plist).
You might consider using the protobuf "Text Format" serialization for this instead of just the bare enum name. If you have a message with a single field containing the enum, like this:
enum NotificationCategory {
PUSH_NOTIFICATION_CATEGORY_1 = 0;
PUSH_NOTIFICATION_CATEGORY_2 = 1;
// ... etc
}
message Notification {
NotificationCategory category = 1;
}
Then if you construct a Notification
message and serialize it using textFormatString()
you will get this:
category: PUSH_NOTIFICATION_CATEGORY_1
You could then store this entire string (including the field name) in your push notification and deserialize it at the other end using init(textFormatString:)
. The format used by textFormatString()
is really intended for debugging use, but it is a stable, robust, and performant format that has been used for many other purposes.
As a bonus, this would allow you to expand the Notification
message here in the future with additional information as necessary.
Thanks for the responses!
It's not just this particular feature, it's also a bunch of places in our web that need to continue supporting these string enum values for backwards compatibility, because we haven't migrated JS to use Protobuf, and we won't any time soon.
When you say it is available in other languages, can you point to examples?
For example in Java Enum#valueOf
, or the fact that the JSON serialization uses this format, so being able to deserialize it allows maintaining compatibility with other systems that are using the same data representation but not necessarily Protobuf.
In the meantime, I would recommend providing your own string-to-enum transformation or using the numeric values.
I'll do that in the meantime 👍
But, as @thomasvl mentioned, we're holding off on descriptors for 1.0 because they add bloat that many people don't need.
Any idea what the timeframe for that would be after 1.0?
In case anyone is curious, this is the approach I went with for now:
import SwiftProtobuf
// See https://github.com/apple/swift-protobuf/issues/326 for why this is necessary
public protocol ProtobufEnum: SwiftProtobuf.Enum {
init?(identifier: String)
var identifier: String { get }
static var allValues: Set<Self> { get }
}
Quick
import Quick
import Nimble
import SwiftProtobuf
class ProtobufEnumBehavior<T: ProtobufEnum>: Behavior<Set<T>> {
override class func spec(_ allValues: @escaping () -> Set<T>) {
for value in allValues() {
it("Serializes identifier for \(value)") {
expect(T(identifier: value.identifier)) == value
}
}
}
}
I was relying on this constructor, and it's now gone (see https://github.com/apple/swift-protobuf/pull/320/files#r103062915). Was there a reason for that? Would it be possible to bring it back?
Thanks!