apple / swift-protobuf

Plugin and runtime library for using protobuf with Swift
Apache License 2.0
4.57k stars 449 forks source link

Proposal: Nest generated message types to match proto package structure #750

Open dduan opened 6 years ago

dduan commented 6 years ago

From the general convention section of API Design Guideline:

Follow case conventions. Names of types and protocols are UpperCamelCase. Everything else is lowerCamelCase.

Currently, symbols such as Google_Protobuf_BoolValue and Google_Protobuf_Syntax vended by this project doesn't follow this Swift naming convention. Further, the generated types from protos have the same issue.

I propose these names be deprecated. Instead the symbols can be exposed under a namespace like so:

public enum GoogleProtobuf {
  public struct BoolValue {
    // …
  }
}

To maintain backward compatibility, typealias the new names to the old:

typealias Google_Protobuf_BoolValue = GoogleProtobuf.BoolValue

…until the next major release (or add deprecation warnings in next release, and drop the aliases in the one after).

The Swift code generator's behavior should be updated accordingly in a consistent manner. A proto Z in package X.Y should become

public enum XY {
  public struct Z {
    // …
  }
}

Nested protos can stay the same (no prefix).

There are a few wrinkles with the currently vended values: Google_Protobuf_Any cannot become

public enum GoogleProtobuf {
  public struct Any {
    // …
  }
}

…since Any is a Swift keyword.

Google_Protobuf_Type has similar problems as .Type is a magic property for metatypes. In these cases some alternative names such as AnyMessage and MessageType may be needed.

tbkka commented 6 years ago

We considered this early on but rejected it in favor of the current design:

People can easily define their own typealiases in their Swift code if they don't like the current naming, or can use option swift_prefix in their .proto file to alter the generated type names.

dduan commented 6 years ago

Hi @tbkka

Namespacing in this fashion is not common in other Swift projects.

I think the current naming convention is not used in other Swift projects. Using an outer enum or struct as namespace is very common in Swift projects.

two conflicting definitions of enum GoogleProtobuf

can you elaborate on this? I did use two declarations in the proposal, of course in implementation there would be only one. Extensions can be used.

tbkka commented 6 years ago

A single project will often have many .proto files defining many different messages in the same namespace. Those .proto files can be converted by running protoc several times; each time with a subset of the files. So there's no easy way to ensure that you end up with only one definition with multiple extensions. This gets more complex if you have protos in multiple modules.

kastiglione commented 6 years ago

So there's no easy way to ensure that you end up with only one definition with multiple extensions.

Could protoc isolate the the namespace structures to their own file with nothing else in them, with a stable known name? If so, I'm wondering if each protoc invocation could just generate the file anew, and if it happens to overwrite a previous run, the content will be the same and nothing will break.

thomasvl commented 6 years ago

To easily integrate with most build systems, you only want to generate output files where there names can be derived from the names of the input files (not from the content of the files themselves). So using the package from within the file, while possible, would make hooking into most build systems version difficult.

Even if you always created a single file with all the namespacing enums, you still can run into trouble with different protos being generated at times, but they could have overlap in namespaces. So one invoke could create a namespacing file with Foo and Bar and another might make Foo and Baz. So you're back two multiple source files trying to both create Foo.

tbkka commented 6 years ago

@kastiglione: Your idea basically would involve generating exactly the code we build today and then having a separate file that looks like this:

public enum Google {
    public enum Protobuf {
        typealias BoolValue = Google_Protobuf_BoolValue
        typealias IntValue = Google_Protobuf_IntValue
        ... etc ...
    }
}

This is a good idea, but we cannot reasonably build and maintain it automatically. In particular, it doesn't work when you have protos that reside in the same proto namespaces but are part of separate Swift modules that are developed and compiled separately. This happens quite often in practice. In particular, there are a lot of protos that use the google namespace (Google has released a lot of open-source software, much of it uses protobuf); some of them are in the SwiftProtobuf library but people compile others into their own projects.

thomasvl commented 6 years ago

btw - C++ avoids this with its namespace concept, because it really is just about scoping naming, and doesn't have the side effect of creating a type like using a Swift enum does. I believe there was a relatively recent thread on the Swift forums about doing something like a namespace keyboard, but I don't think it went too far.

dduan commented 6 years ago

There seems to be some failure of commincation on my part. So allow me to elaborate on how I think this could work with an example.

Given the following proto.

package Google.Protobuf;

message Person {
  required string name = 1;
  required int32 id = 2;
}

Two .swift files will be generated. The first one contains declaration of the namespace, note that in this design the levels in namespace are concatanated into a single name (similar to joining them with _). The alternative is to have matching levels of nestedness. I don't have strong opinions on which is better.

public enum GoogleProtobuf {}

The second .swift file contains the definition for this proto, declared inside the namespace from the first file:

extension GoogleProtobuf {
    public struct Person {
        // definition
    }
}

typealias Google_Protobuf_Person = GoogleProtobuf.Person // for backward compatibility

Other protos under the same definition will cause the first file to be generate repeatedly with the identical content.

Now, the generator needs to gain a way to specify where the namespace declaration is repeatedly generated. This way users can ensure this file is canonical within a Swift module . This is the enhancement we needed.

If the same package is declared in different modules, users can distinguish them with module name.

dduan commented 6 years ago

Does the above address your concerns, @tbkka?

tbkka commented 6 years ago

Thank you for clarifying your idea.

The reason we separated namespace components with underscores is to avoid conflicts when users tried to use protos with overlapping namespaces. The following really does happen in practice:

// File 1
namespace Foo.Bar;
message Baz { ... }

// File 2
namespace FooBar;
message Baz { ... }

With your convention, these would have conflicting names. To resolve this, you either need to use nested namespaces or use the same convention we originally developed that used _ to separate namespaces.

thomasvl commented 6 years ago

Even if you have that first file generated into a name passed via an option to the compiler, I don't think it scales out into a full build well. You can have different subsystems using protos that have overlap in their namespaces (one is "foo.bar" and the other is "foo.baz"), because they are in different subsystems, they tend to come up in different parts of the build, so both will end up trying to declare the "enum foo" part, and you'll end up with duplicate types.

To the point before of different Swift modules, and say foo.bar and foo.baz being in different modules, os there isn't a conflict, the reality is the second module likely will come along later as the app evolves. That will mean that any higher level code that happen to need symbols from foo.bar didn't have any module's on the types to qualify it, so the moment the foo.baz code comes in (even via a different module), all the existing code had to be touched to qualify it, so we have effectively broken the code forcing the developer to edit it all because they are trying to use another proto.

Historically we've spent a lot of time trying to come up with a way to use nested types for scoping, but when you factor in that multiple files do reasonably belong in the same package, and they can get added with time, the limitations of enums and how overlap between packages can cause problems, we've never been able to find a way to make this work as a project will evolve with time.

tbkka commented 6 years ago

Based on the discussion so far, I'm still not convinced the additional complexity justifies the benefits.

Benefits

Costs

Are there other benefits I've overlooked?

ryanrhee commented 6 years ago

FYI, https://github.com/apple/swift-protobuf/issues/759 seems to stem at least partly from this design decision - and that's most certainly a bug.

tbkka commented 6 years ago

FYI: Enum case naming is an entirely separate issue.