danielgtaylor / python-betterproto

Clean, modern, Python 3.6+ code generator & library for Protobuf 3 and async gRPC
MIT License
1.45k stars 196 forks source link

Naming collision with nested enums #212

Open qria opened 3 years ago

qria commented 3 years ago

Problem

Currently these two enums in the same file will be both compiled into the same name of ContentStatus

message Content {enum Status{ ... }}
enum ContentStatus{ ... }

Currently the behaviour is to silently overwrite one of the enums, therefore making all subsequent call of the enum possibly wrong.

Workaround

I have consulted @nat-n and they say they don't see a way to work around the issue currently, except to directly modify the generated code.

nat-n commented 3 years ago

The ideal solution for this would be a (breaking) change for nested enums to instead compile to nested classes in python instead of relying on naming concatenation.

qria commented 3 years ago

What is the policy on breaking changes of this project?

cetanu commented 9 months ago

I am looking into this, I note it's still a problem and was able to generate an example of it.

cetanu commented 9 months ago

I might need a tl;dr on how the enum ends up getting named and which python module in the generated tree it ends up in.

If I look at how this case is handled in other libraries/languages, I see that they put the nested enum into a module with the name of its parent

So in the tree there would be:

I am currently reading through betterproto/plugin/models.py and betterproto/plugin/parser.py to try and understand how the decision is made to put the enum Status in the same file as the message Content. Cannot say I have reached comprehension as yet.

cetanu commented 9 months ago

I've reached some understanding after looking at the OutputTemplate that is created for this nested enum example.
The avenue of creating another file, while not impossible (and IMO the preferable way to do this), will probably take too much work. The suggestion by @nat-n to make a nested class is probably the way to go, and is how I am going to approach this. Will make an attempt soon.

cetanu commented 9 months ago

This is much harder than I thought...

I've managed to turn this protobuf

syntax = "proto3";

package nested_enum;

message Test {
    enum Inner {
        NONE = 0;
        THIS = 1;
    }
    Inner status = 1;

    message Doubly {
        enum Inner {
            NONE = 0;
            THIS = 1;
        }
        Inner status = 1;
    }
}

message TestInner {
  int32 foo = 1;
}

message TestDoublyInner {
  int32 foo = 1;
  string bar = 2;
}

enum Outer {
    foo = 0;
    bar = 1;
}

message Content {
    message Status {
        string code = 1;
    }
    Status status = 1;
}

message ContentStatus {
    int32 id = 1;
}

into this code

@dataclass(eq=False, repr=False)
class Test(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class TestDoubly(betterproto.Message):
        # It would be nice if I could strip the prefix from these class names
        class TestDoublyInner(betterproto.Enum):
            NONE = 0
            THIS = 1

        # this type hint is wrong; it is referencing the global TestDoublyInner
        status: "TestDoublyInner" = betterproto.enum_field(1)

    class TestInner(betterproto.Enum):
        NONE = 0
        THIS = 1

    status: "TestInner" = betterproto.enum_field(1)

@dataclass(eq=False, repr=False)
class TestInner(betterproto.Message):
    foo: int = betterproto.int32_field(1)

@dataclass(eq=False, repr=False)
class TestDoublyInner(betterproto.Message):
    foo: int = betterproto.int32_field(1)
    bar: str = betterproto.string_field(2)

@dataclass(eq=False, repr=False)
class Content(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class ContentStatus(betterproto.Message):
        code: str = betterproto.string_field(1)

    status: "ContentStatus" = betterproto.message_field(1)

@dataclass(eq=False, repr=False)
class ContentStatus(betterproto.Message):
    id: int = betterproto.int32_field(1)

class Outer(betterproto.Enum):
    foo = 0
    bar = 1

See the inline comments for two of the problems I'm having so far.

cetanu commented 9 months ago

I've parked my progress in a branch. If someone else could give it a glance that would be good, maybe I'm off-track or approaching this the wrong way.

https://github.com/danielgtaylor/python-betterproto/commit/b9ec59f2a3e95afe5163197096adb91ade7ec36b

nat-n commented 9 months ago

Hey @cetanu, nice work, you're close.

In my view, the ideal approach is for the Python module/class structure to mirror that of the proto config. Unless I'm missing something following this principle should have the most elegant and least surprising or problematic result.

So just to be clear I think what we want is for the Test class to look like so:

@dataclass(eq=False, repr=False)
class Test(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class Doubly(betterproto.Message):
        class Inner(betterproto.Enum):
            NONE = 0
            THIS = 1

        status: "Test.Doubly.Inner" = betterproto.enum_field(1)  # Note the use of fully qualified class name here

    class Inner(betterproto.Enum):
        NONE = 0
        THIS = 1

    status: "Test.Inner" = betterproto.enum_field(1)

I guess you agree?

So the problem is how to avoid using concatenated class names? Do you know where the concatenation is taking place? Could it be refactored to represent class identification with an object that retains the structure so it can be formatted in a context appropriate way? (e.g. a tuple subclass, sometimes so you can use take self[0], sometimes you take ".".join(self) ).

I'm afraid making progress in this codebase requires a willingness to refactor :P

I don't have time/motivation to really dig through the code right now, but I'm happy to offer this kind of high level collaboration.

cetanu commented 9 months ago

I agree completely, I just haven't quite worked out:

  1. where the names are concatenated
  2. where the type hints are also formed However what you've put as an example is the ultimate goal.