More intelligent enums - Githubissues

GoogleCodeExporter commented 9 years ago

(this is a RFE, not a defect)

Currently if you define:

  enum Foo {
        NONE=0;
        FOO1=1;
     }
  enum Bar {
        NONE=0;
        BAR1=1;
     }

it won't compile because the same enum value, NONE, is used in two different 
enums within the same proto file.

This is by design: Protobuf recognizes that the above construct will not work 
in C++ and therefore rejects it (no matter the output language). See Issue #12. 
The reason for the name collision is due to the somewhat odd implementation of 
enums in C++.

For other languages, like Java, there is no name conflict in the generated code 
so it is unfair to punish all other languages for the shortcomings of C++.

The existing restriction is extremely annoying as there aren't really any good 
alternatives. Each alternative has its own problems. (alternatives would be :  
use one proto file per enum, embed enums inside a message, etc)

Let's be constructive: The big question is how to improve on this without 
breaking the millions of lines of C++ code that depend on the existing behavior?

This group discussion has the solution: 
https://groups.google.com/forum/#!topic/protobuf/d-AqClgnDKM (alopecoid's 
suggestion is the one I like, it is clean, clear and doesn't break any existing 
code)

I'll just re-iterate alopecoid's suggestion here.
The basic suggestion is to add an option to enums:

  enum Foo {
        option use_namespace = true;
        NONE=0;
        FOO1=1;
     }
  enum Bar {
        option use_namespace = true;
        NONE=0;
        BAR1=1;
     }

This option would default to 'false', meaning the exact behavior of protobuf 
compiler today. However if the option is true the compiler will no longer fault 
on the above construct (for any output language). For C++ it will then generate 
C++ code like this:

  namespace Foo {
    enum Enum {
      NONE = 0;
      FOO1 = 1;
    }
  }

  namespace Bar {
    enum Enum {
      NONE = 0;
      BAR1 = 1;
    }
  }

and the enums would then be referenced in C++ code like Foo::NONE and Bar::NONE.

For other languages that also use unscoped enums (like C++) the generated code 
for such language will have to find their own solution to the problem. (are 
there any such languages at all?)

For languages that use scoped enums (most languages that I can think of, 
example: Java) the option would have no effect. It is the same code being 
generated in both cases.

I like the suggestion because it doesn't break any existing code.

Original issue reported on code.google.com by peterhan...@yahoo.com on 27 May 2013 at 8:07

GoogleCodeExporter commented 9 years ago

I would say the option name is a little language specific, but the idea is 
good.  Maybe call it "scoped=true" or something similar and add a command-line 
protoc flag to make it the default.  I would also like to see the option to 
generate C++11 enum classes too, so the namespace feature would just be used 
for backwards compatibility for older C++ compilers.

Original comment by joew...@live.com on 30 Jan 2014 at 8:36

GoogleCodeExporter commented 9 years ago

The proposed solution doesn't work because:
1. The name of the enum type is changed from "Foo" to "Foo::Enum" which will be 
confusing to the users.
2. For enums nested in a message you simply can't nest a namespace in a class.

The current scoping semantics is a design choice made a long time ago and the 
value of allowing duplicated enum value names is also very limited. It doesn't 
seem to be a worthwhile effort to me.

Original comment by xiaof...@google.com on 30 Jan 2014 at 10:25

GoogleCodeExporter commented 9 years ago

Instead of `namespace Bar { enum ...`, we could use `enum class Bar ...', or 
perhaps `struct Bar { enum ...`.

For myself, I don't care much about duplicated enum value names, but I would 
certainly like to use scoped enums for style and clarity reasons.

I suppose the reasons for adding something like this to protobuf would be 
similar to the reasons for why `enum class` was recently added to C++.

Original comment by kara...@gmail.com on 30 Jan 2014 at 11:26

GoogleCodeExporter commented 9 years ago

I think the support for C++11 enum class is a good feature to have. We'll need 
to wait for some time though, as we currently still prohibit C++11 features in 
protobuf.

Original comment by xiaof...@google.com on 31 Jan 2014 at 1:22

GoogleCodeExporter commented 9 years ago

Why not put it in now with a flag or an #ifdef on the C++ version?

Original comment by cbsm...@gmail.com on 31 Jan 2014 at 5:25

GoogleCodeExporter commented 9 years ago

Disallowing C++11 features is more of a policy issue rather than a technical 
one. The policy will eventually change but so far I haven't seen any move in 
that direction.

Original comment by xiaof...@google.com on 31 Jan 2014 at 7:15

GoogleCodeExporter commented 9 years ago

>> The proposed solution doesn't work because:
>> 1. The name of the enum type is changed from "Foo" to "Foo::Enum" which will 
be confusing to the users.

The proposal text says the new option is  .. well, optional. So if you don't 
specify it protobuf compiler will (regardless of target language) generate 
exactly same code as today. How can that be confusing ? .. except for those who 
explicitly put in the new proposed option in the .proto file .. but then those 
people have made a conscious choice to use the new proposed feature.

>> 2. For enums nested in a message you simply can't nest a namespace in a 
class.

Not sure I understand. 

In any case, if you really believe so,  we could say in the documentation that 
the proposed new option has no effect for C++ and make sure that's what is 
happening. This would still bring great benefits to anyone else (most languages 
has the notion of namespace for enums) just not to C++ users.

There are already options in .proto files that only benefit some languages but 
not others so going down that route is not something new. It all depends if the 
design idea of protobuf is to be bound by the lowest common denominator, in 
this case C++ ? I certainly understand and appreciate the desire to keep 
protobuf design lean and mean so I would sympathize (somewhat reluctantly :-)) 
with such a conscious choice.

>> the value of allowing duplicated enum value names is also very limited. It 
doesn't seem to be a worthwhile effort to me.

What?  Perhaps we're just different but we run into these issues all the time 
in particular with enum values such as "YES", "NO", "NONE, "UNDEFINED", words 
that are likely to be used in different enums. Yes, there are ways around it .. 
but they just make protobuf code harder to read and make your setup more 
convoluted than what it needs to. And I love protobuf for the exact opposite 
reasons.

Cheers.

Original comment by peterhan...@yahoo.com on 31 Jan 2014 at 12:36

GoogleCodeExporter commented 9 years ago

Isn't it common practice to add UNKNOWN to be enum's 0th field to allow other 
side to pick it up as default value when there is a enum value that isn't their 
side but added in serialized message? This will instantly become issue when you 
add more than one UNKNOWN field in the enum that sibling each other.

Wrapping enum with another message looks ugly to me, and only simple way for 
this so far would be adding prefix for it, like 

enum CURRENCY {
  UNKNOWN = 0;
  NONE = 1;
  USD = 2;
  // etc...
}

. Is there any other simple way that recently introduced in Protobuf?

Original comment by wonhee.jung on 18 Nov 2014 at 1:09

GoogleCodeExporter commented 9 years ago

>> Wrapping enum with another message looks ugly to me, and only simple way for 
this so far would be adding prefix for it, like 

>> enum CURRENCY {
>>   UNKNOWN = 0;
>>   NONE = 1;
>>   USD = 2;
>>  // etc...
>>}

Don't you mean that a workaround would be to add prefix like this:

enum CURRENCY {
  CURRENCY_UNKNOWN = 0;
  CURRENCY_NONE = 1;
  CURRENCY_USD = 2;
  // etc...
}

?

Original comment by peterhan...@yahoo.com on 18 Nov 2014 at 8:14

kidok / protobuf

More intelligent enums #515