microsoft / thrifty

Thrift for Android that saves you methods
Apache License 2.0
544 stars 102 forks source link

Throw informative error for non-null but unrecognized enum values #84

Closed benjamin-bader closed 7 years ago

benjamin-bader commented 7 years ago

For example, say we have:

enum Foo { FOO = 1, BAR = 2 }
struct Baz {
  1: required Foo foo = Foo.FOO;
}

Let's say a client has this definition, but the server adds a new enum field QUX. Right now, clients receiving a Baz with the new enum value will crash. This is because the generated Foo.findByValue method returns null, and the adapter will attempt to set the field with that null value.

It seems wrong to attempt to overwrite a default value with null. Conversely, it also seems wrong to silently ignore incomprehensible data.

What to do?

benjamin-bader commented 7 years ago

It now seems to me that it is best to throw in this case, but earlier, when we fail to parse a recognized enum value - the real problem is that, from the client's perspective, the protocol is being violated.

wyland commented 5 years ago
 the real problem is that, from the client's perspective, the protocol is being violated.

This is not representational of maintaining backwards compatibility for additions to thrift interfaces. A client should be able to gracefully handle an addition to the interface when it is marked as an optional. Clients should not be forced into a mandatory upgrade just because an additional enum or union value was added.

Other thrift lang generators handle this scenario by adding an unknown case to generated enums/unions with the associated value that was parsed. This is the best way to maintain backwards compatibility without throwing an exception that prevents parsing an entire thrift transaction.

benjamin-bader commented 5 years ago

That's the trouble with Java enums - by definition they are a closed set, and members that are not part of that set just cannot be represented by that type.

I see your point, and agree that this design isn't forward-compatible. That said, I'm not sure I agree that it should be. The approach of forcing an extra enum member feels superbly hacky, and at odds with the common (but not universal) convention of defining an "unknown" member by hand. We could try to detect "unknown", but I don't believe we could do so adequately.

protobuf has the same issue, FWIW, except that since all their fields are optional (as of proto3) it isn't quite the dealbreaker, transactionally speaking, that required enums are in thrift.

Lastly, I'm curious - is this a use-case that impacts you? Are you able to avoid the issue by making the field itself optional?

wyland commented 5 years ago

Thanks for the followup

Actually we are seeing this exception getting thrown on an optional enum field in a struct which is stopping the entire deserialization process - it appears this is regardless of a field being marked required versus optional.

While I still think it's important to treat null as a null and not unknown which has a different meaning, I would also agree with your thoughts on set theory and this feeling hacky. A random idea would use a wrapper to handle this with two possible cases. This could be used when a field is marked required or optional, a null would still be a null in the optional case:

In Thrift this could be supported in any generated lang with something like this, but would require some overloading of the read protocol to leverage it properly:

union Wrapper {
  1: Enum some,
  2: Int32 unknown
}

Or for just the Java/Kotlin generator, this could automatically be implemented with something like:

enum WrappedEnum {
  SOME, UNKNOWN;
}

abstract class Wrapper {
    abstract public WrappedEnum getType();

    public static final Wrapper newValue(Enum enum) {
        return new WrappedValue(enum);
    }
    public static final Wrapper newUnknown(Int value) {
        return new WrappedUnknown(value);
    }
}

class WrappedValue extends Wrapper {
    private final Enum value;

    public WrappedValue(Enum value) {
        this.value = value;
    }
    public Enum getValue() {
        return value;
    }
    @Override
    public WrappedEnum getType() {
        return WrappedEnum.SOME;
    }
}

class WrappedUnknown extends Wrapper {
    private final Int value;

    public WrappedUnknown(Int value) {
        this.value = value;
    }
    public Int getValue() {
        return value;
    }
    @Override
    public WrappedEnum getType() {
        return WrappedEnum.UNKNOWN;
    }
}

In other generator languages that support associated values in their enums, this wouldn't require so much boilerplate code to reproduce the "only 1" requirement of a C++ union.

Thoughts?

benjamin-bader commented 5 years ago

Right, my mistake - of course you'd see that on optional fields because we're throwing as soon as we encounter an unknown value. I forgot about that!

I'm not yet willing to change the current behavior by default - I know that I added this to work around a bug in Outlook, and changing this would probably introduce regressions. How about this - we could, if a suitable commandline option were given, fall back to the field's default value (if specified), or null (for optional fields). This wouldn't help required fields with no default, required enum fields are begging for trouble anyways.

Unrelated, but I was pretty surprised by your claim that thrift generators insert "unknown" enum members. I didn't remember seeing that when studying the apache compiler codebase, so I just looked again. At least with Java, that doesn't seem to be true. What generators have this behavior?

wyland commented 5 years ago

As of latest dev branch, there are command line options for this on Swift using safe_enums, and reading the help documentation I believe it could also be supported in a wrapper class on c++, c#, and netcore, although I haven't actually tested these generators. It sounds like that is somewhat inline with the informal convention you were describing earlier.

I think your suggestion is the best bet to not introduce issues for everyone, gracefully handling the optional enum and union using an CL option.

benjamin-bader commented 5 years ago

I didn't make this clear earlier, and I apologize for that, but I can't promise you any timely action on this. My free time is virtually nil these days, and other work takes priority. If you're feeling adventurous, I'd welcome a PR. Otherwise, thanks for your patience.