Extensible enums - Githubissues

blast-hardcheese commented 6 years ago

For interoperation with poorly specced or flexible downstream APIs, clients should be able to represent a branch of an enumeration that isn't known. This permits accepting unknown branches and doing work with them without rejecting the response from client calls (currently, enumeration lookup failures result in a decoder failure for clients).

Currently, generated enumerations look like this:

sealed abstract class Foo(val value: String) { override def toString: String = value.toString }
object Foo {
  object members {
    case object Bar extends Foo("Bar")
  }
  val Bar: Foo = members.Bar
  val values = Vector(Bar)
  def parse(value: String): Option[Foo] = values.find(_.value == value)
  implicit val encodeFoo: Encoder[Foo] = Encoder[String].contramap(_.value)
  implicit val decodeFoo: Decoder[Foo] = Decoder[String].emap(value => parse(value).toRight(s"$value not a member of Foo"))
  implicit val addPathFoo: AddPath[Foo] = AddPath.build(_.value)
  implicit val showFoo: Show[Foo] = Show.build(_.value)
}

A possible alteration to accept unknown members (currently unsure if this should be opt-in or opt-out, definitely needs more thought):

@@ -2,10 +2,12 @@
 object Foo {
   object members {
     case object Bar extends Foo("Bar")
+    class Unknown private[Foo] (value: String) extends Foo(value)
   }
   val Bar: Foo = members.Bar
+  def Unknown(value: String): Foo = new members.Unknown(value)
   val values = Vector(Bar)
-  def parse(value: String): Option[Foo] = values.find(_.value == value)
+  def parse(value: String): Option[Foo] = values.find(_.value == value).orElse(new members.Unknown(value))
   implicit val encodeFoo: Encoder[Foo] = Encoder[String].contramap(_.value)
   implicit val decodeFoo: Decoder[Foo] = Decoder[String].emap(value => parse(value).toRight(s"$value not a member of Foo"))
   implicit val addPathFoo: AddPath[Foo] = AddPath.build(_.value)

Initially I thought the Unknown branch should not expose a constructor, as it would prevent invalid data from being supplied by the clients into downstream services. This has the unfortunate side-effect of not allowing services to round-trip to persistence layers; consider receiving values of an array via client API call, writing those values into a database, then attempting to use those values to make future API calls.

blast-hardcheese commented 6 years ago

Another, simpler strategy until this is resolved could be to shim Either[Foo, String], but there are two problems: 1) $refs can't have vendor extensions, preventing the use of x-scala-type 2) x-scala-type would have to be littered everywhere

kelnos commented 3 years ago

One thought I had with this is that we probably want to allow an unknown type for clients, but not for servers (a server should never return an un-enumerated value). Unfortunately model generation still doesn't know if the generation is for a client or server, and often services might share model generation for the same client and server.

I'm leaning toward this being opt-in, though at what level... If it's a guardrail command-line option, that lets the spec consumer decide what they want to do. But a consumer that is using someone else's API may simply not know that they should be using that option; the spec author is in the best position to know if an enum is likely to be added to in the future. But if this toggle goes into the spec, we still have the client vs. server knowledge/sharing issue at generation time.

Another thing to think about is what if a legitimate enum alue, specified in the spec, is unknown? Then we'd have a name conflict when generating class/member names.

blast-hardcheese commented 3 years ago

Gah, yes, I forgot about the server/client side of this problem. Permitting servers to return unexpected values would be undesirable. I can think of situations (like generated proxies) that would benefit from this feature being present in the server as well, but it's certainly an edge case.

Another thing to think about is what if a legitimate enum alue, specified in the spec, is unknown? Then we'd have a name conflict when generating class/member names.

Maybe whatever feature enables this does so via specifying the name desired to be used? Like ScalaClient(..., extensibleEnumName="Unknown")

kelnos commented 3 years ago

Maybe whatever feature enables this does so via specifying the name desired to be used? Like ScalaClient(..., extensibleEnumName="Unknown")

Yeah, that could work. And we could have a default if it's not provided, and bail if there's a conflict.

sullis commented 3 years ago

fyi, I am aware of prior art in two open source projects:

1) ApiBuilder 2) AWS SDK for Java v2

ApiBuilder uses UNDEFINED as the marker for unknown enumeration values:

https://github.com/apicollective/apibuilder-generator/blob/main/scala-generator/src/main/scala/models/generator/ScalaEnums.scala

Here is example output from the ApiBuilder Scala generator:

https://app.apibuilder.io/flow/delta/0.8.36/scala_models/FlowDeltaV0Models.scala

AWS SDK for Java (version 2) defines a special Java enum value called UNKNOWN_TO_SDK_VERSION

https://github.com/aws/aws-sdk-java-v2/blob/master/codegen/src/main/java/software/amazon/awssdk/codegen/poet/common/AbstractEnumClass.java#L56

blast-hardcheese commented 3 years ago

yeah, that's a good point. In Java, can't we define an enum with a held value? I thought Java supported ADT-style enums, @kelnos ?

kelnos commented 3 years ago

In Java, can't we define an enum with a held value? I thought Java supported ADT-style enums, @kelnos ?

Unfortunately not, but you can kinda simulate them with an abstract class and static final instances for the fixed values, with a regular class for the unknown value. i'd probably do something like this (especially this in order to maintain source compat):

public abstract class MyEnum {
    public static final MyEnum SOME_VALUE = new MyEnum("some-value);
    public static final MyEnum ANOTHER_VALUE = new MyEnum("another-value");

    public static final class Unknown {
        public Unknown(String value) { super(value); }
    }

    private final String value;

    private MyEnum(String value) { this.value = value; }
    public String getValue() { return this.value; }
}

I would also add the methods that come for free from java.lang.Enum, like valueOf(), equals(), etc., but unfortunately you can't use enum, as java enums are not subclassable.

I think implementation-wise we're fine with one caveat: javac special-cases switch statements with enums: I think you can't use qualified names for the case statements, and you also don't need to import the individual values (even though you aren't qualifying them), so that could be a slightly breaking source change (fixable by the consumer adding another import).

To me the only issue is that of handling the feature in guardrail:

Is this feature opt-in or opt-out? (I would vote opt-in)
Is this feature enabled by a vendor extension or a command-line option (or both)?
If it's possible to enable it via vendor extension, how do we ensure this doesn't get enabled for server generation?
In either case, how does this affect people who share the model objects for their server and client?

blast-hardcheese commented 3 years ago

Opt-in, but with as few changes as possible to what happens if the feature is turned on or off, as to not penalize people for changing that.
CLI imo 3/4. I'm really torn here. I think this should be possible for both servers and clients, not just because it would make generation easier, but to maintain parity between the clients and servers. Currently, every feature is supported in clients and servers (with the exception of x-server-raw-response).

What's the downside for opt-in unknown enumeration generation on the server? The op-in-ness should be enough to keep people on the right track unless they open that particular bag of worms, no?

kelnos commented 3 years ago

✅
The big benefit for (also?) allowing enabling via a vendor extension is it allows a spec author to explicitly say "I will probably add other enum values in the future, so you [a client consumer] need to handle this case". A spec consumer may not know about this otherwise, and not know that enabling the CLI flag for this would be a good idea for that particular API.
I guess it's not a big deal if it gets enabled for servers. From the guardrail-is-opinionated perspective, we don't want server implementations to use the unknown type when returning values; they should have all values explicitly listed. If they have runtime-generated values, then they shouldn't be using an enum. But giving the option means that you know someone, somewhere will abuse it. Which we should maybe just accept as a fact of life and ignore.
(This point then becomes irrelevant.)

blast-hardcheese commented 3 years ago

I wonder if it should be enabled by default for clients then? "Hey by the way, you have no choice but to presume the server will change", the downside being if the server accepts enums then the unknown value could be a liability. I thought a private constructor would be useful, but in the case of a legitimate use case requiring a brand new unknown value, the user would have to jump through circe to create one unsafely, so it's undesirable in the end.
Yeah. Escape hatches are the main thing. I'm looking at interop over http as a best effort kinda thing, saying "these are the ones I know about today" is useful. If we don't provide for this, then people will invariably revert to string and value will be lost.

kelnos commented 3 years ago

Actually I did think of a problem for #4 again. If the enum is something that the server also accepts from a client, this would allow clients to put wacky things in there, and then the server would have to explicitly deal with it, rather than the deserializer automatically tossing back a 400.

blast-hardcheese commented 3 years ago

Right, but that's opt-in behavior

guardrail-dev / guardrail

Extensible enums #93