palantir / conjure

Strongly typed HTTP/JSON APIs for browsers and microservices
https://palantir.github.io/conjure/
Apache License 2.0
419 stars 66 forks source link

Enum case sensitivity #193

Open iamdanfox opened 5 years ago

iamdanfox commented 5 years ago

Motivation

From discussion on https://github.com/palantir/conjure-java/issues/134, it appears that the case sensitivity of Conjure enums is not defined precisely enough in the wire format spec. @markelliot provided context that early adopters may still be relying on the case insensitivity of enums. This needs to be defined precisely so that new languages (e.g. Rust) can be implemented correctly and precise positive and negative test cases can be added to the conjure-verification server.

Given an enum definition:

Direction:
  values:
  - UP
  - DOWN
  - LEFT
  - RIGHT

Should a message of {"direction": "up"} be accepted by a Conjure server as equivalent to {"direction": "UP"}? Also should this lowercase presentation of "up" be preserved when re-serialized?

(Previously discussed https://github.com/palantir/conjure/issues/184)

Option 1: lenient deserialization, round-trip preserves case

The enum values RIGHT, right, rIgHt should all deserialize without error and be equivalent to a canonical enum value of Direction.RIGHT. Re-serializing these values should produce the same input strings: RIGHT, right and rIgHt.

Clearly this option minimizes the chance of servers rejecting an input. However, it does require all Conjure language implementations to store the original casing information for each enum to enable re-serialization, requiring slightly higher memory overhead for each enum instance. This is not currently supported by conjure-java and would be a significant undertaking to add to conjure-typescript (as typescript enums are currently pure strings and there isn't a convenient way of overriding equality in typescript so that a switch statement would match both rIgHt and RIGHT).

Option 2: lenient deserialization, no round-trip

This is conjure-java's behaviour, as of 2.6.2.

    @Test
    public void testEnumCasingDeserializationInvariantToInputCase() throws Exception {
        assertThat(mapper.readValue("\"ONE\"", EnumExample.class)).isEqualTo(EnumExample.ONE);
        assertThat(mapper.readValue("\"one\"", EnumExample.class)).isEqualTo(EnumExample.ONE);
        assertThat(mapper.readValue("\"onE\"", EnumExample.class)).isEqualTo(EnumExample.ONE);
        assertThat(mapper.readValue("\"oNE\"", EnumExample.class)).isEqualTo(EnumExample.ONE);
    }

    @Test
    public void enumSerializationDoesntPreserveCase() throws Exception {
        assertThat(mapper.writeValueAsString(mapper.readValue("\"one\"", EnumExample.class))).isEqualTo("\"ONE\"");
        assertThat(mapper.writeValueAsString(mapper.readValue("\"ONE\"", EnumExample.class))).isEqualTo("\"ONE\"");
        assertThat(mapper.writeValueAsString(mapper.readValue("\"onE\"", EnumExample.class))).isEqualTo("\"ONE\"");
    }

Notably conjure-typescript does not support this lenient deserialization as there isn't a serialization layer yet (https://github.com/palantir/conjure-typescript/issues/48). As such, I think we can be sure that no conjure-typescript clients tolerate receiving lowercase enums from a server, even if they may somehow still be sending them.

Option 3: strict deserialization

A request containing right would be considered invalid, and would be rejected by the server. Clients could be certain that all enum responses would be uppercase. This minimizes the amount of memory necessary to store enums internally and also makes new language implementations simpler.

Implementation for conjure-java: https://github.com/palantir/conjure-java/pull/143

iamdanfox commented 5 years ago

If we go with the strict deserialization (option 3):

sfackler commented 5 years ago

I'd lean towards specifying option 3, personally. However, it seems reasonable to have conjure-java continue deviating from that behavior until we can be confident that its current case-conversion behavior isn't being used in practice in the field.

carterkozak commented 5 years ago

I'm in favor of option 3 as well -- I don't like complicated things.

markelliot commented 5 years ago

I'm also a fan of simple things -- and believe it's simpler in implementation to say enum values are case-insensitive, that implementations will always naturally produce values in uppercase, than it is to special-case handling in all languages that the deserialized values are uppercase, and may fill an UNKNOWN value.

carterkozak commented 5 years ago

Should we differentiate between unknown and unknowable? The value SOME_ENUM_VALUE could potentially match across api versions (unknown) but Some Enum 💯 value is unknowable and we are confident that it is not, nor ever will be part of the enum.

I would prefer not to deviate handling enum and union key interpretation. It would be odd if one were case senseitive and the other not. Do we have any prior art in the conjure spec for case insensitivity?

iamdanfox commented 5 years ago

I'd like to close this out and go with option (3), as implemented https://github.com/palantir/conjure-java/pull/143 - we've been shoring up the spec throughout the open-sourcing process and I think this is just one more under-defined topic to iron out.

I'm confident that this won't cause lots of disruption for users because TypeScript users have never been able to switch on lowercase enum values, so I don't think they've ever been used.

markelliot commented 5 years ago

Will closing this out also involve updating all implementations to explicitly treat received lowercase values with an explicit, differentiated error from unknown but uppercase values?

sfackler commented 5 years ago

Yeah, I think it makes sense to limit the unknown variant handling to strings which could actually be valid but unknown variants. That's just an ASCII SCREAMING_SNAKE_CASE check, right?

sfackler commented 5 years ago

The same goes for unknown union variant names too, I suppose.

markelliot commented 5 years ago

I don’t think union fields have quite the same set of restrictions. As a result, if we error on casing (vs choosing to make enums case-insensitive), I’m also unsure how we’d cleanly coalesce union fields and enums to have the same casing.


From: Steven Fackler notifications@github.com Sent: Monday, January 14, 2019 9:17 PM To: palantir/conjure Cc: Mark Elliot; Mention Subject: Re: [palantir/conjure] Enum case sensitivity (#193)

The same goes for unknown union variant names too, I suppose.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub [github.com]https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_palantir_conjure_issues_193-23issuecomment-2D454164444&d=DwMCaQ&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=Vp81aOxWZvuVgVK_wp-VF3pIYG92B19LcCw6XKeYC0U&m=oYhXl0Ia97r8mYlsy_I5YTXdfg07pkH3B_aZ9cBI3bs&s=SaLjxFY2Gz4XNs8nMEgp2NLTSTLlS6qenSdt-PrPXEY&e=, or mute the thread [github.com]https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AAp3Qz2fPIvwoHHBGVPa9MRiR-2D8uNOcZks5vDPOzgaJpZM4Z5UMK&d=DwMCaQ&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=Vp81aOxWZvuVgVK_wp-VF3pIYG92B19LcCw6XKeYC0U&m=oYhXl0Ia97r8mYlsy_I5YTXdfg07pkH3B_aZ9cBI3bs&s=tYH7oO4rvzZEJd4WJoRlHnNlGEJpszv09NBDHNqo0wQ&e=.

sfackler commented 5 years ago

Oh yeah - I meant that we should make sure that unknown union variant names are eligible to be union variant names. The check would be for snakeCase for those I believe rather than SCREAMING_SNAKE_CASE.

carterkozak commented 5 years ago

lowerCamelCase

sfackler commented 5 years ago

Er yeah, that >_>

carterkozak commented 5 years ago

I’m also unsure how we’d cleanly coalesce union fields and enums to have the same casing.

We're already blocked from supporting this by existing typescript clients which require exact matches on both enum values and union type names.

iamdanfox commented 5 years ago

For context, @markelliot raised concerns in a slack thread that the newly imposed enum strictness imposes an unfair burden on any existing user-facing services we have that are relying on the old lenient behaviour.

https://github.com/palantir/conjure-java/pull/232 is a feature flag which would allow existing services to keep the old behaviour around.