ietf-wg-gnap / gnap-core-protocol

143 stars 26 forks source link

Polymorphism #6

Closed yaronf closed 3 years ago

yaronf commented 3 years ago

The protocol currently uses polymorphism extensively, and there is an ongoing discussion about the costs/benefits, especially as they apply to strongly typed languages (Go is a particularly tough case).

The cost/benefit calculation may be different for client-side vs. AS-side code, where presumably client-side code should be easier to develop.

Mailing list thread: https://mailarchive.ietf.org/arch/msg/txauth/zWcr_GxGZ3etGPQC85dgqXq7Li8/

fimbault commented 3 years ago

I've started to look at a few other projects to check how they handle this kind of issue.

Lessons:

Difference with what we're doing: we'll need to make sure it is really simple for the client side libraries, while the server side implementations might optionally want to check against predefined tests cases.

davidgtonge commented 3 years ago

I think that polymorphism is useful for this protocol but should be used only when required.

Branching based on whether a value is a string or an object is one thing. Branching based on whether a value is a boolean or an integer is more problematic (especially in JavaScript where a lot of code will check if something is "truthy" rather than be the boolean true.

fimbault commented 3 years ago

Branching in the protocol is mostly made based on string or object values. But there are indeed a few places where flags may be used, for instance (on top of my head):

We should try to limit those multi variants to get a simpler request/response scheme, but the downside is that it also limits the flexibility around those corner cases.

jricher commented 3 years ago

I agree that the branches and behavior should always be clear based on the type. Designing around idiomatic truthiness can be problematic, as it's far from consistent across different languages -- empty objects, empty strings, and empty arrays all come to mind as things that aren't always handled how you'd expect. I think a full survey of types variety used in the protocol today would be helpful, and the editors could facilitate that by marking all fields with possible types in the descriptions.

fimbault commented 3 years ago

Update following IETF109. The following questions were asked (beyond our current evaluation on how to best implement polyphormism).

What would we do instead? (my comment: no obvious alternative) – Mutual exclusivity of fields (my comment: not easy to implement) – Type fields for different objects (my comment: not easy to manage) – Container objects for single values (my comment: yes possible)

What’s the complexity cost for the protocol? – Who pays the complexity?

jricher commented 3 years ago

PR #124 enumerates which types are in use for each field in the protocol structure.

aaronpk commented 3 years ago

That PR #124 definitely helps clear things up.

I think we could use some guidance on the style of polymorphism used in this spec however. There's already an appendix on polymorphism, but I found myself wanting a style guide to clarify things like:

fimbault commented 3 years ago

references

There are very distinct cases that relate to references, and which highlight the polymorphic issues that we have to deal with.

  1. user and client reference We can pass by reference instead of passing by value:
    • the client request in 2.3 can either be via an instance_id or a direct string
    • the user request in 2.4 can either be via a sub_id (opaque) or a direct string (then in section 3.5, the response can include a "user_handle" and a "instance_id")

Is it worth putting optionality in the requests? It depends what we want:

  1. key reference Section 7.1.1 (key) : "The means of dereferencing this value are out of scope for this specification" means it's based on some out of band mechanism. Again that's a string vs struct case.

  2. interact ref Here it's used for the continuation flow, so it's just a parameter of the continue method.

  3. resource ref Section 8.1 (resource) : in order to use predefined scopes, but the data here can be unstructured (ex : dolphin metadata).

  4. grant ref Section 2.7 (existing_grant) : here it's just a parameter meant as an immutable reference.

implementation

There are many different ways to do that, more or less generically.

tests

Reference: { user: Subject { id: "123456789", format: None } } Value: { user: Subject { id: "123456789", format: Some("opaque") } }

Simplified code to give the intuition (in practice the deserialize is more complex, you have an array of enums in between):

fn main() {

    let user_string = r#"
        { "user": "123456789"}
    "#;
    let user: User = serde_json::from_str(user_string).unwrap();
    println!("{:?}", user);

    let user_struct = r#"
    {
        "user": {
            "format": "opaque",
            "id": "123456789"
        }  
    }

    "#;
    let user: User = serde_json::from_str(user_struct).unwrap();
    println!("{:?}", user);
}

#[derive(Debug, Deserialize)]
struct User {
    #[serde(deserialize_with = "visitor")]
    user: Subject,
}

#[derive(Debug, Deserialize)]
struct Subject {
    id: String,
    format: Option<String>
}
adeinega commented 3 years ago

@fimbault, regarding

in go: for the most complex cases, one can use map[string]interface{} but: No type checking. If the client sends a int instead of a string for the key, then unmarshalling into a map won’t catch the error. Maps are vague and verbose. After unmarshalling the data, we are forced to use runtime checks to make sure the data we care about exists. If those checks aren’t thorough, it can lead to a nil pointer dereference panic being thrown.

You may use your own (custom) unmarshal function to do all required checks for the type of a value (string vs interface like in your user_string and user_struct). Furthermore, take care of the rest of things like https://github.com/golang/go/issues/14750.

jricher commented 3 years ago

My own Java implementation uses custom serializer/deserializer classes with a wrapper interface for any field that can take different forms. The two main categories we have are "string-or-object" and "array-or-object", and both of those are pretty easy to detect and handle.

https://github.com/bspk/oauth.xyz-java/tree/master/lib/src/main/java/io/bspk/oauth/xyz/json

fimbault commented 3 years ago

In short yes, that solves the issue.

yaronf commented 3 years ago

Furthermore, take care of the rest of things like golang/go#14750.

Thank you @adeinega, I wasn't aware of this Go issue. It's really bad, and quite surprising to see in the Go library.

fimbault commented 3 years ago

@jricher actually there's a bit more to do to match against the various formats in sub_ids (within the SubjectRequest), instead of a basic string (it would be easy if they all had the same content, but that's not the case).

That gives me something like this in my rust implementation:

// here we use a visitor pattern to map the string as an opaque struct
let byref = r#"
        { "user": "u123456789"}
    "#;

Output: { user: Subject { sub_ids: [Opaque(Format { id: "u123456789" })] } }

// here we handle the array, the content depends on the format field
let byval = r#"
    {
        "user": {
            "sub_ids": [ 
                {
                    "format": "opaque",
                    "id": "u123456789"
                },
                {
                    "format": "email",
                    "email": "u123456789@domain.com"
                }
            ]
        }
    }
    "#;

Output: { user: Subject { sub_ids: [Opaque(Format { id: "u123456789" }), Email(Format { email: "u123456789@domain.com" })] } }

fimbault commented 3 years ago

We're now explicit on types (object, string, array) so I think we can close this issue now