Refactoring the internals of access request

fimbault commented 3 years ago

A few questions/suggestions around section 8 (Resource Access Rights):

type: considering the constraints (no collision), I believe the term namespace would be better suited. Since we explain a URI is prefered in real life scenarii, we might update the examples accordingly
locations singular or plural: what's the rationale for an array of locations? Do we see that as a common scenario (with exactly the same parameters)? Wouldn't a single value fit most cases, as it's always possible to use multiple access tokens (2.1.2) anyways?
actions: seems ok. Probably we should update the definition of right to ability given to a subject to perform a given action on a resource under the control of an RS (instead of operation)
datatypes: seems to me that we should refer to an array of resources (possibly we could then have types and objects inside, but basic strings seem more generic and simpler). Depending on the use case, objects could include identifiers, absolute or relative paths, wildcards and specific semantics.
we're currently missing the possibility to request attributes (cf #233), which would be an array of strings to refer to groups, roles, etc. (again here, strings are enough I think, an implementer could always distinguish using an appropriate scheme, ex r_admin and g_club in the following example if he really wants to). Attributes would be optional.

(I'm not considering here the domain specific extensions, like geolocation, currency, etc. / btw, shouldn't this be in a dedicated subsection and not in every example?)

Example of proposed json for the access request:

{
    "access_token": [
    {
        "label": "fun_1234",
        "access": 
        [
            {
                "namespace": "https://as1.example.dev/media-api/v2/",               
                "location": "https://mediaserver.example.com",  
                "actions": [ "list", "view", "delete" ],
                "attributes": [ "admin", "club" ],
                "resources": [ "photo/*", "video/clip.mp4" ]
            },
            "dolphin"        
        ]
    }, 
    {
        "label": "profit_6789",
        "access": 
        [
            {
                "namespace": "https://as1.example.dev/bank-api/v1/",               
                "location": "https://bankserver.example.com",  
                "actions": [ "check_statement", "withdraw" ],  
                "attributes": [ "accountant", "euro" ],
                "resources": [ "account-42" ]
            },
            "expense"        
        ]

    }
]}

In the second example, we see that attributes could also be useful to the AS to refer to specific parameters (ex: my currency is euro, etc.). So it's more generic than what is currently described in section 8 with the financial-transaction example (which would require a specific understanding of "identifier" (in my example, 'account-42") and "currency" at the base level of the access struct). Of course it would be possible to make a more detailed schema within each field, or thanks to the namespace, but I believe namespace/location/actions/attributes/resources should be the base level, so that we know what to expect (mandatory and optional fields).

We also see the request includes a withdraw, which is probably not a legal action for the accountant to do, so that it would be refused by the AS. As usual, the AS MUST validate the information provided by the client. Maybe the user is not an accountant at all, and so he shouldn't be able to check_statement.

When preparing its response, the AS may also include any ambient condition it knows about (ex: ok you're an accountant, but you can only view amounts below a threshold, or at the end of the month, or made with some specific counterpart, or whatever).

jricher commented 3 years ago

Thanks for raising these issues -- from an editor's standpoint, please note that we are trying to sync discussions about this structure with the RAR extension of OAuth2, including PRs like this: https://github.com/oauthstuff/draft-oauth-rar/pull/66; we don't want the two specs to lose sync. This doesn't mean we can't change them, but we do need to make sure we have that larger conversation if we do want to change.

As an individual: I don't agree with the proposals here in general, but I see where the points coming from. Some thoughts on each bit:

To me, type is a more clear indicator as to what's going on here. namespace implies that it's something that needs to be resolved and understood, like an xmlns directive or a JSONLD context that's meant to be fetched, parsed, and processed by machine code. That's not what's we mean by this field.

I do think we'll have enough cases of multiple locations. We can't assume that an API will be structured in any specific way -- the location could be a root server, it could include a path, it could include subdomain hosts, etc. I'd like to hear more from people with more use cases for this.

I personally struggled with the datatypes label. I think we run the risk of resources running too close to "identifying the resource server" from OAuth2 https://tools.ietf.org/html/rfc8707. I can't think of another label for "the kinds of things that the API has", though. An analogous scope set would be the OIDC scopes of email profile address phone, each of which describe a kind of data being requested at the userinfo endpoint.

The attribute definition feels way too loose to be useful, to me. I don't think we want to encourage API designers to just put things into a dump bucket, and instead we want to encourage domain-specific fields to describe things like this. Even in the examples above -- are these attributes of the user, the requested transaction, the resulting data access, the client software, something else, all of the above? Since it is so easy to create API-specific fields to describe things with actual clear semantics, we should avoid such overly-generic buckets, in my opinion. Also, the proposal in #233 is about "placing things into the access token", which the client instance has no control over since the "contents" of the access token are between the AS and RS. The client has less insight into what the RS needs than the AS does, since the AS needs to communicate it. The access request should instead focus on what the client wants to :do:.

fimbault commented 3 years ago

Thanks for the feedback. RAR alignment aside (which I agree is a topic in itself):

maybe namespace isn't the best (I was using it as in "module namespace" in programming environments, i.e. organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries), but type isn't well suited either ;-) Ex in the RAR document, "type": "https://www.someorg.com/payment_initiation" is not exactly intuitive (that's not a type, that's a process). Also, since it is advised to use a URI anyways to disambiguate, I'm not entirely sure this couldn't be used (in some advanced scenarii) to fetch/parse some context, that could very well be a way to extend the request. Even if that's called type.
if we have good reasons (i.e. a recurring need) to get an array for locations, fine. I don't see many on my side, and there's the additional complexity of checking the authorizations for several endpoints, or at least we should include guidance on how this gets processed. A dumb example of risk is that I include a dev and a prod environment, and I just read the first location[0] (say the dev) to process the request, because "it's same same (but different)". I think better to force to always process access request by access request. The main point being that the array already allows multiple access token requests, so I don't see a real rationale for a locations array. The cost is to potentially duplicate the access request parameters, so if that happens a lot in business cases, there might be something to do (passing by reference might help).
the other label I was thinking about was object (instead of resource or datatypes)
regarding attributes, maybe I've gone too far with the second example (and anyway, a request wouldn't have to include all parameters) but we can't limit the request to rights, there are cases where I just want to say, "hey give me an admin token", and some of the actions might actually derive from that.

jricher commented 3 years ago

To the specific push for attributes: To me, the "give me an admin token" is closer to the definition of actions than anything else, since it's aligned with "what I want to do". Maybe instead of attributes it could be rights or privileges? Basically, each field ought to be pretty clear in its intended target, and they should all be as orthogonal to each other as possible.

fimbault commented 3 years ago

IMO:

right is related to actions/operations ("I have the right to read, write, etc.")
privilege is related to attributes ("I am an admin, I'm part of a privileged group, etc.")

The difference I make between the 2 is that anyone can ask for a right (of course, to be granted or not), but a privilege depends on your identity. So a privilege is kind of a reference in itself (some external system, ex your company, maps your admin privilege).

access token was defined as a data artifact representing a set of rights and/or attributes and attribute was defined as characteristics related to a subject (of course we could change if we need to clarify the target, we could say that an access token represents a set of rights or privileges). In that sense, admin would be fine, but currency wouldn't be an attribute/privilege, to reuse the examples above.

jricher commented 3 years ago

I agree that currency isn't an attribute, and that's what lead me to the conclusion that attribute above was becoming a catch-all bucket. I would rather see someone define "currency": "euro" instead of having them try to dump something like that into a "whatever bin".

fimbault commented 3 years ago

Yes. Ideally I'd like my parser to know what to expect at the base level of the access request, and so that would require a predetermined structure that can be typed checked (of course there can be Optional structs and references).

That generally leaves open the question of how to handle extensions in the data model (such as currency, identifier, etc.). The problem I see in the current example with currency is that I don't know how to understand the financial-transaction structure. Unless it's hard coded in some very specific context, but I don't want to have to compile a different lib for each situation (I mean probably I could, but I don't think I should be forced to do it).

So to avoid the whatever bin, we should strictly keep to the definitions, the content of the access request refers to rights/privileges of the user. And most of the debate will be on extensions.
If that's just me, I would probably indeed use the type/namespace to introspect what is required. But the spec would leave open how that mapping gets done.

jricher commented 3 years ago

This is exactly at the heart of many of the discussions in RAR, including (but not limited to) this one: https://github.com/oauthstuff/draft-oauth-rar/issues/45

Basically we can't predict or decide what a given API's security structure will look like at the security level (GNAP). So if there's a definition of the financial-transaction access type, then an AS will probably need to know how to parse and handle that. Or the AS could be "naive" and leave the final processing up to the RS, which will need to understand it. But note that this applies to any type because an AS would need to understand what a set of actions values mean when used together or apart, for example. Since the semantics of ALL structures in the access object are defined by its type, we can either use domain-specific extensions or core parameters with the same level of understanding.

fimbault commented 3 years ago

+1 for the suggested resolution in RAR, if the AS can't understand `"namespace/type": "https://as1.example.dev/bank-api/v1/" (in my previous example) I would reject the request.

adeinega commented 3 years ago

Just thinking out loud.

What makes me wonder when I see access requests is that why GNAP uses "access_token" for access token requests. My best guess is that OAuth uses "access_token" because there is also a "refreshtoken". GNAP doesn't have the latter one, hence, the "access" prefix can be omitted and get a shorter "token".

Does it really need to specify this prefix?

fimbault commented 3 years ago

I guess it's now become a very standard term, and it also qualifies what the token is used for

adeinega commented 3 years ago

I still think it's really unnecessary to use access_token instead of token in the access request for the same reasons you don't use prefix access_ for other properties in it. As I was saying before, it's completely understandable why oAuth has/uses it. It isn't that important thing for me though.

Denisthemalice commented 3 years ago

Switching the meaning of the words back and forth does not al!ow to progress.

access token is defined as a data artifact representing a set of rights and/or attributes attribute is defined as characteristics related to a subject.

It is the time now to define a data structure able to to request attributes within an access token.

fimbault commented 3 years ago

What's most important is that we are consistent. And so we'd need to keep the notions at the same level (rights <-> actions, privileges <-> attributes). If we keep to our definition of attributes (which refer to the subject), there's no real risk of having a bin holder. Otherwise could be called sub_attr if we want to make it more explicit in the label.

Then here's what I suggest:

the definition of access token should talk about rights and privileges
and the request would include lower level constructs, actions and attributes, that the AS uses as inputs to the access token

Denisthemalice commented 3 years ago

@fimbault : What you suggest is rather different from the current definitions. With your proposal, nobody would know anymore what a privilege is.

access token is a data artifact representing a set of rights and/or attributes. rights and/or attributes are privileges.

Using _"subattr" claims is however a nice proposal, since it would be not possible to have confusion with the "sub" claim
as currently known in OAuth.

fimbault commented 3 years ago

Cf above for the distinction I made between rights and privileges, which you also find in plain english.

Updated proposal: (I indicate where there are some question marks with ?)

{
    "access_token": 
    {
        "label": "fun_1234",
        "access": 
        [
            {
                "namespace/type?": "https://as1.example.dev/media-api/v2/",               
                "location(s?)": "https://mediaserver.example.com",  
                "actions": [ "list", "view", "delete" ],
                "sub_attrs?": [ "admin", "club" ],
                "objects?": [ "photo/*", "video/clip.mp4" ]
            },
            "dolphin"        
        ]
    }
}

jricher commented 3 years ago

I think the above conversation is confusing what is represented by the access token and what is returned to the client instance, and that's dangerous. We aren't just talking about protecting APIs that have information about people, and we need to be very careful that we don't make that assumption.

The access token represents rights of access on a protected resource. There might be other aspects of this that the resource server cares about, such as the identity of the subject the resource represents. But also, there might not be a subject at all, or the subject might be irrelevant to the resource being protected (say it's something not related to the RO).

So to me, the kinds of things being described here aren't attributes of the subject at all. They are a kind of additional dimension to the request, but not attributes.

fimbault commented 3 years ago

Ok, but this stems from your remark that attributes were too broad. I can understand that now it's the opposite, but we have to choose.
Note also that this array would be optional (in case it's irrelevant to the specific request).

To me they really are attributes, as part of the access request (whatever the chosen name would be). How else would we be able to deal with roles/groups in the access token?

jricher commented 3 years ago

If we wanted to define something for "roles" or "groups", we could do that explicitly. Or more properly, an API designer could. The core protocol shouldn't try to pre-define all the different kinds of dimensions that API designers could use, and we don't want to give the impression that API designers need to cram their security models into the existing fields. We WANT designers to define their own fields and objects.

The question then becomes whether the core protocol defines the field or not.

The core protocol should only have things with boundaries that are clearly orthogonal to each other but also re-usable and general-purpose. That's the rubric for action, datatype, location, and identifier to date. You can understand those in a LOT of different kinds of APIs. They aren't universal, but they don't really overlap and they aren't overly specific to a style of API.

Two things come to mind for this:

1) Should we formalize our rubric for including a "general purpose" access field? 2) Given what would go into (1), what are the other fields we should add? Do the ones we have already meet those guidelines?

fimbault commented 3 years ago

I haven't described a general purpose field. The case described is for roles/groups, which are very usual use cases, even much more than actions.

The extension examples I've put in example 2 above were directly derived from examples currently in the spec. I personally find currency a fairly strange attribute (why would I enclose that in an authorization?), but since it was there I tried what that could look like. But that's not something I would see as mainstream, even for openbanking scenarii.

OK with the questions.

agropper commented 3 years ago

This may not be the proper place to mention this, but I'm particularly concerned about the use-case where a Resource Server' API is mandated by regulation or federation. This is prominent in the US healthcare use-case where both the endpoint standards and the publication of the endpoints themselves is regulated in order to avoid "information blocking".

In this context, how does the RS specify the standards and version supported by an endpoint and do they have a way to list exceptions or exclusions for that part of the standard that they do not support. In other words, I'm hoping for "type" to state a standard and either a positive or negative subscope of the standard.

fimbault commented 3 years ago

Here type is under the control of the AS, not the RS. Probably this requirement is better suited for the RS part of the document?

jricher commented 3 years ago

@fimbault what I mean by "general purpose field" is exactly what you're describing here, so I think we're actually on the same page -- it's about a field that's re-usable across many different kinds of APIs. What I want to avoid is GNAP trying to define apriori all these different dimensions of access, and so what I'm trying to get at is the question of whether the kind of thing you're describing here is general-enough to merit definition. At the end of the day, whoever defines type gets to define all the fields and their meanings, and that's fully on purpose. It shouldn't have to make sense to anyone but those calling the API.

@agropper RS discovery by the client, particularly versioning and compliance, is very much out of scope for GNAP. That said, I'd anticipate standard API developers to declare a type value that is a stable URL.

fimbault commented 3 years ago

@jricher yes I agree we should only define fields which correspond to a mainstream use case. I believe that's the case here.

Denisthemalice commented 3 years ago

I took the final messages of this thread. How they relate to the topic of this thread which is "Refactoring the internals of access request" is unclear to me.

@fimbault: I fully agree with you when you write:

The case described is for roles/groups, which are very usual use cases, even much more than actions.

@jricher. I also agree with you when you write:

I think we're actually on the same page -- it's about a field that's re-usable across many different kinds of APIs.

Some people might think that all the re-usable attributes across many different kinds of APIs have already been defined, such as by OIDC. Unfortunately, this is not the case.

Let us give some examples:

The attribute with the type "country of citizenship" may be interesting and the end-user should be able to understand it, so that he can decide whether he is willing to release that attribute type (and value) to a RS.

The same applies to a "date of birth" or to the "country of residence".

If such attribute types are not defined elsewhere, we should define them. But in such a case, an independent document would be better, because other standards could refer to it, without referring to the whole GNAP.

jricher commented 3 years ago

Discussed this with the RAR editors, and here's where that stands:

RAR already exists in deployments, meaning a compelling reason for changing a parameter's name or syntax would be needed
type should be kept as it's descriptive enough for what it does
locations should be kept plural as it being singular would limit its use only to a more strict audience restriction, and the strict use is still possible in the current structure
adding a new field for level of access makes sense, I propose we call this access and propose text for both GNAP and RAR

fimbault commented 3 years ago

Seems fine to me.

Denisthemalice commented 3 years ago

@jricher and @fimbault: You appear now to understand each other which is fine.

The four lines from jricher are a bit obscure for me, but seem to be a way forward.

I would be interested to see some text with some examples that explain how groups or roles can be supported, as well as examples where attributes like "country of citizenship", "date of birth" or "country of residence" can be supported.

fimbault commented 3 years ago

group and role would fit into access (based on strings). suggested attributes don't fit very well into that model, because you'd need both an attribute label (ex: "country of residence") and an attribute value (ex: "france"). If I just give a country or a date, it's impossible to know the intended use. That said, you could create residence_france as a kind of group if you really want to. But Justin's argument is to not create a field where people will put everything in it, which is understandable.

jricher commented 3 years ago

Right, or you could just define group or role within your API's type definition, if access doesn't fit well for what you're doing.

I would be happy to have additional examples with API-specific terms in them. What we have right now was pulled largely from the financial examples in RAR and modified from there. So we could easily do something like:

"citizenship": "fr",
"dob": "1970-01-01"

And so on. This would not define citizenship or dob as general-purpose parameters but it would be a good example of how things can work for API designers looking to apply this.

If you have some cohesive examples to add, please suggest them either here or in a PR.

fimbault commented 3 years ago

Yes, that's always a possibility. Of course the AS and RS will need to agree also on what that specific type means. There are lots of such attributes that APIs could use at one point.

agropper commented 3 years ago

How would this work in cases where the RS API is based on a regulated standard data model like FHIR R4 https://www.hl7.org/fhir/ ?

jricher commented 3 years ago

@fimbault The AS and RS would always need to agree on waht a specific type means though. If they don't agree then they don't connect.

@agropper You would want to have a formal definition of what type value to use, what fields it can have, and what those fields mean. A structured data model like FHIR lends itself really well to that, and so you'd see a profile like HEART had previously defined for OAuth and UMA.

agropper commented 3 years ago

@jricher Is there any way we could avoid the profiling issue?

(I was a co-founder of HEART and it had the US regulator and the chair of UMA as co-chairs. I thought we were all set. But nothing useful happened after almost 5 years.)

My point (from a human rights perspective) is that the (US or EU) regulator interacts with the industry standards group (HL7 in this case) as they both do a regulatory capture dance. GNAP should sit out the dance because GNAP is not building anything specific to healthcare vs. finance, etc... Saying that there will be a profiling group like HEART is, in my experience, a non-starter because it just distracts from the regulator-industry "romance".

jricher commented 3 years ago

@agropper Ideally the type definition would be handled by the designers of the API, HL7 in this case. That way the data model, API, and security model all can be defined in an intertwined way. For that to happen, HL7 needs to want to target GNAP as part of its security solution. At the time we founded HEART, the FHIR community had not yet landed on OAuth as part of the security layer, and suggesting to do so was seen as novel and frightening to many.

I agree that GNAP should stay out of that dance for any specific industry, and the best thing it can do is provide a structure to hang those definitions on.

fimbault commented 3 years ago

Then the best is probably to avoid industry specific examples (unlike what RAR is doing), and just explain how types can be used to plug extensible datamodels.

Denisthemalice commented 3 years ago

@aaronpk: In issue https://github.com/ietf-wg-gnap/gnap-resource-servers/issues/9 you wrote:

Closing as a duplicate of ietf-wg-gnap/gnap-core-protocol#244

However the title of this issue was:

Capabilities only (i.e. actions) are supported. ACLs (i.e. attributes associated with a subject) should also be supported

This goal should not be forgotten.

@jricher answered to @fimbault :

The AS and RS would always need to agree on what a specific type means though. If they don't agree then they don't connect.

I have a problem with the always since there exists cases where such an agreement is not needed.

An RS would be able to tell to a client that if needs one or more attributes types (and optionally values) and/or some capability and if the end-user first accepts to request them to a given AS and if AS is able to provide them, then the access token can be generated and is very likely to be accepted by the RS. No agreement between them would then be necessary.

As @jricher proposed as examples:

"citizenship": "fr", "dob": "1970-01-01"jri

The types ctz and dob and much more could be standardized. They are not business specific. For example, a RS could indicate that in order to accept a method, it needs the type ctz or a type with a value : ctz=250.

fimbault commented 3 years ago

Closing following the PR

ietf-wg-gnap / gnap-core-protocol

Refactoring the internals of access request #244