eliashassing154 commented 4 years ago

Marketplace (MP) sends connection request to the Student Information System (SIS). After acceptance from SIS administrator, a SIS will share: students, courses, subjects
MP sends connection request to the Learning Application (LA). After acceptance from LA administrator, an LA will share: catalogue
LA sends connection request to the MP. After acceptance from MP administrator, an MP will share: entitlements
LA sends connection request to the SIS. After acceptance from SIS administrator, a SIS will share: students, courses, subjects
Learning Management System (LMS) sends connection request to the MP. After acceptance from MP administrator, an MP will share: entitlements
LMS sends connection request to the Learning Application (LA). After acceptance from LA administrator, an LA will share: catalogue
LMS sends connection request to the SIS. After acceptance from SIS administrator, a SIS will share: students, courses, subjects
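
The list above can be read as a small lookup table. A minimal sketch, assuming the abbreviations MP, LA, SIS and LMS as keys; the names `SHARED_DATA`, `CONNECTIONS` and `data_available_to` are made up for illustration and are not part of any agreed specification:

```python
# What each provider shares once its administrator accepts a connection
# request (taken from the list above).
SHARED_DATA = {
    "SIS": ["students", "courses", "subjects"],
    "LA": ["catalogue"],
    "MP": ["entitlements"],
}

# (requesting party, providing party) pairs, in the order listed above.
CONNECTIONS = [
    ("MP", "SIS"),
    ("MP", "LA"),
    ("LA", "MP"),
    ("LA", "SIS"),
    ("LMS", "MP"),
    ("LMS", "LA"),
    ("LMS", "SIS"),
]


def data_available_to(requester: str) -> dict[str, list[str]]:
    """Return what each provider shares with `requester` after acceptance."""
    return {
        provider: SHARED_DATA[provider]
        for client, provider in CONNECTIONS
        if client == requester
    }
```

For example, `data_available_to("LA")` lists the entitlements from the MP and the students, courses and subjects from the SIS.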

niesink commented 4 years ago

How will the connection request sending party know where to send the request to? Will there be a centralized index/repository of (MP, LA, SIS) parties and their URLs?

Perhaps this is not necessary for the pilot phase, but it seems like there should be something in place when we enter into the production phase with this ecosystem. I fear we'll collectively resort (back) to emailing spreadsheets otherwise.

eliashassing154 commented 4 years ago

I imagine every party will want to host an SDK (developer.publisher.nl) or something. We also discussed having a central directory so you know what parties are involved. Not sure if we made any decisions on that atm. @MarcelUntied ?

MarcelUntied commented 4 years ago

@niesink @eliashassing154 this is the reason why we are researching the applicability of OSR. It is possible that (parts of) OSR can be used as the 'central phonebook' for the services provided by the participating partners (MP, LA, SIS)

eliashassing154 commented 4 years ago

so that is covered in issue 11

niesink commented 4 years ago

@MarcelUntied ok, although I think this would require a quite significant addition to the functionality of OSR as its main components (mandates and endpoints) currently only exist within the context of a specific school. But that's a discussion / research probably best held within the context of #11 👍

niesink commented 3 years ago

Some brief notes of a meeting Clifton and I had:

We want to use OAuth where possible
The exchanges involving catalogues (i.e. where the LA is the server) are different from the other scenarios as they do not happen within the context of a specific school. Therefore the setup process is simpler as it only consists of an exchange of credentials/keys/tokens between the LA and MP or LMS.
For the setup involving schools there is an extra step (in addition to the client and server establishing 'trust') where the school authorizes the server to provide their data to the client. We want to try and use the OAuth Authorization Code flow for this setup.
One of the challenges here is that we want the client to include a specific school/digideliveryid in their authorization request. Data minimization and a reduced risk of data leaks are the main reasons for this. Clifton will do some research on possible ways to implement this while staying within the OAuth spec.

The below diagram is an initial sketch of how this might work. It's important to note that the process/flow could also skip the first screen, meaning the user would start at the client/Noordhoff instead of the server/Somtoday. We've left the authentication/login/SSO aspect out of scope for this issue.

niesink commented 3 years ago

After some further internal discussions our proposal would be to use the Authorization Code flow for the setup, followed by the Client Credentials flow for actual use of the API. The reason for using the Client Credentials flow here is that this allows us to be independent of the user account that went through the setup process. This prevents issues when that specific user account is deactivated, e.g. when that person no longer works at the school.

digiDeliveryId in Authorization Request

It might be possible to use the state parameter to specify which school/digiDeliveryId the client would like to have access to. It seems like a good idea to still include a random string in the state parameter as well, so it could for instance be formatted as [digiDeliveryId]::[random string].
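
A minimal sketch of that state format, assuming the digiDeliveryId itself never contains the `::` separator. The helper names `make_state` and `split_state` are hypothetical, and the 16 random bytes are an arbitrary choice:

```python
import secrets

SEPARATOR = "::"


def make_state(digi_delivery_id: str) -> str:
    """Build a state value of the form <digiDeliveryId>::<random string>."""
    if not digi_delivery_id or SEPARATOR in digi_delivery_id:
        raise ValueError("digiDeliveryId must be non-empty and must not contain '::'")
    # token_urlsafe only emits letters, digits, '-' and '_', so the random
    # part can never contain the separator.
    return f"{digi_delivery_id}{SEPARATOR}{secrets.token_urlsafe(16)}"


def split_state(state: str) -> tuple[str, str]:
    """Split a state value back into (digiDeliveryId, random string)."""
    digi_delivery_id, separator, random_part = state.partition(SEPARATOR)
    if not separator or not digi_delivery_id or not random_part:
        raise ValueError("state is not of the form <digiDeliveryId>::<random string>")
    return digi_delivery_id, random_part
```

The random part keeps the state unguessable, which is what the parameter is normally used for, while the prefix carries the school.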

digiDeliveryId when requesting an access token

When requesting an access_token using the Client Credentials flow it's important that the client specifies the school/digideliveryid. We propose using the audience parameter for this (as per this draft and somewhat similar to how it is used here). So in addition to grant_type, client_id and client_secret, the audience parameter would also be mandatory in the body of the POST request made to the /token endpoint.
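
As a rough illustration, such a request could be put together with Python's standard library as below. The function name `build_token_request` is made up, and sending the body form-encoded is the usual OAuth convention rather than something decided in this thread. The request is only constructed here, not sent:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(
    token_url: str, client_id: str, client_secret: str, digi_delivery_id: str
) -> Request:
    """Build (but do not send) a Client Credentials token request."""
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # The school the token should be scoped to.
            "audience": digi_delivery_id,
        }
    ).encode("ascii")
    return Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```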

It might be necessary to also support exchanging the authorization_code for an access token, depending on the OAuth implementation/library used, but we're not sure on this. In any case the acquired access token should probably not be used as it's connected to the user account which might be deleted.

niesink commented 3 years ago

What follows is the proposed design and we're of course open to questions and suggestions.

We can distinguish two general types of data exchange:

School specific

This is data belonging to a school that is transferred between two parties/applications, a client and a server. An example would be a learning application requesting student data from a SIS. This exchange involves:

trust is established / credentials are exchanged between the client and server
the school authorizes the server to provide their data to the client
the client requests data from the server

Not school specific

This is data that is not specific to a certain school. An example would be a marketplace requesting the catalogue from a learning application. This exchange involves:

trust is established / credentials are exchanged between the client and server
the client requests data from the server

The client is required to specify the school (i.e. digiDeliveryId) at all times, to support data minimization.

1. Setup of trust / credentials between parties

The data exchange will be OAuth based and mainly use the Client Credentials flow. Therefore the client will have to provide the server with a callback uri and in return will receive a client_id, a client_secret and the urls that should be used for authorization, requesting access tokens and the api itself.

2. School authorization

For this step OAuth's Authorization Code flow is used. The client will initiate an authorization request by redirecting the user to a url like this:

https://connect.somtoday.nl/authorize
?client_id=123456
&response_type=code
&state=digideliveryid16%3A%3Arandom_string
&redirect_uri=https%3A%2F%2Fnoordhoff.nl%2Fauth
&scope=students

Important to note here is the state parameter, which contains the digideliveryid of the school involved, followed by two colons and a random string. The server is required to extract the digideliveryid and only allow the user to authorize data exchange for this specific school. This flow will have to be repeated for each school/digideliveryid.
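
Both sides of this step can be sketched with the standard library. This is only an illustration under the assumptions of the example url above; `build_authorization_url` and `school_from_authorization_request` are hypothetical names, and a real client would generate the random part instead of passing it in:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(
    authorize_url: str,
    client_id: str,
    digi_delivery_id: str,
    random_string: str,
    redirect_uri: str,
    scope: str,
) -> str:
    """Client side: build the url the user is redirected to."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "state": f"{digi_delivery_id}::{random_string}",
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    # urlencode percent-encodes ':' and '/' in the values.
    return f"{authorize_url}?{urlencode(params)}"


def school_from_authorization_request(url: str) -> str:
    """Server side: extract the digideliveryid the request is limited to."""
    state = parse_qs(urlsplit(url).query).get("state", [""])[0]
    digi_delivery_id, separator, _random_string = state.partition("::")
    if not separator or not digi_delivery_id:
        raise ValueError("state must start with a digideliveryid followed by '::'")
    return digi_delivery_id
```

The server would then only offer the returned school in its grant/deny screen, even if the logged-in user has access to more than one.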

The server provides the user with a UI to grant or deny the authorization. If the authorization is granted the user is redirected to a url such as this:

https://noordhoff.nl/auth
?code=g0ZGZmNjVmOWI
&state=digideliveryid16%3A%3Arandom_string

As per OAuth spec the server is required to return the state parameter identical to how it was sent by the client, in this case including the digideliveryid.
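
On the client side, handling this redirect mostly comes down to checking the returned state against the one that was sent. A minimal sketch, assuming the client stored the state it sent (for example in the user's session); `confirm_authorization` is a hypothetical name, and the `error` query parameter is the standard OAuth way of reporting a denied request:

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def confirm_authorization(callback_url: str, sent_state: str) -> str:
    """Validate the redirect and return the digideliveryid that was authorized."""
    query = parse_qs(urlsplit(callback_url).query)
    returned_state = query.get("state", [""])[0]
    # Constant-time comparison of the returned state with the stored one.
    if not hmac.compare_digest(returned_state.encode(), sent_state.encode()):
        raise ValueError("state does not match the authorization request")
    if "code" not in query:
        reason = query.get("error", ["authorization was not granted"])[0]
        raise PermissionError(reason)
    digi_delivery_id, _separator, _random_string = sent_state.partition("::")
    return digi_delivery_id
```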

The authorization step is now complete. Note how there is no option/requirement for the returned authorization code to be exchanged for an access token, like it usually would be.

3. Acquiring an access token and use of the API

To request actual data the client needs an access token. This can be acquired using the Client Credentials flow. The reason for combining the Authorization Code flow with the Client Credentials flow is that the Authorization Code flow allows us to request authorization from a user in a standardized way, but we don't want the authorization/access to remain tied to that user, as their account might get deleted when they move on to a different job. The client will send a POST request like this:

POST https://api.somtoday.nl/token
grant_type=client_credentials
client_id=123456
client_secret=987654
audience=digideliveryid16

The audience parameter is required and contains the school/digideliveryid. The server will check the authorization provided in the previous step and if everything seems in order will hand out an access token tied/scoped to the provided digideliveryid.

A response to this request might look like this:

{
  "access_token":"MTQ0NjJkZmQ5OTM2NDE1ZTZjNGZmZjI3",
  "token_type":"bearer",
  "expires_in":3600,
  "scope":"students"
}
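
The server-side check described above could look roughly like the sketch below. Everything in it is an assumption for illustration: the in-memory stores `CLIENT_SECRETS` and `SCHOOL_AUTHORIZATIONS`, the function name `issue_token`, the exceptions used for errors and the opaque random token. A real server would use its OAuth library and persistent storage:

```python
import hmac
import secrets

# Hypothetical stores: client credentials handed out during setup, and the
# scopes each school granted to a client in the authorization step.
CLIENT_SECRETS = {"123456": "987654"}
SCHOOL_AUTHORIZATIONS = {("123456", "digideliveryid16"): {"students"}}


def issue_token(form: dict[str, str]) -> dict:
    """Handle a Client Credentials request whose audience is a digideliveryid."""
    if form.get("grant_type") != "client_credentials":
        raise ValueError("unsupported_grant_type")
    client_id = form.get("client_id", "")
    expected_secret = CLIENT_SECRETS.get(client_id)
    supplied_secret = form.get("client_secret", "")
    if expected_secret is None or not hmac.compare_digest(
        expected_secret.encode(), supplied_secret.encode()
    ):
        raise PermissionError("invalid_client")
    audience = form.get("audience")
    if not audience:
        raise ValueError("invalid_request: audience is required")
    # Only hand out a token if this school authorized this client earlier.
    scopes = SCHOOL_AUTHORIZATIONS.get((client_id, audience))
    if not scopes:
        raise PermissionError("access_denied")
    return {
        "access_token": secrets.token_urlsafe(24),
        "token_type": "bearer",
        "expires_in": 3600,
        "scope": " ".join(sorted(scopes)),
    }
```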

The resource the client is interested in can now be called using the access token in the Authorization header.
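
For illustration, turning a token response into an authorized API request might look like this. The function name `authorized_request` is made up and the resource url in the usage note is an assumption, since the actual API paths are not defined in this thread. The request is built but not sent:

```python
import json
from urllib.request import Request


def authorized_request(api_url: str, token_response_json: str) -> Request:
    """Build (but do not send) a GET request carrying the bearer token."""
    token = json.loads(token_response_json)
    if token.get("token_type", "").lower() != "bearer":
        raise ValueError("only bearer tokens are handled in this sketch")
    return Request(
        api_url,
        headers={"Authorization": f"Bearer {token['access_token']}"},
    )
```

A call such as `authorized_request("https://api.somtoday.nl/students", response_body)` would then be passed to `urllib.request.urlopen` or an equivalent HTTP client.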

niesink commented 3 years ago

Some notes from a meeting on this topic Marcel, Danny, Edwin, Elias and myself had last Friday:

Regarding the setup between two parties:

When party A wants to use the API provided by party B, B will create a client for A. This client consists of a client_id, a client_secret and the set of scopes that can be used by this client. This set may contain scopes that span multiple APIs. If for instance B provides a Usage API and a Results API, a single client may be defined that's allowed to use both APIs.
The endpoints/urls for the different APIs will (for now) not be maintained in a central registry. Each party should maintain their own list of the urls of the other parties' APIs.
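
A client as described in these notes could be modelled as below. This is a sketch only: the class name `ClientRegistration`, the method `may_request` and the scope names `usage` and `results` are assumptions, since no scope names have been agreed here:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientRegistration:
    """What party B stores for a client it created for party A."""

    client_id: str
    client_secret: str
    # May span several of B's APIs, e.g. both a Usage and a Results API.
    allowed_scopes: frozenset[str]

    def may_request(self, requested_scope: str) -> bool:
        """Check a space-separated scope string against the allowed set."""
        requested = set(requested_scope.split())
        return bool(requested) and requested <= self.allowed_scopes
```

With `allowed_scopes=frozenset({"usage", "results"})` a request for `"usage results"` is accepted, while `"usage students"` is rejected.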

On authorization by the school:

The school-specific authorization flow as described above always happens within the context of one specific digideliveryid. This means that if a school uses multiple digideliveryids, the authorization flow will have to be started for each digideliveryid separately. If it turns out in practice that this is too much of a burden on the school's administrator, we can consider allowing an array of digideliveryids, but this introduces considerable complexity, especially in the various non-happy flows.
Which data exchanges involve authorization by the school is not fully clear at this point. The Catalogue API clearly is one that does not involve the school's authorization. The same seems to apply to the Entitlements sent to the Learning Application by the Marketplace. As for the other exchanges some discussion was had and probably needs to continue.

MarcelUntied commented 3 years ago

Edwin wrote on 2 February: The discussion is about how the set-up flow is triggered. For the PoC, with hard-coded (!?) endpoints and tokens / AccessID exchanged by mail, this is not an immediate problem. But for the next step it would be good to have a shared idea of how this should work. We think the best way is:

1) The service that will use the information asks for consent on a public endpoint. In this request the ID token, AccessID and endpoints (?) are shared (how is still to be discussed).
2) The service that will deliver the data has to give “consent” to that request, and also shares its ID token, AccessID and endpoints.
3) The service that will use the information finishes the configuration.
4) When the service that delivers the data revokes the consent, a notification is sent to the other service. That side will then revoke the tokens etc.

Example

1) An LMS that wants Usage data sends a request to the public endpoint. This request contains the ID token, AccessID, name and endpoints.
2) A school user who logs on to the LA sees a notification in the set-up module. The LA, which will deliver the data, has to give “consent” to that request, and also shares its ID token, AccessID and endpoints.
3) The service that will use the information finishes the configuration.
4) When the service that delivers the data revokes the consent, a notification is sent to the other service. That side will then revoke the tokens etc.

My question to you all, can you agree with this setup flow?

MarcelUntied commented 3 years ago

Luke wrote on 3 February: I see a school user is mentioned in the example. Based on this I assume this flow is proposed as an alternative to OAuth and the school authorization flow as described in https://github.com/stichtingsem/pilot-phase/issues/1#issuecomment-749039687 (and not as a way to establish the initial trust between two parties).

If I understand the proposal correctly one of the differences is it uses server-to-server communication instead of redirecting the user, like in the OAuth-based flow. Apart from the benefits of being an established standard (security, readily available libraries/components), which we agreed upon earlier to use as much as possible, I think the redirecting might make the flow easier for the end user.

I find the concept of notifying the client of consent revocation interesting, we might be able to implement this in the webhook-architecture? I.e. allowing a client to register a webhook-url on which they would like to be notified of any consent revocations.
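
To make the webhook idea a bit more concrete, here is a rough sketch of what the server side could keep track of. The class name `RevocationWebhooks`, the event name `consent_revoked` and the payload fields are all invented for illustration. Actually delivering the notification over HTTP is left out:

```python
class RevocationWebhooks:
    """Remembers which urls a client wants to be notified on."""

    def __init__(self) -> None:
        self._urls: dict[str, set[str]] = {}

    def register(self, client_id: str, webhook_url: str) -> None:
        self._urls.setdefault(client_id, set()).add(webhook_url)

    def on_consent_revoked(
        self, client_id: str, digi_delivery_id: str
    ) -> list[tuple[str, dict]]:
        """Return the (url, payload) pairs to POST when a school revokes consent."""
        payload = {"event": "consent_revoked", "digideliveryid": digi_delivery_id}
        return [(url, payload) for url in sorted(self._urls.get(client_id, ()))]
```

On receiving such a notification the client would discard any tokens it holds for that digideliveryid.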

Of course, please correct me if I’m misunderstanding this and I’m looking forward to hearing the views of others on this as well.

MarcelUntied commented 3 years ago

Proposal: https://github.com/stichtingsem/pilot-phase/blob/main/documents/Setups.xlsx

stichtingsem / pilot-phase

Spike - define setup scenarios #1
