stichtingsem / technology-prototype

Source code for any services or APIs created in the technology track in Summer 2020

Define overall architectural approach for exchanges #7

Open cliftonc opened 4 years ago

cliftonc commented 4 years ago

We need to agree high level principles and approaches for how data is exchanged.

It needs to be: Reliable, Scalable, Developer Friendly, Secure & Future Proof.

Initial list of exchange patterns:

| Type | Description |
| --- | --- |
| Notify / Pull | Notification via web-hook, followed by pull of data from source. |
| Publish / Subscribe | Publication of a feed of changes on the source; the destination pulls data as needed. Alternate/additive model to Notify / Pull via event feed. |
| Push | Direct call of web service on destination from source. |
| Pull | Direct call of web service on source from destination. |
| Browser | Exchange via URL data in browser. |

cliftonc commented 4 years ago

wvholland commented 4 years ago

Perhaps it's obvious to all of you and already taken into account, but did we also consider resilience as one of the constraints? The whole construction should not collapse if something is not working. We should design it on the assumption that things will fail at times, so that the system still keeps working when they do.

cliftonc commented 4 years ago

The following are notes from a conversation between @mcginkel, @EvanderVeen and @cliftonc over the previous two hours.

Principles / Areas

Patterns for integration

Approach to identifiers:

Proposal for Exchanges:

Examples from other places to explore

EdwinIddinkGroup commented 4 years ago

Security and data transmission must meet the standards as laid down in Edu-K / Edustandaard. This includes the use of / taking into account:

    • The Edukoppeling transaction standard, which is part of the reference architecture for education (ROSA) and within which the use of REST and SOAP is guaranteed.
    • The Information Security and Privacy ROSA Certification Scheme.

EdwinIddinkGroup commented 4 years ago

Patterns for integration: Push changes

Okay, I admit that the information about a license changes only about three times in a school year (not yet used, in use, and expired / cancelled / blocked), and an access link will not change very often either. But in an N-to-N relationship with external systems, I wonder whether push doesn't become more complicated.

Consider the current development around data minimization, in which the surname may only be shown in an application if this is necessary in relation to the user's position. For example, when entering numbers / absence: if two students in the same group have the same first name, the last name is shown only for those two students. In that situation I do not see how push combined with a notification message would work. By the way, an additional requirement for this type of data is that the information is not allowed to be saved (it is only available during the session, and then only in memory?).

cliftonc commented 4 years ago

Thanks @EdwinIddinkGroup - can you link to the appropriate versions of these so we can all review and understand if / how we would like to incorporate them?

We discussed existing standards, and agreed to look at them related to each exchange and see if they met our principles and were suitable to take forward, so it would be helpful if you can collect and link the ones you feel are relevant.

You are (purposely I assume?) using some strong language - must - can you also share details on why that is? Is there a legal requirement or other binding agreement we need to take into account that the group isn't aware of?

cliftonc commented 4 years ago

Re: the comment on push vs. pull.

There are three elements to the design: a real-time web hook, an API that can be polled for events, and the API for a specific data item.

A subscriber may ignore the first two and just call the API when needed (this is your example), but choosing to do this may have impacts on performance or reliability, and be at risk of being rate limited by the provider if this is not expected behaviour.

If we want it to be reliable, you would use all three components.
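To make the three components concrete, here is a minimal in-memory sketch of how they could fit together. Everything in it is an assumption for illustration only: the `Provider` and `Subscriber` classes, the sequence-numbered event feed, and the method names are hypothetical and are not an agreed design. The web hook is treated purely as a hint to catch up from the event feed, so a lost notification is repaired by the next scheduled poll.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical in-memory stand-in for the source system."""
    items: dict = field(default_factory=dict)
    events: list = field(default_factory=list)  # append-only event feed

    def change(self, item_id, data):
        """Record a change and return the payload a web hook would carry."""
        self.items[item_id] = data
        event = {"seq": len(self.events) + 1, "item_id": item_id}
        self.events.append(event)
        return event

    def events_since(self, seq):
        """Event API: all events with a sequence number greater than seq."""
        return [e for e in self.events if e["seq"] > seq]

    def get_item(self, item_id):
        """Item API: current state of a single data item."""
        return self.items[item_id]


@dataclass
class Subscriber:
    provider: Provider
    last_seq: int = 0
    store: dict = field(default_factory=dict)

    def on_webhook(self, notification):
        """Real-time path: the notification is only a hint to catch up now."""
        self.poll()

    def poll(self):
        """Catch-up path: apply every event not yet seen, in feed order."""
        for event in self.provider.events_since(self.last_seq):
            item_id = event["item_id"]
            self.store[item_id] = self.provider.get_item(item_id)
            self.last_seq = event["seq"]


provider = Provider()
subscriber = Subscriber(provider)

# Normal case: the web hook is delivered and the subscriber pulls the item.
subscriber.on_webhook(provider.change("license-1", {"status": "in use"}))

# Failure case: the change happens but the web hook never arrives.
provider.change("license-2", {"status": "expired"})

# A scheduled poll of the event feed repairs the gap.
subscriber.poll()
```

A subscriber that only calls `get_item` when it needs data would still work in this sketch, but it would have no way of knowing that `license-2` had changed.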

SOAP isn't designing for the future, we agreed to let it enjoy its retirement, along with files over SFTP.

Re: the comment on data.

We all agree wholeheartedly with data minimisation and it will be a fundamental part of the architecture, but perhaps a key element is putting it in the control of the school / customer. For example, using OAuth scopes we can ensure that a subscriber can only receive basic data about users (the minimum needed to operate the service), but not address details if they aren't sending physical books.
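As a rough sketch of the scope idea, the provider could filter each user record down to the fields unlocked by the scopes the school has granted to that subscriber. The scope names (`user.basic`, `user.address`) and the field names below are hypothetical, chosen only for illustration; they are not part of any agreed standard.

```python
# Hypothetical mapping from an OAuth scope to the user fields it unlocks.
SCOPE_FIELDS = {
    "user.basic": {"id", "first_name", "group"},
    "user.address": {"street", "postal_code", "city"},
}


def visible_fields(user, granted_scopes):
    """Return only the fields of `user` covered by the granted scopes."""
    allowed = set()
    for scope in granted_scopes:
        # Unknown scopes unlock nothing.
        allowed |= SCOPE_FIELDS.get(scope, set())
    return {key: value for key, value in user.items() if key in allowed}


user = {
    "id": "u1",
    "first_name": "Sam",
    "last_name": "Jansen",
    "group": "3A",
    "street": "Example Street 1",
    "postal_code": "1234 AB",
    "city": "Utrecht",
}

# A purely digital product gets basic data only.
digital_only = visible_fields(user, ["user.basic"])

# A supplier of physical books is additionally granted address data.
book_seller = visible_fields(user, ["user.basic", "user.address"])
```

In this sketch the school decides which scopes a subscriber receives, and a field that no granted scope covers (here `last_name`) is never sent.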

This is a dramatic improvement over UWLR where a lot of data is pushed to everyone.

What does concern me is making arbitrary decisions like 'product A can never get that data because committee B decided that it wasn't needed for it to function'.

If data is required for a feature that a school is paying for, and the school approves that data exchange (along with any necessary processing agreements), there should be no issues with this. All members in the ecosystem are bound by the same legal requirements related to data protection and privacy; we shouldn't create additional layers over the top.

If it isn't done this way we will stifle innovation in the sector, and for no actual benefit to the school or individual that I can see.

Does this need a specific issue and sub-group to discuss and clarify? Perhaps you can lead it and kick it off with a review of current agreements that will be binding for us all?