Matriculation Management Pseudonymization

upb-uc4 / lagom-core

Repository for the micro service backend using lagom.

Apache License 2.0

Matriculation Management Pseudonymization #279

Closed bastihav closed 3 years ago

bastihav commented 4 years ago

We want private but verifiable data stored on the chain. To get there, remove personal info and instead store a hash value (enrollment.id) as user info.

NikoBergemann commented 4 years ago

https://discordapp.com/channels/696667234252357632/721008317693952003/757629029225463998 @david-buderus is this the expected solution? I'm still not quite sure about the requirements towards our hlf-api. Do you want to break the "regular" behavior pattern of returning "" as an OK? Wouldn't calling the getMatriculationData method before returning do the same for you? Wrapping it in the hlf-api is possible, but I wonder if it is necessary. @bastihav is getting the newly put data an advantage for you, or could we perhaps change the API spec as well? Following Uncle Bob's advice, a single method call should only ever do one thing, to decrease complexity and ensure maintainability. As such, I believe it would be nice to strictly separate the two calls of "putting info" and "getting info".

bastihav commented 4 years ago

It has the advantage of saving us a whole RTT plus transmission time, which could be roughly 300 ms (estimate) depending on the proxy we use. This is not a huge problem, but it would be nice to have, and it is in line with REST, which leaves it up to the implementation whether PUT and POST return the modified resource in the body. If lagom calls getMatriculationData before returning, we should see the same performance benefit, given that they operate on the same cluster.

NikoBergemann commented 4 years ago

Since we are still performing two transactions on the ledger (which I think will take up most of the time), we might not reap all of those benefits, but I agree that it's worth a shot. If performance is our goal, returning the matriculationData object during the first transaction should be even faster. @matthias-geuchen What's your take on this? Could we refine the "hyperledger_matriculation_api.md" and have the matriculationData object returned after submitting it on the ledger?
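To make the two options concrete, here is a minimal sketch in Python. `FakeLedger`, `put_then_get`, and `put_returning` are hypothetical names for illustration only, not the hlf-api; the sketch merely counts ledger calls to show why returning the object from the put itself saves one call.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLedger:
    """In-memory stand-in for the ledger (hypothetical); counts every call."""
    state: dict = field(default_factory=dict)
    calls: int = 0

    def put(self, key, data):
        # Current pattern from the thread: an empty string signals OK.
        self.calls += 1
        self.state[key] = dict(data)
        return ""

    def get(self, key):
        self.calls += 1
        return dict(self.state[key])

    def put_returning(self, key, data):
        # Proposed variant: the put call itself returns the stored object.
        self.calls += 1
        self.state[key] = dict(data)
        return dict(self.state[key])


def put_then_get(ledger, key, data):
    """Keep put and get separate and chain them in the caller."""
    if ledger.put(key, data) != "":
        raise RuntimeError("put failed")
    return ledger.get(key)


data = {"matriculationStatus": []}
a, b = FakeLedger(), FakeLedger()
same = put_then_get(a, "some-enrollment-id", data) == b.put_returning("some-enrollment-id", data)
print(same, a.calls, b.calls)  # True 2 1
```

Both paths hand the caller the same object; the difference is only the number of ledger calls, which is where most of the latency is expected to come from.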

matthias-geuchen commented 4 years ago

Returning the matriculation data directly is not an option with the data structure we currently store on chain, as transaction responses are stored on the chain, i.e. we would have personal data we cannot delete. If we redesign the data structure stored on chain to not include the matriculationId, full name, and birth date in plaintext (i.e. store only a hash as the key and map between that hash and the matriculationId, full name, and birth date in lagom), we could get rid of the private data collection and return the object directly from a successful put transaction.
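A rough sketch of that split, in Python for brevity. Everything here is an assumption for illustration: `OffChainStore` stands in for a lagom-side mapping table, `FakeChain` for the append-only ledger, and the field names and record layout are made up. The pseudonym is passed in as an opaque string, however it ends up being derived.

```python
import json
from dataclasses import dataclass, field


@dataclass
class OffChainStore:
    """Hypothetical lagom-side table: pseudonym -> personal data. Entries can be deleted."""
    personal: dict = field(default_factory=dict)

    def register(self, pseudonym, matriculation_id, full_name, birth_date):
        self.personal[pseudonym] = {
            "matriculationId": matriculation_id,
            "fullName": full_name,
            "birthDate": birth_date,
        }

    def forget(self, pseudonym):
        # Deleting here breaks the link between the pseudonym and the person.
        self.personal.pop(pseudonym, None)


@dataclass
class FakeChain:
    """Append-only stand-in for the ledger (hypothetical)."""
    responses: list = field(default_factory=list)

    def put_status(self, pseudonym, matriculation_status):
        record = {"enrollmentId": pseudonym, "matriculationStatus": matriculation_status}
        # The response is persisted for good, so it must be free of personal data.
        self.responses.append(json.dumps(record))
        # Safe to return directly: the record holds only pseudonymized data.
        return record


store, chain = OffChainStore(), FakeChain()
store.register("pseudonym-1", "1234567", "Jane Doe", "1990-01-01")
stored = chain.put_status("pseudonym-1", [{"fieldOfStudy": "Computer Science"}])
store.forget("pseudonym-1")  # personal data is gone; the chain never held it
```

With this layout, the put can return `stored` without writing anything deletable onto the chain, which is the property the redesign is after.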

NikoBergemann commented 4 years ago

For this redesign, we should store the "matriculationStatus" with the enrollment.id. This is fully pseudonymized and verifiable for anyone who knows the hash function (which should be public) and the student data (students should be able to calculate their own enrollment.id to verify their data).
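The verification step could look roughly like the following Python sketch. The thread does not name the hash function or its exact inputs, so SHA-256 over the matriculationId, full name, and birth date joined by newlines is purely an assumption, as are the function names `enrollment_id` and `verify`.

```python
import hashlib


def enrollment_id(matriculation_id: str, full_name: str, birth_date: str) -> str:
    """Derive a pseudonymous ID from the student data.

    Assumption: SHA-256 over the three trimmed fields joined by a newline.
    The real hash function and input format are not specified in the thread.
    """
    canonical = "\n".join(
        part.strip() for part in (matriculation_id, full_name, birth_date)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(on_chain_id: str, matriculation_id: str, full_name: str, birth_date: str) -> bool:
    """A student recomputes the ID from their own data and compares it with the chain."""
    return on_chain_id == enrollment_id(matriculation_id, full_name, birth_date)


# Usage: the value written on chain matches only the data it was derived from.
on_chain = enrollment_id("1234567", "Jane Doe", "1990-01-01")
print(verify(on_chain, "1234567", "Jane Doe", "1990-01-01"))  # True
print(verify(on_chain, "1234567", "Jane Doe", "1990-01-02"))  # False
```

Because the function is deterministic and public, anyone holding the student data can recompute the ID, while the chain itself stores only the digest.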

NikoBergemann commented 4 years ago

Proposal: make this issue an epic due to its complexity. We need to change the MatriculationData on chain (API, hlf-chaincode, Lagom MatriculationService).