ngandrass / moodle-quiz_archiver

Archives quiz attempts as PDF and HTML files for long-term storage independent of Moodle
GNU General Public License v3.0

GDPR compliance - Missing privacy API class #3

Closed danmarsden closed 7 months ago

danmarsden commented 8 months ago

Moodle uses a privacy API for GDPR compliance that lets plugins declare how they handle user data. Your plugin stores user data in a number of tables, which will need to be covered by the privacy API classes.

Sites that use continuous integration processes or those with GDPR requirements will not be able to use your plugin because Moodle runs unit tests which check to see if all extra plugins include the privacy class.

More information on the privacy class is here: https://moodledev.io/docs/apis/subsystems/privacy

(note - you can't use the null provider class as your plugin includes tables with user-data.)

Not a blocker for plugins db approval.

ngandrass commented 8 months ago

I just took a quick look at the privacy API documentation and the plugin code. The "only" personal data that the plugin stores is located within the generated quiz archive. All other data inside the DB tables does not contain personal information.

Quiz archives include the following personal data:

--

Defining and exporting this data on a per-user basis should be doable. Deletion, however, becomes a little tricky. It would require retroactively changing the quiz archive, which would not only alter its checksum but also invalidate the signature that attests archive integrity and creation date.
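To illustrate the checksum problem, here is a minimal Python sketch. It assumes a SHA-256 digest over an uncompressed tar file; the plugin's actual archive format and checksum algorithm may differ, and the file names and contents are hypothetical.

```python
import hashlib
import io
import tarfile


def build_archive(files: dict[str, bytes]) -> bytes:
    """Pack a name -> content mapping into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp so repeated builds are identical
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical attempt files; real archives contain PDF/HTML reports.
attempts = {
    "attempt_1.pdf": b"hypothetical attempt of student 1",
    "attempt_2.pdf": b"hypothetical attempt of student 2",
}
original = build_archive(attempts)

# Simulate a deletion request: rebuild the archive without student 2.
remaining = {k: v for k, v in attempts.items() if k != "attempt_2.pdf"}
modified = build_archive(remaining)

# Any checksum or signature computed over the original bytes no longer matches.
checksum_changed = sha256_hex(original) != sha256_hex(modified)
```

A signature or time stamp issued over the original digest would therefore have to be reissued after every deletion.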

A possible workaround would be to use a two-level approach:

  1. Creating an independent compressed archive (e.g. .tar.gz) for every quiz attempt and signing it individually
  2. Creating a big uncompressed archive (e.g. .tar) that contains all previously created attempt archive files to allow easy download as a bundle
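The two-level idea above could look roughly like the following Python sketch. It is not the plugin's implementation: a plain SHA-256 digest stands in for the per-attempt signature (no TSP request is made), and all function names, file names, and contents are hypothetical.

```python
import gzip
import hashlib
import io
import tarfile


def _add_bytes(tar: tarfile.TarFile, name: str, data: bytes) -> None:
    """Add an in-memory file to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


def pack_attempt(attempt_id: int, files: dict[str, bytes]) -> bytes:
    """Level 1: one compressed archive (.tar.gz) per quiz attempt."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for name in sorted(files):
            _add_bytes(tar, f"attempt_{attempt_id}/{name}", files[name])
    # mtime=0 keeps the gzip header free of the current time.
    return gzip.compress(raw.getvalue(), mtime=0)


def bundle(inner_archives: dict[str, bytes]) -> bytes:
    """Level 2: one uncompressed .tar that holds all attempt archives."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(inner_archives):
            _add_bytes(tar, name, inner_archives[name])
    return buf.getvalue()


def inner_digests(bundle_bytes: bytes) -> dict[str, str]:
    """Digest of each inner archive (stand-in for a per-attempt signature)."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(bundle_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            data = tar.extractfile(member).read()
            digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


inner = {
    f"attempt_{i}.tar.gz": pack_attempt(
        i, {"attempt.pdf": f"hypothetical report {i}".encode()}
    )
    for i in (1, 2, 3)
}
before = inner_digests(bundle(inner))

# Deleting one attempt only requires rebuilding the outer bundle.
del inner["attempt_2.tar.gz"]
after = inner_digests(bundle(inner))
```

In this layout the digests of the remaining attempt archives are unchanged after a deletion; only the outer bundle is rebuilt.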

This, however, could lead to problems when archiving large quizzes with lots of attempts. In particular, TSP-signing each individual attempt could run into rate limits of the external time stamp authority. Moreover, when the plugin is used to generate PDFs that are handed out to students, the person would need to extract not just one but a potentially huge number of archives to access the generated PDFs.

And last but not least: If used for exam data, I'm unsure whether a student is entitled to request deletion of data that a university/school is legally required to archive for a specific time period.

--

If I'm not mistaken, plugins must define a way to delete user data if they want to be GDPR compliant. Having no way to differentiate between "deletable" and "non-deletable" data, IMHO, contradicts the goal of this plugin.

As far as I can see, implementing only the data description and export methods would satisfy neither the CI checks nor GDPR-sensitive institutions. Can you think of any proper way to resolve this?

danmarsden commented 8 months ago

I don't think you need to deal with the deletion of the archive files - but I think you need to deal with the data in the quiz_archiver_jobs table - as this has a "userid" field it's classed as user data (Moodle also has unit tests that check all tables that have userid fields have a privacy class that covers that table.)

Hopefully that simplifies the work that you need to do?

ngandrass commented 8 months ago

> I don't think you need to deal with the deletion of the archive file

Not needing to delete some data from the archives simplifies this A LOT :tada:

--

> [...] but I think you need to deal with the data in the quiz_archiver_jobs table - as this has a "userid" field it's classed as user data

You're absolutely right. My initial thoughts focused on the data of quiz attendees but I forgot about the teacher/manager that initiates the archive job. This obviously links the job to the person, which makes it count as personal data.

--

I'll dive a little deeper into the privacy API and prepare a proper integration for the next release :)

ngandrass commented 7 months ago

I implemented the privacy API as discussed with release v0.6.3 :tada:

All data that the plugin stores is now declared in its privacy API provider class. Users who created an archive (mostly teachers) get their respective metadata from the database returned. Users whose data is part of an archive (mostly students) receive an individual subset of the stored quiz archive that contains only their personal data. Such individual archives are created dynamically and deleted automatically afterwards.
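For readers curious how such a per-user export can work in principle, here is a rough Python sketch. It is not the plugin's code (the plugin is written in PHP), and it assumes a hypothetical archive layout in which each user's files live under their own directory prefix such as `user_7/`.

```python
import contextlib
import io
import os
import tarfile
import tempfile


@contextlib.contextmanager
def user_subset(archive_path: str, user_prefix: str):
    """Yield the path of a temporary archive holding only one user's files.

    The temporary archive is removed when the context exits.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        subset_path = os.path.join(tmpdir, "subset.tar.gz")
        with tarfile.open(archive_path, "r:*") as src, \
                tarfile.open(subset_path, "w:gz") as dst:
            for member in src.getmembers():
                # Copy only regular files that belong to the requested user.
                if member.isfile() and member.name.startswith(user_prefix):
                    dst.addfile(member, src.extractfile(member))
        yield subset_path
    # Leaving the TemporaryDirectory block deletes subset.tar.gz.


def demo():
    """Build a small hypothetical archive and export one user's subset."""
    with tempfile.TemporaryDirectory() as workdir:
        archive_path = os.path.join(workdir, "quiz_archive.tar.gz")
        with tarfile.open(archive_path, "w:gz") as tar:
            for name in ("user_7/attempt.pdf", "user_8/attempt.pdf"):
                data = name.encode()  # placeholder content
                info = tarfile.TarInfo(name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))

        with user_subset(archive_path, "user_7/") as subset_path:
            with tarfile.open(subset_path, "r:gz") as tar:
                names = tar.getnames()
        return names, os.path.exists(subset_path)


names, still_exists = demo()
```

The original archive is only read, never modified, so its checksum and signature stay valid while each export gets its own short-lived file.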

I guess that should suffice for now. Thanks again for your heads up and suggestions :+1: