mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org

Feature(s) for showing all information related to a user #6267

Open RichardTaylor opened 3 years ago

RichardTaylor commented 3 years ago

Aims:

Feature(s) for showing all information related to a user could take the form of:

Information users don't already have access to includes:

Often there is a little more detail stored on the system and available to administrators than is published to users or the public, e.g. where a full timestamp is held but the time is presented in a less detailed, more readable form on the site. One example: a user's public profile shows only the year in which they joined the site, but the full timestamp of registration is actually held.
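To illustrate the gap between what is held and what is shown, here is a minimal Python sketch. It is not Alaveteli code: the timestamp value and both function names are invented for illustration, and the only point is that the stored value carries more detail than the public rendering.

```python
from datetime import datetime, timezone

# Hypothetical stored value: the full registration timestamp held in the database.
registered_at = datetime(2019, 3, 14, 9, 26, 53, tzinfo=timezone.utc)


def public_profile_view(ts: datetime) -> str:
    """What a public profile might show: only the year of joining."""
    return f"Joined in {ts.year}"


def full_record_view(ts: datetime) -> str:
    """What is actually held: the full timestamp, in ISO 8601 form."""
    return ts.isoformat()


print(public_profile_view(registered_at))  # Joined in 2019
print(full_record_view(registered_at))     # 2019-03-14T09:26:53+00:00
```

A feature showing all information related to a user would need to expose the second form, not just the first.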

Related:

  1. Note the existence of the request details pages eg. https://www.whatdotheyknow.com/details/request/examination_of_electronic_device

  2. https://github.com/mysociety/alaveteli/issues/5412 (Option for censor rules to only apply to public presentation of request #5412 )

RichardTaylor commented 3 years ago

Related: "Requests with prominance "requester_only" should appear on requestor's "My requests" page" https://github.com/mysociety/alaveteli/issues/843

RichardTaylor commented 3 years ago

The idea of a .zip download for all a user's request data has been suggested:

click a button and Alaveteli generates a ZIP file with an intelligible folder structure containing the relevant information for each request.
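As a rough sketch of what "an intelligible folder structure" could look like, the Python below builds a ZIP with one folder per request. Alaveteli itself is a Rails application, so this is only an illustration of the layout: the `build_user_archive` function, the shape of the request dictionaries and the folder names are all assumptions, not existing functionality.

```python
import io
import zipfile


def build_user_archive(requests: list[dict]) -> bytes:
    """Return the bytes of a ZIP with one folder per request.

    Each request is a hypothetical dict of the form:
    {"url_title": str,
     "messages": [{"filename": str, "body": str}],
     "attachments": [{"filename": str, "data": bytes}]}
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for request in requests:
            folder = f"requests/{request['url_title']}"
            # Correspondence and attachments go in separate subfolders.
            for message in request.get("messages", []):
                archive.writestr(
                    f"{folder}/correspondence/{message['filename']}",
                    message["body"],
                )
            for attachment in request.get("attachments", []):
                archive.writestr(
                    f"{folder}/attachments/{attachment['filename']}",
                    attachment["data"],
                )
    return buffer.getvalue()


# Invented example data, for illustration only.
example_requests = [
    {
        "url_title": "example_request",
        "messages": [{"filename": "001_outgoing.txt", "body": "Dear Authority, ..."}],
        "attachments": [{"filename": "response.pdf", "data": b"%PDF-..."}],
    }
]

archive_bytes = build_user_archive(example_requests)
print(zipfile.ZipFile(io.BytesIO(archive_bytes)).namelist())
```

Numbering the correspondence files keeps them in conversation order when the archive is unpacked.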

mdeuk commented 3 years ago

The idea of a .zip download for all a user's request data has been suggested:

click a button and Alaveteli generates a ZIP file with an intelligible folder structure containing the relevant information for each request.

Just to add the second part of that quote: "I'd suggest that we supply the correspondence from requests in a formatted PDF, since although the text files we already use are likely to be understood by many users, a formatted reply is likely to be easier to understand - and it's also useful in cases where we have to print a hardcopy. "

At the moment, when we deal with GDPR / Legal matters on WDTK we sometimes have to carry out some manual processing of requests in order to bring them into a structure that can be supplied - this is particularly an issue where censor rules have been used, where content has manually been removed, or where content is hidden in a request (e.g. a prominence reason).

A format that seems to work is giving a copy of the requests in a document (which is generated using a database with multiple tables + reporting) and a ZIP containing the attachments. Ideally, we could automate production of this - either at request or user level, perhaps repurposing existing functionality in Alaveteli, meaning there would be less 'manual' effort required.

An alternative - provide the ability for administrators to download user metadata in a machine readable format that can be processed using a reporting frontend - the benefit could be less RSI for the administrator that processes a rights request.
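One possible shape for a machine-readable export is plain JSON, sketched below in Python. The field names in `example_user` and the `export_user_metadata` function are hypothetical and do not describe Alaveteli's actual schema; the sketch only shows that full timestamps can be serialised without losing detail.

```python
import json
from datetime import datetime, timezone


def export_user_metadata(user: dict) -> str:
    """Serialise a user record to JSON, keeping full timestamps."""

    def encode(value):
        # json cannot serialise datetimes natively, so emit ISO 8601 strings.
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError(f"Cannot serialise {type(value).__name__}")

    return json.dumps(user, default=encode, indent=2, sort_keys=True)


# Invented example record, for illustration only.
example_user = {
    "name": "Example User",
    "created_at": datetime(2019, 3, 14, 9, 26, 53, tzinfo=timezone.utc),
    "request_count": 3,
}

print(export_user_metadata(example_user))
```

A reporting frontend could then load this file directly rather than an administrator copying values out of the admin interface by hand.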

Rough statistics show that we've answered 21 formal Right of Access (SAR) requests on WDTK - however we do answer many more routine questions without keeping a formal record of them.

It takes anywhere from several minutes to several days to put datasets together, filter through them, carry out any redactions, and get it into a format that can be utilised - we're never going to automate the whole process, but implementing some items from this ticket would definitely be helpful, both to WDTK admins and to other Alaveteli installs (particularly in Europe).

mdeuk commented 2 years ago

Just to note, Richard's comment linking this and #6684 together is very helpful.

Having completed another complex RoAR / SAR recently, I'd note that the process was roughly as follows:

  1. Log request on GDPR/IRM Tracker and create folder on GDrive document management system (automated)
  2. Begin documenting data held (WDTKIRMRoARProcessDoc)
  3. Extract a list of all requests from admin interface
  4. Download a copy of all correspondence on the requests:
    • Copy each outgoing message, collecting the subject from the logs
    • Download each incoming message, noting any attachments (extract from the .eml file)
    • Copy any annotations
    • Note any censor rules applied at request or message level (including prominence reasons)
  5. Populate report with details of each request - including all correspondence, relevant censor rules, prominence reasons and so on (WDTKIRMReqExt).
  6. Download an extract of the user's profile on Alaveteli
  7. Download any relevant email correspondence from the team mailbox, lists, and DMS (GDrive)
  8. Populate template WDTKIRMRoARRsp (RoAR / SAR response letter) with our draft response and add any relevant files to DMS.
  9. 🛑 Manually review files and add any relevant redactions which might be required.
  10. Extract a copy of the case record from GDPR/IRM Tracker
  11. 🎉Produce final copy of WDTKIRMRoARRsp letter (in PDF format), share files in relevant folder on DMS, and share with user.
  12. 👍Delete shared files after 30 days (or as required)

As you can see, it can be a little bit laborious! We don't always need to follow all of this, it really does depend on the case - but some cases do require a complete extract, which takes time to put together, particularly when there are legal timeframes attached.

Things we could, perhaps, do better

Written from a WDTK perspective, this is (perhaps) a list of things we could do to simplify this process. It's definitely WDTK specific, but there are bits which might be useful to other Alaveteli instances (https://github.com/mysociety/whatdotheyknow-private/issues/239)

Why change?

RichardTaylor commented 2 years ago

The ability to make annotations requester_only rather than just hidden might help here, as requesters would then be able to access their removed annotations. (https://github.com/mysociety/alaveteli/issues/5423)

RichardTaylor commented 2 years ago

While "Improve print styles on admin interface for information releases" (#6684) will help, we're still providing data-dumps which will be hard for users to interpret, eg. if we include lines like

Idhash 4m23m5n

to many readers that will be nonsense.

The example user admin page PDF at https://github.com/mysociety/alaveteli/pull/6795#issue-1141167893 says:

Post redirects Id Token Uri Post params yaml Created at Updated at Email token Reason params yaml Circumstance

Those are headers for an empty table of "redirects" but that's going to be nonsense to recipients.
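One way the presentation could be made easier to read is sketched below in Python: leave out the headers of tables that contain no rows, and swap internal field names for plain-language labels. This is only an illustration. The `render_release` function, the `FRIENDLY_LABELS` mapping and the example data are all invented, and nothing held about the user is dropped, because only tables with no rows are skipped.

```python
# Hypothetical mapping from internal field names to reader-friendly labels.
FRIENDLY_LABELS = {
    "created_at": "Date created",
    "updated_at": "Date last updated",
}


def render_release(tables: dict[str, list[dict]], labels: dict[str, str]) -> str:
    """Render tables as plain text, skipping tables that contain no rows."""
    lines = []
    for table_name, rows in tables.items():
        if not rows:
            continue  # an empty table has headers only, which tell the reader nothing
        lines.append(table_name)
        for row in rows:
            for field, value in row.items():
                # Fall back to a humanised field name when no label is defined.
                label = labels.get(field, field.replace("_", " ").capitalize())
                lines.append(f"  {label}: {value}")
    return "\n".join(lines)


# Invented example data, for illustration only.
example_tables = {
    "Post redirects": [],
    "Profile": [{"created_at": "2019-03-14", "email_token": "abc"}],
}

print(render_release(example_tables, FRIENDLY_LABELS))
```

Fields without a friendly label still appear, so the release stays complete even where the wording has not yet been improved.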

If the aim of this feature is to help admins prepare Subject Access Responses then some nonsense is fine, we can remove it.

If the aim is to produce documents which users can obtain and understand we've got more work to do.

I'm contemplating a ticket for a guide to interpreting a user-data release, however ideally the presentation would be such that a guide wasn't required.

RichardTaylor commented 2 years ago

I've ticketed: "Produce a guide to released user-data" https://github.com/mysociety/alaveteli/issues/6876

mdeuk commented 2 years ago

If the aim of this feature is to help admins prepare Subject Access Responses then some nonsense is fine, we can remove it.

We shouldn't be removing data - unless there is a good reason to do so (e.g. where we've redacted PII, or sensitive information about our information systems).

I'm contemplating a ticket for a guide to interpreting a user-data release, however ideally the presentation would be such that a guide wasn't required.

Most organisations will attach a document to RoAR / SAR replies which includes standard boilerplate, e.g. what certain acronyms mean and such like. I don't see a reason why this couldn't be produced - but really, that's for the relevant Alaveteli site admin to do, as it needs to be specific to any local circumstances / legislation / practices.

I'd support #6876 - although, I suggest it is moved to the whatdotheyknow-theme repo - as it is specific to the UK and could be addressed by improving our local practices.

RichardTaylor commented 2 years ago

Top level guidance from the UK Information Commissioner on responding to requests from individuals for their personal data says:

"You should provide the information in an accessible, concise and intelligible format."

https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/individual-rights/right-of-access/

Where we can, pointing people to material they can access via their profile when logged in appears to be a really good way of responding to SAR requests (if we can make the information they want accessible via that route), as it presents the material clearly and in context.

In many cases that would hopefully prevent a request, or make a request easy to deal with.

We are allowed to clarify requests, and we don't have to do disproportionate searches for information. https://ico.org.uk/for-organisations/sme-web-hub/frequently-asked-questions/right-of-accesssubject-access-requests-and-other-rights/#wehave

You could argue additional information is held that's not available to a user eg. the full timestamp associated with an action, but it might be disproportionate to obtain and release that for every message, especially if that's not being specifically requested. (If it did get requested a lot we could make it available to users).

garethrees commented 2 years ago

I'd support #6876 - although, I suggest it is moved to the whatdotheyknow-theme repo

I think actually we should build some of this in to the Alaveteli admin interface so that it's included in the print view. That will also have the not insignificant benefit of helping new admins (volunteers in the UK and international site owners) understand how it all works.

RichardTaylor commented 1 year ago

Noting that

was a step forward here which helps with processing subject access requests, and

is also related.

RichardTaylor commented 1 year ago

While this is titled "Feature(s) for showing all information related to a user" another angle/perspective on it is that it could assist with "data portability", making it easier for users to get their personal data, and perhaps data connected to them, such as correspondence and meta-data connected to requests they've made, out of the system so they can use it elsewhere. (The structured data-feeds which are already offered could already be used to help users extract data.)