mysociety / whatdotheyknow-theme

The Alaveteli theme for WhatDoTheyKnow (UK)
http://www.whatdotheyknow.com/
MIT License

Data collection improvements to enable better transparency reporting #938

Closed RichardTaylor closed 1 year ago

RichardTaylor commented 2 years ago

Work on: "Manual Transparency Report for 2021 Annual Report" mysociety/whatdotheyknow-theme#910 threw up some areas where better data collection would be desirable:

mdeuk commented 2 years ago

Work on: "Manual Transparency Report for 2021 Annual Report" mysociety/whatdotheyknow-theme#910 threw up some areas where better data collection would be desirable:

* [ ]  Improved record keeping for how cases are closed on the GDPR spreadsheet. This is mainly a policy/practice point on the use of the existing "Decision" and "Erased?" fields.

  * Maybe change "Decision" to "Final decision"?
  * Consider an additional option for the "Erased?" field for cases where some material has been removed.
  * Maybe we don't need both a "Decision" and "Erased" field, but rather just the "Final Decision" field, and the "Done" field records if the decision has been actioned?

For this to work correctly, you'd need a categorisation (which we already do, in the form of selecting the case type), and then a closure reason (e.g. Resolved - comply), followed by a sub-category which confirms what we've done. If we wished to be precise, two sub-categories may be best.

That could look like:

Category: Erasure (Art 17)
Closure reason: Resolved
Closure reason 1: Comply in full
Closure reason 2: All data removed

That's a very early suggestion, and isn't a final answer! I'd like to give some thought to how we balance the need for improvement, against the need to reduce complexity - automating things would likely help.
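To make the category / closure reason / sub-category idea a bit more concrete, here is a minimal Python sketch of how such a record could be validated and counted. It is only an illustration of the structure suggested above, not how the GDPR spreadsheet works today. The names `CaseClosure`, `ALLOWED_OUTCOMES` and `summarise` are hypothetical, and the "Comply in part" / "Some data removed" values are assumed placeholders. Only the "Erasure (Art 17)" example comes from this comment.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical lookup: closure reason -> sub-category 1 -> allowed sub-category 2.
# Only the "Comply in full" / "All data removed" path is taken from the comment;
# the "Comply in part" entry is an assumed placeholder.
ALLOWED_OUTCOMES = {
    "Resolved": {
        "Comply in full": {"All data removed"},
        "Comply in part": {"Some data removed"},
    },
}


@dataclass(frozen=True)
class CaseClosure:
    category: str        # e.g. "Erasure (Art 17)"
    closure_reason: str  # e.g. "Resolved"
    sub_reason_1: str    # e.g. "Comply in full"
    sub_reason_2: str    # e.g. "All data removed"

    def __post_init__(self):
        # Reject combinations that are not in the lookup above.
        level_1 = ALLOWED_OUTCOMES.get(self.closure_reason)
        if level_1 is None:
            raise ValueError(f"Unknown closure reason: {self.closure_reason!r}")
        level_2 = level_1.get(self.sub_reason_1)
        if level_2 is None:
            raise ValueError(f"Unknown sub-category 1: {self.sub_reason_1!r}")
        if self.sub_reason_2 not in level_2:
            raise ValueError(f"Unknown sub-category 2: {self.sub_reason_2!r}")


def summarise(cases):
    """Count closed cases per (category, sub-category 1, sub-category 2)."""
    return Counter((c.category, c.sub_reason_1, c.sub_reason_2) for c in cases)
```

Restricting the sub-categories to a fixed lookup is what would let the annual figures be produced by a simple count rather than by re-reading each case.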

The existing setup is heavily bodged from what was there in the beginning, so it doesn't really do what we always need it to do. We do have an issue of metadata overall, and a lack of consistency in terms of audit logs, which is a key thing to have when handling these cases.

Linking to mysociety/whatdotheyknow-private#239 and mysociety/whatdotheyknow-private#238

  • More specific labelling of support correspondence involving requests for user data, and specifically and separately identifying those from the police / law enforcement.

Agreed. The tracker should be set up to track these types of cases (which we've been categorising with code 'LG') - along with service complaints, as they fall under broadly the same handling mechanism. @sallytay do you have any thoughts on this?

Being able to log these consistently will help considerably with our records management and compliance mechanisms, as we'll have everything on a system that we can run reports against so that everything is kept on track.

RichardTaylor commented 2 years ago

The existing setup is heavily bodged from what was there in the beginning

We could go for a fresh start, a new sheet, possibly with a wider scope to cover all takedown requests, requests for user data and complaints?

mdeuk commented 2 years ago

The existing setup is heavily bodged from what was there in the beginning

We could go for a fresh start, a new sheet, possibly with a wider scope to cover all takedown requests, requests for user data and complaints?

Possibly, but we need to think carefully about that.

I have an idea of sorts, I do need to flesh it out a bit though…

RichardTaylor commented 2 years ago

Suggestion from report for data moving forward from @mdeuk

Could we perhaps collect some metadata within Alaveteli when generating a ban - e.g. similar to how we set a prominence reason on a request (a dropdown of pre-defined options, then a freeform text box).

This might allow us to automate production of this statistic with a degree of certainty.
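As a rough illustration of the "dropdown plus freeform text box" idea, the sketch below models a ban record with a pre-defined reason and an optional note, then counts bans per reason. This is not Alaveteli code and does not reflect its data model. `BanReason`, `BanRecord`, `bans_by_reason` and the reason values are all hypothetical names chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class BanReason(Enum):
    # Assumed example options; the real list would need agreeing by the team.
    SPAM = "Spam"
    ABUSE = "Abusive behaviour"
    OTHER = "Other"


@dataclass
class BanRecord:
    user_id: int
    reason: BanReason  # the pre-defined "dropdown" choice
    note: str = ""     # the freeform text box

    def __post_init__(self):
        # "Other" on its own tells us nothing, so insist on a note in that case.
        if self.reason is BanReason.OTHER and not self.note.strip():
            raise ValueError("A note is required when the reason is 'Other'")


def bans_by_reason(records):
    """Count bans per pre-defined reason, ready for a report table."""
    return Counter(record.reason.value for record in records)
```

Keeping the pre-defined reason separate from the free text is what makes the count reliable: the note can say anything, but the statistic only depends on the fixed options.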

Originally posted by @sallytay in https://github.com/mysociety/whatdotheyknow-theme/issues/925#issuecomment-984443864

RichardTaylor commented 2 years ago

On the subject of better data on why censor rules were put in place:

https://github.com/mysociety/alaveteli/issues/6487 https://github.com/mysociety/alaveteli/issues/4626

sallytay commented 2 years ago

My thoughts on this:

More specific labelling of support correspondence involving requests for user data, and specifically and separately identifying those from the police / law enforcement.

Agreed. The tracker should be set up to track these types of cases (which we've been categorising with code 'LG') - along with service complaints, as they fall under broadly the same handling mechanism.

Being able to log these consistently will help considerably with our records management and compliance mechanisms, as we'll have everything on a system that we can run reports against so that everything is kept on track.

Yes, I agree it would be good to track these. I like the Police Request label in the inbox - it might also be good to have a specific Police Request for User Data label, to ensure we are not capturing other police requests at the same time. We could then also add a Request for User Data label for other requests not made by the police?

As the number of cases is pretty low I'm happy to set up a basic tracker, and don't mind taking on the responsibility to log them so we can keep track of the outcomes as well. I'm not sure it will be as sophisticated as the GDPR tracker, but it would definitely give us the data that was needed for the Transparency Report.

Sally

sallytay commented 2 years ago

ICO Correspondence Data:

Clearer labelling within the inbox is needed here too, as we can't use the thread count for accurate numbers: the ICO casework system doesn't seem to use threading.

My plan is to set up a basic tracking spreadsheet that would record cases we report to the ICO, along with any instances where we've been reported to the ICO. Again, I'm happy to pick up the admin burden of this, as ultimately it will save me time when doing next year's Transparency Report and may prove useful throughout the year.

Spreadsheet content would be along the lines of:

* Date Sent to ICO
* Date Response Received
* ICO case reference number
* Who we have reported
* Outcome (to include link to decision notice if there is one)
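If the sheet were exported as CSV with those column headings, the yearly figures could be pulled out with something like the Python sketch below. This is only a sketch under that assumption. The function name `outcome_counts`, the "Awaiting outcome" label for blank cells and the sample rows are all invented placeholders, not real cases.

```python
import csv
import io
from collections import Counter

# Column headings taken from the list above.
COLUMNS = [
    "Date Sent to ICO",
    "Date Response Received",
    "ICO case reference number",
    "Who we have reported",
    "Outcome",
]


def outcome_counts(csv_text):
    """Count rows per outcome; blank outcomes are treated as still open."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return Counter(row["Outcome"] or "Awaiting outcome" for row in reader)


# Invented placeholder rows, purely to show the expected shape of the export.
SAMPLE = (
    "Date Sent to ICO,Date Response Received,ICO case reference number,"
    "Who we have reported,Outcome\n"
    "2022-01-10,2022-03-01,REF-EXAMPLE-1,Example Authority A,Decision notice issued\n"
    "2022-02-14,,REF-EXAMPLE-2,Example Authority B,\n"
)

print(outcome_counts(SAMPLE))
```

Checking the headings up front means a renamed column shows up as an error rather than as silently wrong numbers in the report.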

I've added this to my next sprint. You can then give feedback on it, and logging can start in the new year to make sure we have a good set of data for 2022.

Sally

sallytay commented 2 years ago

Update:

I've made two very basic spreadsheets to help keep a log of ICO referrals by us and police requests for information. https://drive.google.com/drive/folders/1_lrkmO_kRVCh2quNDy0DMrm7UOQpUSEw

I don't think they need to be any more than this at the moment but any suggestions welcomed.

I'm happy to take responsibility for logging cases to relieve the admin burden, but obviously anyone can add to them.

Next step is to look at the inbox labels and then to work through the other suggestions on this ticket.

Sally

sallytay commented 2 years ago

I'm in the process of breaking these down into separate tickets for different types of tasks. All suggestions from this ticket will be added to the new tickets.

Data Collection Improvements for Transparency Report 2022 - Support Inbox Labelling mysociety/whatdotheyknow-theme#972 https://github.com/mysociety/whatdotheyknow-theme/issues/972

There will also be:

* GDPR Spreadsheet Improvements
* System data collection tickets

Sally

sallytay commented 2 years ago

GDPR Improvements transferred to ticket https://github.com/mysociety/whatdotheyknow-theme/issues/974

sallytay commented 2 years ago

System data collection now transferred to a new ticket https://github.com/mysociety/whatdotheyknow-theme/issues/975

mdeuk commented 1 year ago

Related:

HelenWDTK commented 1 year ago

Closing this, as a lot of this has been implemented and tracker issues are logged elsewhere. Specific issues relating to the 2023 report can be noted on #1536