mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org

Detect and handle Outlook / Exchange "recall" messages automatically #6884

Open mdeuk opened 2 years ago

mdeuk commented 2 years ago

An occasional nuisance on WhatDoTheyKnow, but one with potentially serious ramifications, is the fact that Alaveteli cannot handle "Recall" messages generated by Microsoft Outlook. This isn't a fault from an Alaveteli perspective, as Microsoft's documentation clearly states:

"Message recall is available after you click Send and is available only if both you and the recipient have a Microsoft 365 or Microsoft Exchange email account in the same organization."

The messages generated by recent versions contain a number of distinctive features, which could perhaps allow us to identify them automatically:

(this is based on interpretation of a random selection of WDTK imsgs: 416116, 1516345, 1997880 - it's possible there are localisations of these messages, so that might be worth checking)

Why does this matter?

There is the potential that a "Recall" is being issued as the first stage of trying to contain a data breach - therefore some Information Rights officers seem rather incredulous when advised why their recall didn't work.

The combination of Outlook and Microsoft Exchange is one of the most popular in the public sector - therefore it's reasonable to conclude that a good proportion of public bodies responding to requests are likely to be using it. Unfortunately, it's not always well known that the 'recall' function has limitations - it doesn't always work reliably even within an organisation.

Why is it our problem?

Technically, it could be said it isn't our problem - but, evidence from WDTK does show that we get a good proportion of these messages (3,000+ based on a crude search).

Many of them will be for relatively minor things; others, however, may well be because an organisation has had a data breach. These are the ones which typically come to administrators, although sometimes they go unnoticed for some time because the public body has assumed the recall 'worked' and has not followed up on it.

Proposal:

We can already bounce messages when certain criteria are met (e.g. requests closed to new correspondence, spam etc), or direct messages to a holding pen. I would suggest that we consider, for risk mitigation, one of the following:

  1. Detect messages and respond with a predefined template indicating that recalling a message doesn't work (perhaps using the form of words from https://github.com/mysociety/whatdotheyknow-theme/issues/1059). Invite the organisation to contact the configured user-support mailbox, or provide a link to a webform
  2. Variation of 1: Detect and bounce the message, perhaps with the above
  3. Detect and direct the message to a holding pen and generate an alert in the user-support mailbox
  4. Generate an alert to the user-support mailbox

The latter options feel like moving closer to 'proactive moderation' - generally the opposite of how we operate, as acting on this would require an administrator to review content and make a conscious decision based only on the information already available in a thread.

I'd therefore be inclined to look at either of the first two options - which reduce the risk to us, as they are automated, and put the onus on the public body getting in contact. We could simplify this workflow by providing a structured webform, which asks for the key information generally required to start looking at a problematic message.
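To make the detection step for options 1 and 2 concrete, here is a minimal sketch in Python (chosen for illustration only; Alaveteli itself is a Ruby on Rails application). It assumes, without confirmation from the sample messages, that an English-language recall has a subject starting with "Recall:" and a body containing "would like to recall the message". The function names are hypothetical, and localised variants are not covered.

```python
import re
from email import message_from_string
from email.message import Message

# Assumed markers for English-language Outlook recall notices.
# Localised variants would need their own patterns.
RECALL_SUBJECT = re.compile(r"^\s*Recall:", re.IGNORECASE)
RECALL_BODY = re.compile(r"would like to recall the message", re.IGNORECASE)


def body_text(msg: Message) -> str:
    """Return the concatenated text/plain parts of a message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts)


def looks_like_recall(msg: Message) -> bool:
    """True only when both the subject and the body match the assumed markers."""
    subject = str(msg.get("Subject", ""))
    return bool(RECALL_SUBJECT.search(subject) and RECALL_BODY.search(body_text(msg)))
```

Requiring both markers is a deliberate choice to reduce false positives, such as an ordinary reply that merely mentions a recall. A match would then trigger the templated response (option 1) or the bounce (option 2).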

What will this achieve?

Implementing this would allow for de-risking of a known 'pain point', and allow us to demonstrate that we've implemented good technical safeguards by clearly drawing attention to the fact that this doesn't work.

An alternative could be to merely implement a note in the respective help pages on an Alaveteli instance - but whilst this is useful as a general primer of how we handle things, it doesn't handle the immediate message itself - leaving a potential risk of data which has been inappropriately disclosed remaining on-site for longer than it ought to.

mdeuk commented 2 years ago

I've added the label 'data-protection-risk-reduction' as it seems to fit with the theme here.

RichardTaylor commented 2 years ago

See also: Consider practicality of generating alerts when Excel files containing hidden data are released #2663

The feature proposals are similar in that we are seeking to contain data breaches. In both cases there is a question of who any alert should go to - the public body or the site admin team or both.

===

We often look to see how we could deal with something using the existing site code.

On WhatDoTheyKnow the admin team has an alert set up for terms like "Delivery failed". If we wanted to we could easily add "would like to recall the message" to the list of terms we get alerts on. This would create an additional admin job though if we were to then review correspondence threads for data breaches.
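As an illustration of the idea only, a term-based alert could be as simple as the Python sketch below. How the existing WhatDoTheyKnow alerts are actually implemented is not described here, so the list name and function are hypothetical.

```python
# Hypothetical list of alert terms; only these two phrases come from the thread.
ALERT_TERMS = ["Delivery failed", "would like to recall the message"]


def matching_alert_terms(text, terms=ALERT_TERMS):
    """Return the alert terms that appear in the text, ignoring case."""
    lowered = text.casefold()
    return [term for term in terms if term.casefold() in lowered]
```

Any non-empty result would generate an alert for the admin team, which is where the extra review workload comes from.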

===

An alternative approach might be to set up a Twitter account, following the model of deletedbyMPs, to draw attention to messages on Alaveteli which public bodies are seeking to recall. Any user could already set up an alert for such messages. This might well help identify some newsworthy cases of accidentally sent messages; however, it may on balance be irresponsible if it has the effect of drawing more attention to data breaches.

===

Overall I have no problem with proposal one, above: Detect and advise.

I'm also quite sympathetic to the "not our problem" point of view here. It's down to Outlook to advise its users when its features do and don't work.

At WhatDoTheyKnow one of our core principles is seeking to run the service responsibly. Alerting Outlook users to failed message recall attempts would I think on balance be a highly responsible thing to do, but I think it comes into the category of going far above and beyond what it would be reasonable to expect of us. We do go above and beyond in many areas though.

garethrees commented 2 years ago

Just linking this to https://github.com/mysociety/alaveteli/issues/2045 to remind us to consider whether there could/should be any shared technical implementation.

FOIMonkey commented 2 years ago

I've spoken to journalists who search for "would like to recall" on WDTK to look for embarrassing/newsworthy things that public authorities didn't mean to make public.

mdeuk commented 2 years ago

> I've spoken to journalists who search for "would like to recall" on WDTK to look for embarrassing/newsworthy things that public authorities didn't mean to make public.

From a transparency perspective, it's definitely a reason I wouldn't wish for it to work in a silent "delete this" manner!

My thinking was that options 1 or 2 would allow for genuine cases to be brought to our attention more swiftly - purely from a data protection perspective. Getting to the stage where the body gets told automatically "what you've tried won't work - here's what you actually should do" feels like a logical step - but, that's likely easier said than done…

WilliamWDTK commented 2 years ago

Am I correct in thinking that the difference between 1 and 2 is that 1 will leave the recall message visible on the request page, but that 2 would prevent it from arriving there?

mdeuk commented 2 years ago

> Am I correct in thinking that the difference between 1 and 2 is that 1 will leave the recall message visible on the request page, but that 2 would prevent it from arriving there?

More or less. 2 might be slightly easier to implement - but I think there is a transparency benefit to still publishing the actual recall request itself, so if I were to choose, option 1 would be my preference.

WilliamWDTK commented 2 years ago

> > Am I correct in thinking that the difference between 1 and 2 is that 1 will leave the recall message visible on the request page, but that 2 would prevent it from arriving there?
>
> More or less. 2 might be slightly easier to implement - but I think there is a transparency benefit to still publishing the actual recall request itself, so if I were to choose, option 1 would be my preference.

Indeed, there is a transparency benefit to knowing how quickly authorities react to data breaches that they make etc.

garethrees commented 2 years ago

I think the only option we should consider here is option 1 – "Hey, looks like you sent us a recall message but we explicitly don't handle that; we publish everything that gets sent to our email address, including this recall request. If that's a problem in this case, [please contact us](). See our [help pages]() for more about WDTK".
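A rough sketch of how that reply could be templated, in Python for illustration only. The wording is adapted from the draft above; the site name, contact URL and help URL are placeholder parameters, and `build_recall_reply` is a hypothetical helper rather than anything in Alaveteli.

```python
from string import Template

# Wording adapted from the draft reply above; the URLs are supplied by the caller.
RECALL_REPLY = Template(
    "It looks like you sent us a recall message. $site_name does not act on "
    "recall requests: we publish everything that gets sent to this email "
    "address, including this recall request.\n\n"
    "If that is a problem in this case, please contact us: $contact_url\n"
    "See our help pages for more about $site_name: $help_url\n"
)


def build_recall_reply(site_name, contact_url, help_url):
    """Fill in the reply template; raises KeyError if a placeholder is missing."""
    return RECALL_REPLY.substitute(
        site_name=site_name, contact_url=contact_url, help_url=help_url
    )
```

Keeping the text in a single template would let each Alaveteli instance or theme swap in its own wording and links.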

I really hate this recall thing in general on the principle that no one other than me should be able to destroy data on my machine.

mdeuk commented 2 years ago

> I think the only option we should consider here is option 1

+1, I think this is the best approach.

I hated these messages when I worked with Exchange - it's a feature that is often misunderstood, but it does seem like we could be handling this more effectively. It doesn't mean we'll necessarily accede to the demand - but, it could allow us to quickly contain data breaches, which is very important.