freedomofpress / securedrop

GitHub repository for the SecureDrop whistleblower platform. Do not submit tips here!
https://securedrop.org/

Tool to help journalists analyze and sanitize metadata #543

Closed runasand closed 7 years ago

runasand commented 10 years ago

As @garrettr said in #519 where we decided to remove MAT from SecureDrop: "I think we're going to start building a tool to help journalists analyze and sanitize metadata after 0.3 is released."

diracdeltas commented 9 years ago

MAT is in Tails. Should we just add instructions for journalists to use it? (I haven't tried using it)

psivesely commented 7 years ago

This is something that would still be good to add instructions for. Maybe someone will do it at this hackathon (2016 Aaron Swartz Day).

I'm adding the Reading Room label to this one because I believe this is something that could be automatically done by the reading room client. A submission is downloaded, then in a DispVM it is authenticated, decrypted, decompressed, and wiped of metadata in that order.
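To make the ordering above concrete, here is a minimal sketch of a fail-fast pipeline. Everything in it is an assumption for illustration: the stage functions, the HMAC-based `authenticate`, and `DEMO_KEY` are hypothetical stand-ins, not how SecureDrop or the reading room client actually authenticates or decrypts submissions. The point is only that each step runs on the output of the previous one, and that a failure in an early step stops the later ones from ever touching the data.

```python
import gzip
import hashlib
import hmac
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[bytes], bytes]]

# Hypothetical key, for the demo only.
DEMO_KEY = b"demo-key-not-a-real-secret"


def authenticate(blob: bytes) -> bytes:
    """Stand-in check: the first 32 bytes are an HMAC-SHA256 tag over the rest."""
    tag, body = blob[:32], blob[32:]
    expected = hmac.new(DEMO_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication tag mismatch")
    return body


def decrypt(blob: bytes) -> bytes:
    """Placeholder: a real client would decrypt here; this returns the input unchanged."""
    return blob


def decompress(blob: bytes) -> bytes:
    """Undo gzip compression."""
    return gzip.decompress(blob)


def wipe_metadata(blob: bytes) -> bytes:
    """Placeholder: format-specific metadata scrubbing would go here."""
    return blob


# The order from the comment above: authenticate, decrypt, decompress, wipe.
STAGES: List[Stage] = [
    ("authenticate", authenticate),
    ("decrypt", decrypt),
    ("decompress", decompress),
    ("wipe_metadata", wipe_metadata),
]


def run_pipeline(blob: bytes, stages: List[Stage]) -> bytes:
    """Apply each stage in order; the first failing stage aborts the run."""
    for name, stage in stages:
        try:
            blob = stage(blob)
        except Exception as exc:
            raise RuntimeError(f"stage {name!r} failed") from exc
    return blob
```

Authenticating first means unauthenticated bytes never reach the decompressor or any metadata parser, which is presumably why the steps are listed in that order.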

psivesely commented 7 years ago

See #497. Metadata may be useful for a journalist trying to verify the authenticity of documents. Therefore, automatic stripping of metadata may not be appropriate. Further, it should be noted MAT is not a perfect solution that wipes all metadata. That said, the journalist should use MAT on any appropriate documents if they intend to take them off the airgap to be published.

Closing, because this was implemented at some point (I can't tell when because of the migration to .rst from .md).

redshiftzero commented 7 years ago

My understanding was that this issue was to create a tool (like MAT, but better!) for journalists to anonymize their documents? Something not to be done automatically, but to have installed in Tails along with some written documentation on how to use it effectively to keep sources safe.

psivesely commented 7 years ago

Okay, I misunderstood. Going to reopen. I also have some more thoughts on the matter.

The best tool I can think of for this would be to take the Qubes PDF converter idea and extend it to all photo and document types. Though its design intention is to take a possibly malicious document and produce a trusted one with the same contents, I believe it would also do an excellent job of removing metadata. ImageMagick may add certain metadata when it reconstitutes the RGB bitmaps into the respective formats; however, this should be more predictable, less important, and easier to scrub. (E.g., ImageMagick might include the time of reconstitution, which is not nearly as bad as leaking the actual document creation time, but should still be removed.)
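As a rough illustration of the "easier to scrub" step, here is a minimal sketch that removes leftover metadata from a re-encoded PNG. It is not the Qubes converter and does not call ImageMagick; it assumes the converted output is a PNG and that dropping every chunk except the ones needed to render pixels is acceptable (which also discards colour-profile chunks such as gAMA or iCCP). The function names and the `KEEP` set are my own choices for this example. In the PNG format, timestamps live in `tIME` chunks and free-form text in `tEXt`, `zTXt`, and `iTXt` chunks, so filtering by chunk type removes them.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Chunks kept because they are needed to decode the pixels (tRNS carries
# transparency). Everything else, including tIME and tEXt, is dropped.
KEEP = {b"IHDR", b"PLTE", b"tRNS", b"IDAT", b"IEND"}


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk, checking the signature and CRCs."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        header = png[pos:pos + 8]
        if len(header) != 8:
            raise ValueError("truncated chunk header")
        (length,) = struct.unpack(">I", header[:4])
        chunk_type = header[4:]
        data = png[pos + 8:pos + 8 + length]
        crc_bytes = png[pos + 8 + length:pos + 12 + length]
        if len(data) != length or len(crc_bytes) != 4:
            raise ValueError("truncated chunk body")
        (crc,) = struct.unpack(">I", crc_bytes)
        if crc != zlib.crc32(chunk_type + data) & 0xFFFFFFFF:
            raise ValueError("bad chunk CRC")
        yield chunk_type, data
        pos += 12 + length
        if chunk_type == b"IEND":
            break  # ignore any trailing bytes after the final chunk


def strip_png_metadata(png: bytes) -> bytes:
    """Return a copy of the PNG containing only the chunks listed in KEEP."""
    out = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png):
        if chunk_type in KEEP:
            out.append(make_chunk(chunk_type, data))
    return b"".join(out)
```

An allow-list is used rather than a block-list of known metadata chunks so that unfamiliar or private chunk types are removed by default. Stopping at `IEND` also discards any data appended after the end of the image.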

I've just started diving into the design of the reading room (RR) today. Here's the workflow I'm imagining for how a journalist removes documents for publication:

I think it's best we stop putting additional burdens on journalists and adding to our now ~200 pages of documentation. We need to automate as much as possible, and rely as little as possible on journalists to practice good opsec.

redshiftzero commented 7 years ago

Great workflow for exporting documents @fowlslegs. Also it looks like the developer is not currently maintaining MAT and is recommending not to use it:

[Screenshot: 2016-11-08 at 10:44:07 am]

redshiftzero commented 7 years ago

FYI, it turns out:

1. qvm-convert-pdf also converts images (to PDFs) using DispVMs, though the "convert to trusted PDF" option does not appear unless you add the .PDF suffix to the file.
2. There is already a variant of this for images, qvm-convert-img (not installed by default, but I tried it out and it works great), which you can install in Qubes to go directly from e.g. PNG to trusted PNG using the same approach of opening the file in a DispVM.

redshiftzero commented 7 years ago

Just tried to redact a PDF using MAT on Tails 3, and PDFs are no longer a supported file type due to this bug found last year (and it looks like support has been disabled for a while). However, if someone fixed this bug, PDFs would likely become supported again...

redshiftzero commented 7 years ago

We're going to use a Qubes-based strategy to help journalists strip metadata from SecureDrop submissions. Followup: https://github.com/freedomofpress/securedrop-workstation/issues/26