IQSS / dataverse

Open source research data repository software
http://dataverse.org

As a Researcher, I want to view differentially private metadata so that I can learn enough to determine whether or not it’s worth my time to go through the IRB process for the raw data. #4234

Closed djbrooke closed 4 years ago

djbrooke commented 7 years ago

Logged in users can use Two Ravens to explore differentially private versions of the metadata for sensitive files.

mheppler commented 7 years ago

NOTE: Backend prerequisite, issue #4230 and frontend prerequisite, issue #4233

Here is an outline of the proposed UI changes.

UI IMPACT

Dataset pg/File pg

QUESTIONS

MOCKUPS

psi-explore-terms

psi-privacy-select

psi-explore-info

dlmurphy commented 7 years ago

For reference when we get to this issue:

In #4233, I recommended the following text change for the "Privacy-Preserving Data Preview" modal seen when a depositor clicks to go to the PSI tool to configure PSI for a file:

Privacy-Preserving Data Preview

Use the PSI Budgeter tool to create safe, privacy-preserving summary statistics for this data file. The tool protects data using the differential privacy framework. It allows you to introduce just enough noise into your summary statistics to ensure privacy while still allowing a useful (if blurry) window into the contents of your data. Dataverse users will be able to explore a preview of your data without any danger of exposing private information.

At the time we weren't able to implement this change because by that point we'd removed the psi.json file from the Dataverse source code. When the time comes, I'd like this text change implemented.

TaniaSchlatter commented 6 years ago

Thanks for these edits on the text, @dlmurphy. The language needs to be free of any implied promises. Can you do another round, taking out words that imply any sort of guarantee?

pdurbin commented 6 years ago

@dlmurphy also, are hyperlinks supported? @matthew-a-dunlap would know better than I would.

dlmurphy commented 6 years ago

Good call, @TaniaSchlatter. I've edited the message to be less of a guarantee of success.

@pdurbin, I'm not sure whether we can include a hyperlink in this modal. If so, I think it's nice to have, but if we can't include the hyperlink then I'd be comfortable taking it out (it's not essential).

Privacy-Preserving Data Preview

Use the PSI Budgeter tool to create privacy-preserving summary statistics for this data file. The tool helps you protect data using the differential privacy framework. It can be used to introduce noise into your summary statistics to help ensure privacy while still allowing a useful (if blurry) window into the contents of your data. Dataverse users will be able to explore a preview of your data that is calibrated to minimize the risk of exposing private information.

matthew-a-dunlap commented 6 years ago

@dlmurphy @pdurbin I'm pretty sure the hyperlink will work as long as we provide the correct HTML markup. I'll try it myself in my local env at some point soon.

matthew-a-dunlap commented 6 years ago

@dlmurphy Confirmed

[Screenshot: 2017-11-28 at 5:33:35 pm]

Here is the description section of psi.json (GitHub won't let me upload .json files):

"description": "Use the PSI Budgeter tool to create privacy-preserving summary statistics for this data file. The tool helps you protect data using the <a href=\"https://privacytools.seas.harvard.edu/publications/differential-privacy-primer-non-technical-audience-preliminary-version\">differential privacy</a> framework. It can be used to introduce noise into your summary statistics to help ensure privacy while still allowing a useful (if blurry) window into the contents of your data. Dataverse users will be able to explore a preview of your data that is calibrated to minimize the risk of exposing private information.",

dlmurphy commented 6 years ago

@matthew-a-dunlap Looking great, thanks!

dlmurphy commented 6 years ago

Just discussed with @mheppler and we realized there are a few pretty compelling reasons not to include that "differential privacy" link to the Differential Privacy Primer paper in this message modal. Best option for us here is to take out that link, and instead link to our PSI page in the user guide (which will include a link to that paper). We also need to make sure that this link opens in a new tab.

So the message should actually be:

Privacy-Preserving Data Preview

Use the PSI Budgeter tool to create privacy-preserving summary statistics for this data file. The tool helps you protect data using the differential privacy framework. It can be used to introduce noise into your summary statistics to help ensure privacy while still allowing a useful (if blurry) window into the contents of your data. Dataverse users will be able to explore a preview of your data that is calibrated to minimize the risk of exposing private information. For more information, see our Privacy Management user guide.

This can definitely wait until this issue is prioritized, just leaving it here for the future.
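
For whoever picks this up, here is a minimal sketch of how the revised description could be generated for psi.json with the link opening in a new tab. The user guide URL is a hypothetical placeholder (the thread does not give the real one), the choice of "Privacy Management user guide" as the link text is an assumption, and `rel="noopener"` is an addition that was not discussed above. Using `json.dumps` avoids hand-escaping the quotes inside the HTML.

```python
import json

# Hypothetical placeholder: the real user guide URL is not given in this thread.
USER_GUIDE_URL = "https://example.org/user-guide/privacy-management"


def build_description(guide_url: str) -> str:
    """Return the modal text, ending with a link that opens in a new tab."""
    # target="_blank" opens the link in a new tab; rel="noopener" is an
    # assumed extra, commonly paired with target="_blank".
    link = (
        f'<a href="{guide_url}" target="_blank" rel="noopener">'
        "Privacy Management user guide</a>"
    )
    return (
        "Use the PSI Budgeter tool to create privacy-preserving summary "
        "statistics for this data file. The tool helps you protect data "
        "using the differential privacy framework. It can be used to "
        "introduce noise into your summary statistics to help ensure "
        "privacy while still allowing a useful (if blurry) window into the "
        "contents of your data. Dataverse users will be able to explore a "
        "preview of your data that is calibrated to minimize the risk of "
        "exposing private information. "
        f"For more information, see our {link}."
    )


# json.dumps escapes the double quotes inside the HTML for us.
fragment = json.dumps({"description": build_description(USER_GUIDE_URL)}, indent=2)
print(fragment)
```

The printed fragment only covers the "description" key; any other keys in psi.json are left out here because they are not shown in this thread.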

djbrooke commented 4 years ago

Closing in favor of #7400