MozillaFoundation / mozfest-program-2017

Mozilla Festival proposals for 2017
https://mozillafestival.org

Data De-Identification #177

Closed mozfest-bot closed 7 years ago

mozfest-bot commented 7 years ago

[ UUID ] d77977c1-9233-407c-8ea4-d7fa93272946

[ Session Name ] Data De-Identification

[ Primary Space ] Privacy and Security

[ Secondary Space ] Open Innovation

[ Submitter's Name ] Jessica Gallinger

[ Submitter's Affiliated Organisation ] Simon Fraser University

[ Submitter's Github ] gallingerj

What will happen in your session?

I will deliver a presentation on de-identification to permit sharing data for secondary uses. Attendees will participate in a hands-on de-identification exercise. Next steps will be outlined, including highlighting educational resources that may be useful for data communities and showing template data sharing agreements.

What is the goal or outcome of your session?

Introduce the practice of de-identification and showcase existing public resources to support folks wanting to share data that would otherwise include indirect identifiers that could be used to identify individuals in a dataset.

If your session requires additional materials or electronic equipment, please outline your needs.

Important note about travel: On principle, I attend international conferences remotely because "one transatlantic flight can add as much to your carbon footprint as a typical year's worth of driving" (https://www.theguardian.com/environment/blog/2010/sep/09/carbon-emissions-planes-shipping). Climate change is an existential threat, and avoiding short-term transatlantic flights is consistent with my values.

Time needed

90 mins

bunnybooboo commented 7 years ago

This does sound interesting. Would you be willing to share more detail about the de-identification exercise?

gallingerj commented 7 years ago

I've delivered V1 of the presentation locally but I haven't built the exercise to accompany it yet. I expect to generate/re-purpose a spreadsheet data file with indirect identifiers. Participants will determine the raw data equivalence class size and they will aggregate records to produce de-identified data matching a k-anonymity value.

Equivalence class: All the records that have the same values on the quasi-identifiers.

Equivalence class size: The number of records in an equivalence class. (These can change during de-identification.)

k-anonymity: The most common criterion to protect against re-identification. The size of each equivalence class in the dataset must be at least k.
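As a rough illustration of these definitions, here is a minimal Python sketch. The records, the field names, and the generalization rules (10-year age bands, 3-character postal prefixes) are all hypothetical and are not taken from the planned exercise; they only show how equivalence class sizes can be counted and how aggregation can raise the smallest class size to a chosen k.

```python
from collections import Counter

# Hypothetical toy records (not from the planned exercise). "age" and
# "postal_code" play the role of quasi-identifiers; "diagnosis" is the
# value being shared.
RECORDS = [
    {"age": 23, "postal_code": "V5A 1S6", "diagnosis": "asthma"},
    {"age": 27, "postal_code": "V5A 2B1", "diagnosis": "diabetes"},
    {"age": 25, "postal_code": "V5A 4H3", "diagnosis": "asthma"},
    {"age": 41, "postal_code": "V6B 1A1", "diagnosis": "migraine"},
    {"age": 45, "postal_code": "V6B 3K9", "diagnosis": "asthma"},
    {"age": 48, "postal_code": "V6B 5C2", "diagnosis": "diabetes"},
]

QUASI_IDENTIFIERS = ("age", "postal_code")


def equivalence_class_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count the records that share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return True if every equivalence class holds at least k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(size >= k for size in sizes.values())


def generalize(record):
    """Assumed aggregation rule: 10-year age bands, 3-character postal prefixes."""
    low = record["age"] // 10 * 10
    return {
        **record,
        "age": f"{low}-{low + 9}",
        "postal_code": record["postal_code"][:3],
    }


generalized = [generalize(r) for r in RECORDS]

# Smallest equivalence class before and after aggregation.
print(min(equivalence_class_sizes(RECORDS).values()))      # 1: every raw record is unique
print(min(equivalence_class_sizes(generalized).values()))  # 3
print(is_k_anonymous(RECORDS, k=3), is_k_anonymous(generalized, k=3))  # False True
```

In a spreadsheet, the same steps would roughly correspond to adding banded columns for each indirect identifier and then counting rows per combination, for example with a pivot table.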

El Emam, K., & Arbuckle, L. (2013). Anonymizing Health Data: Case Studies and Methods to Get You Started. O’Reilly Media.

bunnybooboo commented 7 years ago

Thanks for the additional information @gallingerj

Are you able to clarify how you see the session being run remotely? The way you describe it, right now it feels appropriate in the Shed track:

for hands-on making, hacking and prototyping. Sessions in the Shed require attendees to create and build code, objects or crafts

To better understand our logistics needs, and indeed whether it is possible to facilitate at all: if you were remote, what specifically would you need from us locally? E.g. 2 way communications over Vidyo, a local facilitator to run around with a microphone, screen sharing, etc.

gallingerj commented 7 years ago

I plan to use 2/3 of the allotted time (60/90 min) for a presentation so that folks are able to complete the subsequent de-identification exercise. I include hands-on exercises in my teaching to improve learning outcomes. It could belong in the Shed track if that's what you have in mind for lessons in that category.

2 way communications over Vidyo, a local facilitator + microphone, and screen sharing all sound excellent. I know it's hard to accommodate remote instruction, but I think it's essential for forward-thinking groups like the Mozilla Foundation to pioneer this approach. Thank you.

bunnybooboo commented 7 years ago

I'm sorry to have to inform you that your proposal did not make it into our draft P&S space schedule. Unlocking for consideration by other teams.

mozfest-bot commented 7 years ago

Thank you for taking the time to submit a session to MozFest. Due to the high level of submissions, we’re unable to accept all proposals and unfortunately, your session was not part of the final group.

Thank you for taking the time to submit and we will follow up on email very soon.