Benjamin-Lee / deep-rules

Ten Quick Tips for Deep Learning in Biology
https://benjamin-lee.github.io/deep-rules/

Don't share model weights trained on private data. #4

Closed: Benjamin-Lee closed this issue 5 years ago

Benjamin-Lee commented 5 years ago

Adversarial machine learning (not to be confused with generative adversarial networks) is a growing threat. One particular issue is model inversion, which lets an attacker reconstruct sensitive training data, with or without direct access to the model's weights.

To be safe, unless you know exactly what you are doing, it's best not to share weights that have been trained on sensitive data.
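To illustrate the white-box variant, here is a minimal model inversion sketch in PyTorch (illustrative only: `model` and `input_shape` are placeholders for a real classifier and its input dimensions, and the hyperparameters are arbitrary). Gradient ascent on the input recovers an example the model treats as prototypical of a target class, which can leak private training data:

```python
# Minimal white-box model inversion sketch (in the spirit of Fredrikson
# et al., 2015): with the weights in hand, optimize the *input* so the
# model assigns it maximal score for a target class.
import torch

def invert_class(model, target_class, input_shape, steps=500, lr=0.1):
    model.eval()
    x = torch.zeros(1, *input_shape, requires_grad=True)  # start from a blank input
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = -model(x)[0, target_class]  # maximize the target-class logit
        loss.backward()
        optimizer.step()
    return x.detach()  # an input the model considers a class prototype

# e.g. reconstruction = invert_class(model, target_class=0, input_shape=(20,))
```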

cgreene commented 5 years ago

Absolutely worth talking about, particularly in this domain. We can also discuss some techniques, like adding differential privacy during training, that might help mitigate some risks.
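For instance, differential privacy during training (DP-SGD: per-example gradient clipping plus calibrated Gaussian noise) is available off the shelf in Opacus for PyTorch. A minimal sketch, assuming Opacus 1.x, with a toy model and random data standing in for a real private dataset:

```python
# Sketch: wrapping an ordinary PyTorch setup with Opacus to train with
# differential privacy (per-example gradient clipping + Gaussian noise).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy classifier and random data stand in for a real model and private dataset.
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
dataset = TensorDataset(torch.randn(512, 20), torch.randint(0, 2, (512,)))
train_loader = DataLoader(dataset, batch_size=64)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.1,  # more noise = stronger privacy, lower utility
    max_grad_norm=1.0,     # per-example gradient clipping bound
)

criterion = nn.CrossEntropyLoss()
for x, y in train_loader:  # one epoch; gradients are clipped and noised each step
    optimizer.zero_grad()
    criterion(model(x), y).backward()
    optimizer.step()
```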

Benjamin-Lee commented 5 years ago

@zhampel, @MaAleBarr, @ninalopatina, @ltindall, @mlomnitz, @paulgowdy and the rest of the Cyphercat crew, I'd love to hear your feedback on this rule.

evancofer commented 5 years ago

@brettbj may have some good feedback here as well

This rule is clearly important (hence its inclusion via #61). I think it may ease adoption if we also add some actionable advice (e.g., commonly used software libraries, industry standards) in addition to a general precaution.

AlexanderTitus commented 5 years ago

I think model security should highlight at a minimum:

These are both big areas in security, and both are gaining traction in the deep learning space. The challenge with obscuring information, however, is the trade-off it introduces between training time and privacy/security. These models will keep getting better and faster, and this is definitely an important section.
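To make that trade-off concrete, here is a rough sketch using Opacus's RDP accountant (API assumed from Opacus 1.x; the batch and dataset sizes are made up) showing how the privacy budget epsilon tightens as the noise multiplier grows, at the cost of noisier, slower training:

```python
# Sketch: privacy budget (epsilon) after ~20 epochs for several noise levels.
# Assumes Opacus 1.x; all numbers are illustrative, not recommendations.
from opacus.accountants import RDPAccountant

sample_rate = 256 / 50_000    # batch size / dataset size
steps = (50_000 // 256) * 20  # roughly 20 epochs of DP-SGD

for noise_multiplier in (0.5, 1.0, 2.0):
    accountant = RDPAccountant()
    for _ in range(steps):
        accountant.step(noise_multiplier=noise_multiplier,
                        sample_rate=sample_rate)
    eps = accountant.get_epsilon(delta=1e-5)
    print(f"noise_multiplier={noise_multiplier}: epsilon~{eps:.2f}")
```

Lower epsilon means stronger privacy, but the added noise generally costs accuracy and lengthens training.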

rasbt commented 5 years ago

Another relevant one (although not bio-specific):

fmaguire commented 5 years ago

Seems to be covered in tip 10 now: https://github.com/Benjamin-Lee/deep-rules/blob/master/content/12.privacy.md