tamiratGit / FedELM


5. Differential privacy #5

Open akusok opened 10 months ago

akusok commented 10 months ago

This is optional, if we have time. A great use case for AWS ECR.

So, imagine the data arrives over time in small batches, like 1 patient at a time. We want to share this data but also prevent bad actors from reverse-engineering the values of X. A single record sent on its own is much easier to reverse-engineer.

Approach: generate random fake records, and send a combination of one true record plus a few fake ones. This is harder to reverse-engineer, and even if someone manages it, nobody knows which record is true and which are fake.
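
A minimal sketch of this mixing idea, under stated assumptions: the function name `mix_with_fakes`, its parameters, and the choice to draw fake values from a normal distribution are all hypothetical and not taken from the FedELM code. It only illustrates "one true record plus a few fake ones, in random order"; which fake-record strategy works best is exactly what the comparison is meant to find out.

```python
import random


def mix_with_fakes(true_record, n_fake, rng, mean=0.0, std=1.0):
    """Return a shuffled batch holding one true record and n_fake fake ones.

    Assumption: fake values are drawn i.i.d. from a normal distribution.
    Other strategies (e.g. matching the real feature statistics) would
    replace only the line that builds `fakes`.
    """
    # Each fake record has the same number of features as the true one.
    fakes = [[rng.gauss(mean, std) for _ in true_record] for _ in range(n_fake)]
    batch = [list(true_record)] + fakes
    # Shuffle so the position in the batch does not reveal the true record.
    rng.shuffle(batch)
    return batch


rng = random.Random(0)  # fixed seed only to make the example repeatable
batch = mix_with_fakes([0.2, -1.3, 0.7], n_fake=4, rng=rng)
```

Here `batch` contains 5 records of 3 features each, and the receiver cannot tell from the order which one is real. The fake records will also affect the trained model, so `n_fake`, `mean`, and `std` are knobs to vary when plotting model performance vs. data size.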

We will test several strategies and compare them on the same plots of model performance vs. data size.

Steps: