Federated learning must hide the original data while still enabling model training: there should be no practical way to reverse-engineer the original data values. We will try several approaches and measure the effect each has on model performance.
The first approach is to add noise to H. Because the activation is non-linear, inverting H back to X already introduces large errors in the recovered values of X. With added noise, we hope to make those errors so large that the reconstruction is essentially useless, while keeping the noise small enough to preserve model performance.
Steps:
[ ] (@tamiratGit) Repeat step #2, but add random noise to H. The noise consists of Gaussian (normally distributed) random numbers multiplied by a small scaling parameter: 1E-6, 1E-5, 1E-4, etc.
[ ] (@tamiratGit) To the graph of model performance vs. data size from #2, add extra lines of model performance vs. size for each noise scale. Model performance should be nearly unchanged with a small noise scale, and completely destroyed with a very large one.
[ ] Find the largest noise scale that has little effect on model performance. We will use this value in the remaining experiments.
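The noise-injection step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `add_noise` is hypothetical, and it assumes H is a NumPy array holding the hidden representation.

```python
import numpy as np

def add_noise(H, scale, rng=None):
    """Return H plus Gaussian noise scaled by `scale` (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    return H + scale * rng.standard_normal(H.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = rng.standard_normal((4, 3))  # stand-in hidden representation
    # Sweep the candidate scales from the checklist; each noisy H would be
    # fed to the model to measure performance at that noise level.
    for scale in [1e-6, 1e-5, 1e-4]:
        H_noisy = add_noise(H, scale, rng)
        print(scale, np.abs(H_noisy - H).max())
```

The perturbation magnitude grows with the scaling parameter, which is what the performance-vs-size curves at different scales are meant to expose.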