akusok opened 10 months ago
Idea: find out how good or bad a model each client can create if they don't share data. This will be the baseline before we look into the Federated ELM. It should help us explain why a federated ELM helps clients with little data, and actually measure that benefit in terms of accuracy improvement.
Separately for each "client", test:

For every number of training data points (10, 20, 30, 40, …), build 15 ELMs, average their accuracy for every combination of (L2, neurons), then keep the parameters with the best average accuracy.
```python
accuracy = {}
for l in (10, 20, 30, 45, 70, 100):
    for L2 in (1e-2, 1e-3, 1e-4):
        accuracy[(l, L2)] = []
        for run in range(15):
            ...  # accumulate accuracy here (rest shown in code notebook)
```
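A minimal self-contained sketch of the grid search above, assuming synthetic two-class data and a plain numpy ELM (random tanh hidden layer with a ridge-regression readout). The data generator `make_data`, the neuron grid `(10, 25)`, and the test-set size are illustrative assumptions, not the project's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, d=5):
    # Hypothetical synthetic stand-in for one client's data.
    X = rng.normal(size=(n, d))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

def elm_fit(X, y, n_neurons, L2):
    # Basic ELM: random hidden layer + ridge-regression readout.
    W = rng.normal(size=(X.shape[1], n_neurons))
    b = rng.normal(size=n_neurons)
    H = np.tanh(X @ W + b)
    T = np.eye(2)[y]  # one-hot targets
    beta = np.linalg.solve(H.T @ H + L2 * np.eye(n_neurons), H.T @ T)
    return W, b, beta

def elm_accuracy(model, X, y):
    W, b, beta = model
    pred = np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
    return float(np.mean(pred == y))

X_test, y_test = make_data(500)

accuracy = {}
for l in (10, 20, 30, 45, 70, 100):            # training set sizes
    for neurons in (10, 25):                   # hidden layer sizes
        for L2 in (1e-2, 1e-3, 1e-4):          # ridge penalties
            runs = [
                elm_accuracy(elm_fit(*make_data(l), neurons, L2), X_test, y_test)
                for _ in range(15)             # average over 15 random ELMs
            ]
            accuracy[(l, neurons, L2)] = float(np.mean(runs))

# Best (neurons, L2) combination for each training-set size l:
best = {l: max((k for k in accuracy if k[0] == l), key=accuracy.get)
        for l in (10, 20, 30, 45, 70, 100)}
```

Averaging over 15 random initializations matters because an ELM's hidden layer is random, so a single run's accuracy is noisy.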
Find the best ELM parameters for EACH number of training points (10, 20, 30, 40, …). Then use these parameters to plot performance vs. amount of training data.
The goal of federated learning is to build a better model by using more data (from other organizations). If the model does not improve with more data, there is no point in building a federated learning system.
Here we will test how model performance improves with more data. The idea is to start from a small number of training data points, then add more training data and check how it affects performance: take 50 data samples and add 50 more samples at a time. We test two scenarios, one plot each:

- randomly distributed data
- non-randomly (geographically) distributed data
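The two scenarios above can be sketched as follows, again with synthetic data and a plain numpy ELM; treating column 0 of `X` as a "geographic" coordinate is an assumption for illustration, as are the fixed hyperparameters `n_neurons=30, L2=1e-3`:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data; column 0 plays the role of a geographic coordinate.
n_total, d = 1000, 5
X = rng.normal(size=(n_total, d))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
geo = X[:, 0]

def elm_accuracy(X_tr, y_tr, X_te, y_te, n_neurons=30, L2=1e-3):
    # One random ELM: tanh hidden layer + ridge-regression readout.
    W = rng.normal(size=(X_tr.shape[1], n_neurons))
    b = rng.normal(size=n_neurons)
    H = np.tanh(X_tr @ W + b)
    beta = np.linalg.solve(H.T @ H + L2 * np.eye(n_neurons),
                           H.T @ np.eye(2)[y_tr])
    pred = np.argmax(np.tanh(X_te @ W + b) @ beta, axis=1)
    return float(np.mean(pred == y_te))

X_te, y_te = X[800:], y[800:]      # held-out test set
pool = np.arange(800)              # indices available for training

curves = {}
for scenario in ("random", "geographic"):
    if scenario == "random":
        order = rng.permutation(pool)
    else:
        order = pool[np.argsort(geo[:800])]  # one region arrives first
    points = []
    for n in range(50, 801, 50):   # start at 50 samples, add 50 each step
        points.append((n, elm_accuracy(X[order[:n]], y[order[:n]], X_te, y_te)))
    curves[scenario] = points
```

The resulting `curves` dict holds (training size, accuracy) pairs for each scenario, ready to plot; in the geographic scenario the early training sets cover only one "region", which is the effect we want the plots to expose.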
Steps: