mccorby / PhotoLabellerServer

Federated Learning: a Parameter Server that aggregates the model updates sent by the clients participating in a Federated Learning setup. See also the companion Android application at https://github.com/mccorby/PhotoLabeller
MIT License
cifar-10 deeplearning4j dl4j federated-learning kotlin machine-learning

Federated Learning. The server side bit

This project works in conjunction with the Photo Labeller Android app

The server is in charge of the following tasks:

Installation instructions

Use IntelliJ to build the project

The server expects a local.properties file that provides its configuration

Example:

model_dir = some_directory_in_your_local_machine
# In milliseconds. 24 hours
time_window = 86400000
min_updates = 1000
layer_index = 3

This file must be located in the server module at the same level as the build.gradle file
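As an illustration of how the four settings in the example file could be read, here is a minimal sketch using java.util.Properties. The ServerConfig class and the parseServerConfig function are hypothetical names introduced for this example; the project's own loading code may look different.

```kotlin
import java.io.StringReader
import java.util.Properties

// Hypothetical holder for the four settings shown in the example file.
data class ServerConfig(
    val modelDir: String,
    val timeWindowMillis: Long,
    val minUpdates: Int,
    val layerIndex: Int
)

// Parses the text of a properties file and fails fast when a key is missing.
fun parseServerConfig(text: String): ServerConfig {
    val props = Properties().apply { load(StringReader(text)) }
    fun required(key: String): String =
        props.getProperty(key)?.trim() ?: error("Missing property: $key")
    return ServerConfig(
        modelDir = required("model_dir"),
        timeWindowMillis = required("time_window").toLong(),
        minUpdates = required("min_updates").toInt(),
        layerIndex = required("layer_index").toInt()
    )
}
```

Passing the contents of local.properties (for instance via File("local.properties").readText()) returns a typed configuration object, and a missing key is reported by name instead of surfacing later as a null value.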

Training the initial model and testing it

To run the initial model training, use the Main.kt file

The training process expects three arguments:

You can also run a prediction to test your model. The expected arguments are:

Running the Federated Parameter Server

The server is in charge of starting the training rounds and of keeping the updates sent by the clients until the shared model has been updated

A training round starts when the server is initialised. In a real-life system, the training round would be triggered either automatically or manually

Execute the JobQueueServer main method and the server will be up and running

The REST API is quite simple:

To check that the server is up and running in your local installation, request this URL:

http://localhost:9997/service/federatedservice/available
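The same check can be scripted. The sketch below assumes the endpoint answers a plain GET with a 2xx status code when the server is up; the README does not describe the response, so treat that as an assumption. Both function names are hypothetical.

```kotlin
import java.io.IOException
import java.net.HttpURLConnection
import java.net.URI

// Builds the availability URL quoted above; host and port are parameters
// so that a non-local installation can be checked as well.
fun availabilityUrl(host: String = "localhost", port: Int = 9997): String =
    "http://$host:$port/service/federatedservice/available"

// Returns true when the endpoint answers with a 2xx status code (assumed
// success criterion). Connection failures are reported as "not available".
fun isServerAvailable(url: String = availabilityUrl(), timeoutMillis: Int = 2000): Boolean =
    try {
        val connection = URI.create(url).toURL().openConnection() as HttpURLConnection
        connection.requestMethod = "GET"
        connection.connectTimeout = timeoutMillis
        connection.readTimeout = timeoutMillis
        try {
            connection.responseCode in 200..299
        } finally {
            connection.disconnect()
        }
    } catch (e: IOException) {
        false
    }
```

Calling isServerAvailable() with no arguments checks the local installation on port 9997.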

Flow of an open round

{
    "modelVersion": "round_20181007_080305",
    "startDate": 1538895785617,
    "endDate": 1538982185617,
    "minUpdates": 1000
}
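The round description above could be modelled as a small data class. TrainingRound is a hypothetical name; the field names simply mirror the JSON keys, and the timestamps are taken to be epoch milliseconds. In the example, endDate minus startDate is exactly 86400000 ms, which matches the 24-hour time_window of the sample configuration. That suggests, although the README does not state it, that the end date is derived from the start date plus the time window.

```kotlin
// Hypothetical model of the round description; field names mirror the JSON keys.
data class TrainingRound(
    val modelVersion: String,
    val startDate: Long, // assumed to be epoch milliseconds
    val endDate: Long,   // assumed to be epoch milliseconds
    val minUpdates: Int
) {
    // Length of the round window in milliseconds.
    val durationMillis: Long get() = endDate - startDate
}

// The values from the JSON example above.
val exampleRound = TrainingRound(
    modelVersion = "round_20181007_080305",
    startDate = 1538895785617,
    endDate = 1538982185617,
    minUpdates = 1000
)
```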

When a client sends an update to the model, the server stores it in the current round directory

The update is also cached in memory to help with post-processing once the round closes

A training round is closed either when the minimum number of client updates has been received or when the round times out
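The closing rule can be expressed as a single predicate. This is a sketch with hypothetical names; it assumes the timeout is reached when the current time passes the round's endDate, which the README does not spell out.

```kotlin
// A round is closed when enough updates have arrived or its time window has elapsed.
// All parameter names are illustrative; times are epoch milliseconds.
fun isRoundClosed(
    receivedUpdates: Int,
    minUpdates: Int,
    nowMillis: Long,
    endDateMillis: Long
): Boolean = receivedUpdates >= minUpdates || nowMillis >= endDateMillis
```

With the sample round above, 1000 received updates close the round immediately, while 10 updates leave it open until endDate is reached.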

Once the round is closed, the updates are processed by the UpdatesStrategy injected into the server. The current implementation of this strategy performs Federated Averaging
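To make the averaging step concrete, here is a minimal sketch that takes the element-wise mean of the parameter vectors sent by the clients. It works on plain FloatArray values rather than on the project's model classes, and it weights every client equally. Federated Averaging as usually described weights each client by its number of training samples, so this is a simplification, and the project's UpdatesStrategy implementation may differ. The function name is hypothetical.

```kotlin
// Element-wise, unweighted mean of the client parameter vectors.
fun federatedAverage(updates: List<FloatArray>): FloatArray {
    require(updates.isNotEmpty()) { "At least one update is needed" }
    val size = updates.first().size
    require(updates.all { it.size == size }) { "All updates must have the same length" }

    // Accumulate in double precision to limit rounding error over many clients.
    val sums = DoubleArray(size)
    for (update in updates) {
        for (i in 0 until size) sums[i] += update[i].toDouble()
    }
    return FloatArray(size) { i -> (sums[i] / updates.size).toFloat() }
}
```

For two clients sending (1, 2) and (3, 6), the averaged parameters are (2, 4).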

The model is then updated (in practice, only the last feature layer) and saved. From that moment on, it is considered the latest version of the shared model

All client updates are then deleted so that no trace of them remains on the server