DISCO is a code-free and installation-free browser platform that allows any non-technical user to collaboratively train machine learning models without sharing any private data.
Decentralized issues
Currently, decentralized training only works with exactly the number of peers specified in minNbOfParticipants.
No peer can leave, because a peer leaving mid-training makes the round drop below the minimum number of participants and fail (the round has to be aborted).
No more than minNbOfParticipants peers can join a round, because the server sends the peer list as soon as the threshold is reached: the (minNbOfParticipants + 1)th peer to join doesn't get the current round's peer list and is included in the next round's instead.
Essentially, once the peer list has been sent, joining and leaving are no longer possible.
Solution Implemented
During onRoundBeginCommunication, peers signal their interest in joining the current round by sending a PeerJoinsRound message to the server. The server keeps a list of peers wanting to join (but doesn't reply with a peer list as it currently does).
The peers start training locally without the round's peer list.
When peers are done training locally, they notify the server with a PeerIsReady message, i.e. they are ready to exchange weight updates. The server waits until all the peers that sent a PeerJoinsRound have sent their PeerIsReady, and then sends the round's peer list.
i. This leaves some time for peers to join the round. To prevent the round from being delayed indefinitely by new peers that keep joining and have to be waited for, the server can stop including peers in this round as soon as one peer is ready (and include them in the next round instead).
ii. Peers can leave, notifying the server before they do. As long as the peer list hasn't been sent, peers can join and leave without causing any problem.
Upon receiving the peer list, peers establish peer-to-peer connections and start exchanging weight updates.
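The steps above can be sketched as server-side bookkeeping for a single round. This is only an illustration under assumptions, not the code in this PR: the class name RoundCoordinator, its method names, and the sendPeerList callback are hypothetical, and peers are reduced to string ids.

```typescript
type PeerId = string;

// Hypothetical bookkeeping for one round; all names here are illustrative.
class RoundCoordinator {
  private readonly joined = new Set<PeerId>();
  private readonly ready = new Set<PeerId>();
  private readonly nextRound = new Set<PeerId>();
  private readonly minNbOfParticipants: number;
  private readonly sendPeerList: (peers: PeerId[]) => void;
  private peerListSent = false;

  constructor(minNbOfParticipants: number, sendPeerList: (peers: PeerId[]) => void) {
    this.minNbOfParticipants = minNbOfParticipants;
    this.sendPeerList = sendPeerList;
  }

  // PeerJoinsRound: remember the peer, but do not reply with a peer list yet.
  onPeerJoinsRound(peer: PeerId): void {
    // Once a peer is ready (or the list is out), newcomers go to the next round.
    if (this.peerListSent || this.ready.size > 0) {
      this.nextRound.add(peer);
      return;
    }
    this.joined.add(peer);
  }

  // PeerIsReady: the peer finished local training.
  onPeerIsReady(peer: PeerId): void {
    if (!this.joined.has(peer)) return;
    this.ready.add(peer);
    this.maybeSendPeerList();
  }

  // Leaving before the list is sent only shrinks the set of awaited peers.
  onPeerLeaves(peer: PeerId): void {
    this.joined.delete(peer);
    this.ready.delete(peer);
    this.nextRound.delete(peer);
    this.maybeSendPeerList();
  }

  private maybeSendPeerList(): void {
    if (this.peerListSent) return;
    const peers = Array.from(this.joined);
    const allReady = peers.every((p) => this.ready.has(p));
    if (allReady && peers.length >= this.minNbOfParticipants) {
      this.peerListSent = true;
      this.sendPeerList(peers);
    }
  }
}

const sentLists: PeerId[][] = [];
const coordinator = new RoundCoordinator(2, (peers) => { sentLists.push(peers); });
coordinator.onPeerJoinsRound("a");
coordinator.onPeerJoinsRound("b");
coordinator.onPeerJoinsRound("c");
coordinator.onPeerLeaves("c");     // harmless: the list has not been sent
coordinator.onPeerIsReady("a");
coordinator.onPeerJoinsRound("d"); // deferred: a peer is already ready
coordinator.onPeerIsReady("b");    // every joined peer is ready, list is sent
```

In this run the only list sent contains "a" and "b": "c" left before the list went out, and "d" joined after the first PeerIsReady, so it is kept for the next round.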
[x] Allow a peer to join a session that has already started.
[x] If the number of peers drops below the minNbOfParticipants threshold, the peers wait for more participants.
[x] A peer that leaves notifies the server, which in turn notifies the remaining peers to wait.
[x] More than minNbOfParticipants peers can participate in the same round (the PeerJoinsRound step leaves some time for participants to join, instead of starting the weight update exchange as soon as minNbOfParticipants peers have joined).
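The waiting behaviour in the checklist above could look roughly like the sketch below. It assumes a hypothetical ParticipantTracker on the server and a made-up WaitingNotice message with "wait" and "resume" kinds. None of these names come from the PR; they only illustrate the idea of telling the remaining peers to wait when the count drops below minNbOfParticipants.

```typescript
// Hypothetical notice sent to peers; not a message type from the PR.
type WaitingNotice =
  | { kind: "wait"; missing: number }
  | { kind: "resume" };

class ParticipantTracker {
  private readonly peers = new Set<string>();
  private readonly minNbOfParticipants: number;
  private readonly notify: (peer: string, notice: WaitingNotice) => void;

  constructor(
    minNbOfParticipants: number,
    notify: (peer: string, notice: WaitingNotice) => void,
  ) {
    this.minNbOfParticipants = minNbOfParticipants;
    this.notify = notify;
  }

  // A peer may join at any time, including after the session has started.
  join(peer: string): void {
    this.peers.add(peer);
    this.broadcastStatus();
  }

  // A leaving peer notifies the server, which updates everyone still connected.
  leave(peer: string): void {
    if (this.peers.delete(peer)) this.broadcastStatus();
  }

  // Tell every connected peer whether the threshold is currently met.
  private broadcastStatus(): void {
    const missing = this.minNbOfParticipants - this.peers.size;
    const notice: WaitingNotice =
      missing > 0 ? { kind: "wait", missing } : { kind: "resume" };
    this.peers.forEach((peer) => this.notify(peer, notice));
  }
}

const noticeLog: string[] = [];
const tracker = new ParticipantTracker(2, (peer, notice) => {
  noticeLog.push(`${peer}:${notice.kind}`);
});
tracker.join("a");  // below the threshold: "a" is told to wait
tracker.join("b");  // threshold met: both peers may proceed
tracker.leave("b"); // below the threshold again: "a" waits
```

Broadcasting the status on every change keeps the sketch short. A real server would more likely notify peers only when the threshold is crossed.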
Refactoring
aggregator.add now returns a promise for the aggregated weights, so the server controller no longer needs its perpetual promise loop. New aggregator tests are included.
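To illustrate what a promise-returning add makes possible, here is a minimal sketch. It is not DISCO's aggregator: MeanAggregatorSketch, its constructor argument, and the use of plain number arrays for weights are all assumptions made for the example.

```typescript
// Hypothetical aggregator: averages one contribution per expected peer.
class MeanAggregatorSketch {
  private readonly contributions = new Map<string, number[]>();
  private readonly expected: number;
  private readonly result: Promise<number[]>;
  private resolveResult: (weights: number[]) => void = () => {};

  constructor(expected: number) {
    this.expected = expected;
    // The executor runs synchronously, so resolveResult is set right away.
    this.result = new Promise<number[]>((resolve) => {
      this.resolveResult = resolve;
    });
  }

  // Every caller gets the same promise, resolved once all contributions are in.
  add(peer: string, weights: number[]): Promise<number[]> {
    this.contributions.set(peer, weights);
    if (this.contributions.size === this.expected) {
      this.resolveResult(this.mean());
    }
    return this.result;
  }

  // Element-wise mean over all received weight vectors.
  private mean(): number[] {
    const all = Array.from(this.contributions.values());
    return all[0].map(
      (_, i) => all.reduce((sum, w) => sum + w[i], 0) / all.length,
    );
  }
}

const meanAggregator = new MeanAggregatorSketch(2);
const firstResult = meanAggregator.add("a", [1, 2]);
const secondResult = meanAggregator.add("b", [3, 4]);
// Both promises resolve to the element-wise mean [2, 3].
```

With this shape, a caller can simply await the value returned by add instead of polling or looping on a separate promise.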
EventEmitter
trainingInformation.decentralizedSecure and add aggregationStrategy ('mean' or 'secure')
Closes #718