HewlettPackard / swarm-learning

A simplified library for decentralized, privacy preserving machine learning
Apache License 2.0

SWARM LEARNING

Product version: 2.2.0

Swarm Learning is a decentralized, privacy-preserving Machine Learning framework. It uses the computing power at, or near, the distributed data sources to run the Machine Learning algorithms that train the models, and it relies on the security of a blockchain platform to share learnings with peers safely. In Swarm Learning, the model is trained at the edge, where the data is most recent and where prompt, data-driven decisions are most needed. In this completely decentralized architecture, only the insights learned are shared with the collaborating ML peers, not the raw data. Because the raw data is not shared, data security and privacy are strengthened.

Each Swarm Learning node works in collaboration with the other Swarm Learning nodes in the network. It regularly shares its learnings with the other nodes and incorporates their insights. This process continues until the Swarm Learning nodes have trained the model to the desired state. Users can monitor the progress of the current training, as shown in the accompanying image. The view shows all running Swarm nodes, the loss, the model metric (for example, accuracy), and the overall training progress for each user ML node. Hovering over the progress bar shows the number of completed epochs and the total number of epochs.
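The share-and-incorporate cycle described above can be illustrated with a minimal sketch. It assumes that "incorporating insights" means taking a weighted average of model parameters, which is a common choice in collaborative training but is not stated in this README. The function `merge_parameters` and the parameter names are hypothetical and are not part of the Swarm Learning package.

```python
from typing import Dict, List

# A parameter set maps a layer name to a flat list of values (assumed layout).
Params = Dict[str, List[float]]


def merge_parameters(peer_params: List[Params], weights: List[float]) -> Params:
    """Return the weighted average of same-shaped parameter sets from peers."""
    if not peer_params or len(peer_params) != len(weights):
        raise ValueError("need at least one peer and exactly one weight per peer")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    merged: Params = {}
    for name in peer_params[0]:
        length = len(peer_params[0][name])
        merged[name] = [
            sum(w * p[name][i] for p, w in zip(peer_params, weights)) / total
            for i in range(length)
        ]
    return merged


# Two peers exchange only their learned parameters, never their training data.
node_a = {"dense.weight": [0.0, 2.0], "dense.bias": [1.0]}
node_b = {"dense.weight": [4.0, 2.0], "dense.bias": [3.0]}
merged = merge_parameters([node_a, node_b], weights=[1.0, 1.0])
```

In this sketch every node would continue local training from `merged` and repeat the cycle until the model reaches the desired state.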

Architecture

The Swarm Learning framework is made up of several components known as nodes: Swarm Learning (SL) nodes, Swarm Network (SN) nodes, Swarm Learning Command Interface (SWCI) nodes, and Swarm Operator (SWOP) nodes. Each node is modularized and runs in a separate container. The nodes represent distinct Swarm Learning functions, not physical servers.

User ML component

Users can transform any Keras- or PyTorch-based ML program written in Python 3 into a Swarm Learning ML program by making a few simple changes to the model training code, namely including the SwarmCallback API. For more information, see any of the examples included with the Swarm Learning package.
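To show why a callback keeps the required code changes small, here is a minimal sketch of the general callback pattern. It does not use or reproduce the SwarmCallback API: the class `PeriodicSyncCallback`, its `sync_frequency` parameter, and the toy `train` loop are hypothetical stand-ins. For the real class and its arguments, refer to the examples shipped with the Swarm Learning package.

```python
from typing import Callable, List


class PeriodicSyncCallback:
    """Hypothetical stand-in for a training callback; NOT the SwarmCallback API."""

    def __init__(self, sync_frequency: int, on_sync: Callable[[int], None]) -> None:
        if sync_frequency < 1:
            raise ValueError("sync_frequency must be at least 1")
        self.sync_frequency = sync_frequency
        self.on_sync = on_sync
        self.batches_seen = 0

    def on_batch_end(self) -> None:
        # Count local batches and trigger a sync at every sync_frequency-th batch.
        self.batches_seen += 1
        if self.batches_seen % self.sync_frequency == 0:
            self.on_sync(self.batches_seen)


def train(num_batches: int, callbacks: List[PeriodicSyncCallback]) -> None:
    """Toy training loop; the only addition to it is invoking the callbacks."""
    for _ in range(num_batches):
        # ... the local training step on private data would run here ...
        for callback in callbacks:
            callback.on_batch_end()


# Record the batch counts at which a sync with peers would be triggered.
sync_points: List[int] = []
train(num_batches=10, callbacks=[PeriodicSyncCallback(4, sync_points.append)])
```

The existing training loop stays as it is. The synchronization logic lives entirely in the callback object that is passed to the loop.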

The transformed user Machine Learning (user ML node) program can be built as a Docker container or can be run on the host.

NOTE: HPE recommends building an ML Docker container for easier, automated deployment.

The ML node is responsible for training and iteratively updating the model. For each ML node, there is a corresponding SL node in the Swarm Learning framework, which performs the Swarm training. Each pair of ML and SL nodes must run on the same host. This process continues until the SL nodes train the model to the desired state.

NOTE: All the ML nodes must use the same ML platform: either Keras (based on the TensorFlow 2 backend) or PyTorch. Using Keras for some nodes and PyTorch for others is not supported.
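The two constraints above (each ML/SL pair shares a host, and all ML nodes use one platform) can be checked before deployment. The following is a minimal sketch under stated assumptions: the `NodePair` record, its field names, and `validate_pairs` are hypothetical and are not part of the Swarm Learning tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodePair:
    """Hypothetical description of one ML node and its corresponding SL node."""
    ml_host: str
    sl_host: str
    ml_platform: str  # for example "keras" or "pytorch"


def validate_pairs(pairs: List[NodePair]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues."""
    problems: List[str] = []
    for index, pair in enumerate(pairs):
        # Each ML node must run on the same host as its SL node.
        if pair.ml_host != pair.sl_host:
            problems.append(
                f"pair {index}: ML node on {pair.ml_host} but SL node on {pair.sl_host}"
            )
    # All ML nodes must use the same ML platform.
    platforms = {pair.ml_platform.lower() for pair in pairs}
    if len(platforms) > 1:
        problems.append("mixed ML platforms: " + ", ".join(sorted(platforms)))
    return problems


good = [NodePair("host-1", "host-1", "keras"), NodePair("host-2", "host-2", "keras")]
bad = [NodePair("host-1", "host-1", "keras"), NodePair("host-2", "host-3", "pytorch")]
```

Calling `validate_pairs(good)` returns an empty list, while `validate_pairs(bad)` reports both the split pair and the mixed platforms.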

Quick Start

  1. Prerequisites for Swarm Learning
  2. Upgrading from earlier versions
  3. Download and setup Swarm Learning using the SLM-UI installer
  4. Execute a simple predefined example - MNIST example
  5. Running MNIST example using SLM-UI
  6. Monitoring & Tracking Swarm Learning training using SLM-UI
  7. Frequently Asked Questions
  8. Troubleshooting
  9. Release Notes
NOTE: The **Accessing Hewlett Packard Enterprise Support** clause and the **Concurrent swarm training** feature mentioned in the documentation apply to enterprise customers ONLY.
NOTE: The examples and scripts bundled with the Swarm UI installer **may not be the latest**. If you have any issues running them, use the copies directly from GitHub.

Detailed Documentation

Related Publications, Talks and External References

Acronyms and Abbreviations

Refer to Acronyms and Abbreviations for more information.

Getting in touch

Feedback and questions are appreciated. You can use the issue tracker on GitHub to report bugs. Alternatively, join the HPE Developer Slack Workspace and start a discussion in our #hpe-swarm-learning channel.

Contributing

Refer to Contributing for more information.

License

The distribution of Swarm Learning in this repository is for non-commercial and experimental use under this license.

See ATTRIBUTIONS and DATA LICENSE for terms and conditions for using the datasets included in this repository.