FedML-AI / FedML

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
Apache License 2.0

How to customize the federated learning algorithm? #482

Open Alan-JW opened 2 years ago

Alan-JW commented 2 years ago

Hi @chaoyanghe, if I want to customize the federated learning algorithm in the simulation version instead of using the existing algorithms like fedavg, fednova, etc., what should I do? Thanks for your answer!

chaoyanghe commented 2 years ago

@Alan-JW Here is a blog post introducing our customization APIs: https://medium.com/@FedML/fedml-releases-simple-and-flexible-apis-boosting-innovation-in-algorithm-and-system-optimization-b21c2f4b88c8

You can also find an example at: https://github.com/FedML-AI/FedML/tree/master/python/examples/cross_silo/mpi_customized_fedavg_mnist_lr_example

There are also many examples that use the customization APIs at: https://github.com/FedML-AI/FedML/tree/master/python/app

If a customized trainer and aggregator cannot meet your requirements, you can use the FedML Flow API. Here is an example: https://github.com/FedML-AI/FedML/blob/master/python/fedml/core/distributed/flow/test_fedml_flow.py
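To make the "customized aggregator" idea concrete, here is a minimal, framework-agnostic sketch of the kind of logic you would put into one. It is not FedML's API: the function names and the `Weights` type are assumptions made for illustration, and the linked examples show the actual classes to subclass. The first function is the usual FedAvg-style sample-weighted average; the second is a custom rule (per-coordinate median) that you might swap in.

```python
import statistics
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]
# Each client update is (number of local samples, client weights).
Update = Tuple[int, Weights]


def weighted_average(updates: List[Update]) -> Weights:
    """FedAvg-style rule: average client weights, weighted by sample count."""
    total = sum(n for n, _ in updates)
    if total <= 0:
        raise ValueError("need at least one client with samples")
    first = updates[0][1]
    return {
        name: [
            sum(n * w[name][i] for n, w in updates) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }


def coordinate_median(updates: List[Update]) -> Weights:
    """Custom rule: per-coordinate median, which ignores sample counts
    and is less sensitive to a single outlier client."""
    first = updates[0][1]
    return {
        name: [
            statistics.median(w[name][i] for _, w in updates)
            for i in range(len(first[name]))
        ]
        for name in first
    }
```

Both functions take the same input and return the same structure, so the server-side aggregation step can switch between them without touching the client-side training code.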

Alan-JW commented 2 years ago

Thanks, I got it!

jlewi commented 2 years ago

@chaoyanghe How do you author new algorithms to be cross-compiled to different platforms, e.g. Android and iOS?

The blog post seems to indicate you can subclass FedMLExecutor and write arbitrary Python code to be executed on the nodes. Presumably this works in situations where each node is capable of running arbitrary Python code.

I assume this is not the case on mobile. The diagram in the blog post indicates that the on-device training engine is PyTorch or TensorFlow. So is the idea to use Python code to construct a computation graph using either PyTorch or TensorFlow and then export it to a PyTorch/TensorFlow graph that can be executed using the corresponding engine?

How does one author FedML programs that can be compiled into PyTorch/TensorFlow graphs?

How does MobileNN fit in here? Is MobileNN an alternative to using PyTorch/TensorFlow as an engine? Or is MobileNN the actual engine?

chaoyanghe commented 2 years ago

@jlewi This is a great question. You can find our engine architecture here: https://github.com/FedML-AI/FedML/tree/master/android.

For Android, we've built our engine on MNN and PyTorch Mobile. MobileNN is an adapter that maps our APIs onto the different mobile engines. The Java SDK is built on top of MobileNN and handles FL local training, the related communication protocol, etc.
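As a rough illustration of the adapter idea described here (one API in front of several engines), the sketch below uses the classic adapter pattern. Every class and method name is hypothetical, and the backends are stubs that only return a file name; the real MobileNN layer is native code and its interface is not shown in this thread.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class TrainingBackend(ABC):
    """Hypothetical interface that each engine-specific backend implements."""

    name: str

    @abstractmethod
    def train(self, model_path: str, data_path: str, epochs: int) -> str:
        """Train locally and return the path of the updated model file."""


class StubMNNBackend(TrainingBackend):
    name = "mnn"

    def train(self, model_path: str, data_path: str, epochs: int) -> str:
        # A real backend would call into the MNN runtime here.
        return model_path + ".mnn.updated"


class StubTorchMobileBackend(TrainingBackend):
    name = "pytorch-mobile"

    def train(self, model_path: str, data_path: str, epochs: int) -> str:
        # A real backend would call into PyTorch Mobile here.
        return model_path + ".pt.updated"


class MobileAdapter:
    """Single entry point: callers pick an engine by name and never
    talk to an engine-specific class directly."""

    def __init__(self, backends: Iterable[TrainingBackend]) -> None:
        self._backends = {backend.name: backend for backend in backends}

    def train(self, engine: str, model_path: str, data_path: str,
              epochs: int = 1) -> str:
        if engine not in self._backends:
            raise ValueError(f"unsupported engine: {engine}")
        return self._backends[engine].train(model_path, data_path, epochs)
```

With this shape, supporting another engine means adding one backend class; the code that drives local training and communication stays unchanged.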

Previously, we only provided the Android SDK. Given that many people have asked for details on this part, we plan to release all of the source code soon.

chaoyanghe commented 2 years ago

> So is the idea to use Python code to construct a computation graph using either PyTorch or TensorFlow and then export it to a PyTorch/TensorFlow graph that can be executed using the corresponding engine?

Yes. We can allow users to define the model with Python code and distribute the computational graph to mobile devices. This way, people don't need to handle mobile programming (Java, NDK, C++, etc.).
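The general pattern of "define in Python, ship the graph, run it without Python" can be shown with a toy example. This is a stdlib-only sketch under stated assumptions, not FedML's export pipeline: the JSON graph format, the op set, and all function names are made up for illustration. With PyTorch, the export role is typically played by TorchScript; the thread does not say exactly which export path FedML uses.

```python
import json
from typing import Dict, List


def build_graph() -> List[Dict]:
    """Describe y = relu(w * x + b) as data rather than as Python code."""
    return [
        {"op": "mul", "args": ["x", "w"], "out": "t0"},
        {"op": "add", "args": ["t0", "b"], "out": "t1"},
        {"op": "relu", "args": ["t1"], "out": "y"},
    ]


def export_graph(graph: List[Dict], params: Dict[str, float]) -> str:
    """Serialize the graph and its parameters into one shippable artifact."""
    return json.dumps({"graph": graph, "params": params})


# Stand-in for the on-device engine: it knows a fixed set of ops and
# needs none of the Python code that built the graph.
OPS = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
    "relu": lambda a: max(0.0, a),
}


def run_on_device(artifact: str, inputs: Dict[str, float]) -> float:
    """Interpret the artifact node by node; on a phone this loop would
    live in the native engine, not in Python."""
    bundle = json.loads(artifact)
    values = {**bundle["params"], **inputs}
    for node in bundle["graph"]:
        args = [values[name] for name in node["args"]]
        values[node["out"]] = OPS[node["op"]](*args)
    return values["y"]


artifact = export_graph(build_graph(), {"w": 2.0, "b": -1.0})
result = run_on_device(artifact, {"x": 3.0})  # relu(2.0 * 3.0 - 1.0)
```

The point of the split is that only `artifact` crosses the network: the device side needs an interpreter for the op set, which is what an engine such as MNN or PyTorch Mobile provides for real models.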

SichangHe commented 1 year ago

> So is the idea to use Python code to construct a computation graph using either PyTorch or TensorFlow and then export it to a PyTorch/TensorFlow graph that can be executed using the corresponding engine?
>
> Yes. We can allow users to define the model with Python code and distribute the computational graph to mobile devices. This way, people don't need to handle mobile programming (Java, NDK, C++, etc.).

How, exactly? What does the process of adapting a Python model to Android/iOS look like? Could you provide more details?

Thanks.