FedML-AI / FedML

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
Apache License 2.0

[FedML Beehive] No examples/edge SDK for embedded Linux #609

Open: AIWintermuteAI opened 1 year ago

AIWintermuteAI commented 1 year ago

The README mentions an edge SDK for embedded Linux for cross-device training (FedML Beehive): https://github.com/FedML-AI/FedML/blob/956639046be40ba45e6a117273a3e51f117961ce/README.md?plain=1#L59

However, as far as I can see, it is nowhere to be found. The cross-device code contains only the server part, in Python (https://github.com/FedML-AI/FedML/tree/master/python/fedml/cross_device), and the client part is implemented only as an Android app.

I'm raising this because I'm working on an FL use case that involves Raspberry Pi edge nodes, and I'd like to use the Beehive API for it. I am currently implementing the client side in Python myself and will contribute it back later. If there is something available that I can build on, please let me know.

chaoyanghe commented 1 year ago

@AIWintermuteAI We treat embedded Linux devices as IoT devices. Please check this example:

https://github.com/FedML-AI/FedML/tree/master/iot

I think the doc is somewhat misleading. I will update it accordingly. Thanks for your feedback.

AIWintermuteAI commented 1 year ago

Yes, I saw these examples. However, they use FedML Octopus, i.e. the cross-silo training mode, as shown here: https://github.com/FedML-AI/FedML/blob/48bf47949f88247f848a7c1c3d122d3a70781a27/iot/anomaly_detection_for_cybersecurity/config/fedml_config.yaml#L2

chaoyanghe commented 1 year ago

@AIWintermuteAI Yes, this works for IoT devices such as a Raspberry Pi. We still treat them as cross-silo for two reasons: 1) an IoT device is always online, unlike smartphones, which are only intermittently connected to the cloud; 2) there is no need to run client sampling to select the devices for each round from a large pool of devices.

Let me know more details about your scenario, and I will think about whether we need to support more advanced features.
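
To make the distinction in the reply above concrete, here is a minimal sketch of per-round client selection. It is not FedML code: `select_clients`, its parameters, and the device names are hypothetical and exist only for illustration. It assumes that always-online clients all take part in every round (the cross-silo view), while intermittently connected clients are sampled per round (the cross-device view).

```python
import random


def select_clients(round_idx, all_client_ids, clients_per_round, always_online=True):
    """Return the client ids that take part in one training round.

    Hypothetical helper for illustration only; not part of the FedML API.
    """
    if always_online or clients_per_round >= len(all_client_ids):
        # Cross-silo view: every client is reachable, so all of them participate.
        return list(all_client_ids)
    # Cross-device view: draw a subset, seeded by the round index so that
    # the same round always yields the same selection.
    rng = random.Random(round_idx)
    return sorted(rng.sample(list(all_client_ids), clients_per_round))


if __name__ == "__main__":
    pis = [f"rpi-{i}" for i in range(4)]
    phones = [f"phone-{i}" for i in range(1000)]
    # A handful of always-online boards: no sampling, all four are used.
    print(select_clients(0, pis, clients_per_round=4))
    # A large pool of phones: only 10 are picked for this round.
    print(len(select_clients(0, phones, clients_per_round=10, always_online=False)))
```

With only a few always-online Raspberry Pi nodes, the sampling branch is never reached, which matches the reasoning given for treating such devices as cross-silo.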