Open skrlin opened 3 years ago

I wonder whether Sedna supports traditional distributed training methods, such as model parallelism. For example, I could divide a model into different layers and distribute the training tasks of those layers across different edge nodes or central cloud nodes. Does Sedna support this?
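For context, this is roughly what layer-wise model parallelism looks like as a minimal single-process sketch in plain PyTorch. The two-stage split and the device names are illustrative assumptions (two GPUs on one machine), not a Sedna API; across real edge/cloud nodes the activation copy at the stage boundary would become a network transfer.

```python
# Minimal sketch of layer-wise model parallelism in plain PyTorch.
# Assumes two CUDA devices; this is NOT a Sedna API.
import torch
import torch.nn as nn

class TwoStageNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Stage 1 lives on one device (think: an edge node's GPU) ...
        self.stage1 = nn.Sequential(nn.Linear(784, 256), nn.ReLU()).to("cuda:0")
        # ... stage 2 on another (think: a cloud node's GPU).
        self.stage2 = nn.Linear(256, 10).to("cuda:1")

    def forward(self, x):
        # Activations cross the stage boundary here; across machines this
        # copy would be a network transfer (e.g., RPC) instead of .to().
        h = self.stage1(x.to("cuda:0"))
        return self.stage2(h.to("cuda:1"))

model = TwoStageNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = nn.CrossEntropyLoss()(model(x), y.to("cuda:1"))
loss.backward()  # autograd routes gradients back across the device boundary
opt.step()
```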
Distributed training falls under the category of machine learning frameworks, so it may depend on the framework you use.
Our team had quite a few discussions on this issue. We understand that traditional distributed learning can help reduce training time. Sedna would like to support both edge-cloud collaborative training and inference, but it focuses more on stand-alone model execution at present. Stand-alone models help ensure a robust runtime service on each node, especially when some nodes are offline, which makes them a better choice than model splitting. We can discuss the pros and cons of stand-alone models and split models at the weekly meeting if you are interested. Besides, new proposals are very welcome in KubeEdge SIG AI.
1. If one is looking for tools: for distributed learning, Sedna supports both edge-cloud collaborative training and inference. For example, you can do it via federated learning, which is intrinsically distributed multi-task learning when privacy is not a major concern (see the sketch after this list).
One can also provide real-world requirements/projects to help community members identify further technical issues to tackle. Real-world applications with Sedna are very welcome.
2. If one is planning a related proposal: we also welcome more proposals on distributed learning in Sedna. One can introduce related techniques at the routine meetings.
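As a concrete illustration of the federated option in item 1, here is a minimal sketch of federated averaging (FedAvg) on a toy linear-regression task. The helper names and the NumPy setup are hypothetical for illustration only, not Sedna's actual API.

```python
# Minimal sketch of federated averaging (FedAvg); names are illustrative,
# NOT Sedna's API. Each round: clients train locally on private data,
# then the server averages the resulting weights.
import numpy as np

def local_update(weights, data, lr=0.1):
    """One local gradient step on a client's private data (toy linear model)."""
    x, y = data
    grad = 2 * x.T @ (x @ weights - y) / len(y)  # least-squares gradient
    return weights - lr * grad

def fed_avg(global_weights, client_datasets, rounds=50):
    w = global_weights
    for _ in range(rounds):
        client_weights = [local_update(w.copy(), d) for d in client_datasets]
        # Plain mean; with unequal client dataset sizes FedAvg would
        # weight each client by its number of samples.
        w = np.mean(client_weights, axis=0)
    return w

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):  # three clients, each with its own local data
    x = rng.normal(size=(50, 2))
    clients.append((x, x @ true_w + 0.01 * rng.normal(size=50)))
print(fed_avg(np.zeros(2), clients))  # converges toward [1.0, -2.0]
```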
Would you mind telling community members more about why Sedna needs this, e.g., applications, requirements, or techniques? Sedna routine meetings are held weekly (every Thursday) at https://zoom.us/my/kubeedge.
Besides, a similar issue was posted recently as #276.