Questions about abstracting client communication from simulation

ewenw commented 2 years ago

Hi, we are interested in building a backend aggregator/coordinator that can eventually support production traffic, using FedScale to simulate device training and traffic. This means we wouldn't be implementing an aggregator.py and running it inside the framework. Instead, we would like to build our own gRPC service (or perhaps REST), in which the messages might differ from FedScale's proto schema. How difficult would it be for us to leverage FedScale for client simulation while implementing the communication layer ourselves so that it works with our service?

fanlai0990 commented 2 years ago

Hello. We are happy to see that we are on the same page. :) We are indeed planning both of these! Thanks for raising this here.

(1) Decentralized aggregators: we plan to distribute the aggregator across multiple machines soon for better scalability;

(2) Communication layer: we are open to updating the primitive protocols too. It would help a lot if we could learn more about your needs (e.g., protocol schema), so that we can define more consistent protocols.

To recap, we are excited to help and would like to work out a plan together once we know more about your needs. Since more advanced FedScale support will be released in H2, we think now is the right time to do this, so that it stays compatible with our future releases.

ewenw commented 2 years ago

Hi @fanlai0990, glad to hear that! For the communication protocol, we can't provide a schema, since it will likely keep evolving during development and production and will likely differ from FedScale's in schema and possibly in protocol (HTTP/REST). However, I can see two ways to make the communication between the FedScale client simulation and our service compatible:

1. We add an adapter layer on our service side. This layer essentially forwards FedScale's messages to our service by converting them into a compatible format.
2. FedScale adds an abstract class for us to implement on the client side, providing handlers for the different types of messages. We then implement those handlers to communicate with our service using the proper schema. FedScale can provide an example implementation of this abstract class that works with its own gRPC aggregator.
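
To make the first option concrete, here is a minimal sketch of what a service-side adapter could look like. Everything in it is an assumption for illustration: `SimulatorMessage`, its fields, the event names, and the REST paths are hypothetical and are not taken from FedScale's proto schema or from any real service.

```python
import json
from dataclasses import dataclass


@dataclass
class SimulatorMessage:
    """Hypothetical stand-in for a message emitted by the client simulation."""
    client_id: str
    event: str       # hypothetical event name, e.g. "upload_model"
    payload: bytes   # opaque serialized data


# Hypothetical mapping from simulator events to our service's REST routes.
ROUTES = {"register": "/v1/clients", "upload_model": "/v1/updates"}


def to_service_request(msg: SimulatorMessage) -> dict:
    """Convert a simulator message into a description of a REST request."""
    if msg.event not in ROUTES:
        raise ValueError(f"unsupported event: {msg.event}")
    body = {"clientId": msg.client_id, "data": msg.payload.hex()}
    return {"method": "POST", "path": ROUTES[msg.event], "body": json.dumps(body)}
```

The adapter keeps the simulator unchanged, but every schema change on the service side has to be mirrored in this translation layer.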

I think option 2) might be more attractive since other users in the future can implement their own client communication methods to make it work with a variety of services. Please let me know what you think. Thanks!
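
For the abstract-class option, a rough sketch of the shape I have in mind is below. The class name `ClientCommunicator`, its three handlers, and `run_simulated_round` are all hypothetical; none of them exist in FedScale today, and the real set of message types would need to come from the FedScale side. The in-memory implementation only stands in for a real gRPC or REST client.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ClientCommunicator(ABC):
    """Hypothetical interface the simulation would call instead of raw gRPC."""

    @abstractmethod
    def register(self, client_id: str) -> None:
        """Announce a simulated client to the backend."""

    @abstractmethod
    def fetch_task(self, client_id: str) -> Dict[str, Any]:
        """Ask the backend what this client should do next."""

    @abstractmethod
    def report_result(self, client_id: str, result: Dict[str, Any]) -> None:
        """Send the outcome of a finished task to the backend."""


class InMemoryCommunicator(ClientCommunicator):
    """Toy implementation; a real one would talk to our service."""

    def __init__(self) -> None:
        self.registered: List[str] = []
        self.results: Dict[str, Dict[str, Any]] = {}

    def register(self, client_id: str) -> None:
        self.registered.append(client_id)

    def fetch_task(self, client_id: str) -> Dict[str, Any]:
        return {"type": "train", "round": 1}

    def report_result(self, client_id: str, result: Dict[str, Any]) -> None:
        self.results[client_id] = result


def run_simulated_round(comm: ClientCommunicator, client_ids: List[str]) -> None:
    """Drive one round using only the abstract interface."""
    for cid in client_ids:
        comm.register(cid)
        task = comm.fetch_task(cid)
        # Placeholder result; real training would happen here.
        comm.report_result(cid, {"round": task["round"], "loss": 0.0})
```

Because the simulation loop depends only on the interface, swapping `InMemoryCommunicator` for a gRPC-backed or REST-backed implementation would not require touching the simulation code.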

mosharaf commented 2 years ago

I agree that @ewenw's option 2 is more attractive. @fanlai0990 all these are also relevant to the client library architecture we were discussing recently.

SymbioticLab / FedScale

Questions about abstracting client communication from simulation #134