Open saurav935 opened 1 year ago
Thanks for reaching out @saurav935. Before discussing the solution, could you elaborate more on the requirements?
Is the intent to connect to two FL servers simultaneously? In that case, one option is to call start_client twice from two different threads (since start_client is blocking).
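A minimal sketch of the two-thread idea, assuming a blocking client call. The start_client below is a stand-in defined only for illustration (it is not the Flower function, and its signature is hypothetical), so the snippet runs without Flower installed:

```python
import threading
import time


def start_client(server_address, log):
    """Stand-in for a blocking client call (hypothetical, not the Flower API)."""
    time.sleep(0.05)  # pretend to stay connected for a while
    log.append(server_address)


def run_clients(addresses):
    """Run one blocking client per server address, each in its own thread."""
    log = []
    threads = [
        threading.Thread(target=start_client, args=(address, log))
        for address in addresses
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every client call has returned
    return sorted(log)


if __name__ == "__main__":
    print(run_clients(["localhost:8080", "localhost:5040"]))
```

Each thread blocks independently, so neither connection has to wait for the other to finish.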
Thanks for your response @danieljanes. I know that the multithreading approach would work, but I am also exploring an approach that avoids multithreading. Basically, I want to do federated learning with 3 servers instead of 1. Calling the start_client function twice would also run the training twice, which I don't want. Instead, I want the 3 servers to send their information, perform an operation on the data received from them, do the training once (the handle function), and send the trained result to all 3 servers. So: receive from 3 servers, perform an operation, train once, and send to all 3 servers.
For example:
receive_from_all_3_servers()
perform_operation()
train_once() # handle function
send_to_all_3_servers()
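One way to read the flow above as plain single-threaded code is sketched here. All names (run_round, receivers, senders, combine, train) are hypothetical placeholders, not Flower functions, and the sketch assumes each receive call simply blocks until its server has sent something:

```python
def run_round(receivers, senders, combine, train):
    """Receive from every server, combine, train once, send to every server."""
    received = [receive() for receive in receivers]  # one message per server
    combined = combine(received)                     # the "perform an operation" step
    result = train(combined)                         # training happens exactly once
    for send in senders:
        send(result)                                 # same result goes to each server
    return result


if __name__ == "__main__":
    sent = []
    run_round(
        receivers=[lambda: 1, lambda: 2, lambda: 3],
        senders=[sent.append, sent.append, sent.append],
        combine=sum,
        train=lambda value: value * 10,
    )
    print(sent)
```

Because the receives run one after another, a slow server delays the whole round; that is the trade-off for not using threads.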
Thanks, that's helpful context. Is this in the context of a research project or is this system intended for production?
For research/prototyping, the multithreading option might still be the easiest way to do it (not the cleanest, just the easiest). You could have all three threads running. Each gets a message from its server in fit, puts that message in a shared data structure, and waits until all three messages are available. Then only one of them does the training and writes the result back to the shared data structure and, once the result is available, all three send it back to their respective server.
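The shared-data-structure idea could be sketched with threading.Barrier, whose action callback is run by exactly one thread once all parties have arrived. SharedRound, combine, and train are hypothetical names for illustration; the fit method here only mimics the role of a client's fit and is not the Flower signature:

```python
import threading


class SharedRound:
    """Collect one message per server, train once, hand the result to all."""

    def __init__(self, n_servers, combine, train):
        self.messages = {}
        self.result = None
        self._lock = threading.Lock()
        self._combine = combine
        self._train = train
        # The barrier action runs in one thread, before any waiter is released.
        self._barrier = threading.Barrier(n_servers, action=self._train_once)

    def _train_once(self):
        ordered = [self.messages[key] for key in sorted(self.messages)]
        self.result = self._train(self._combine(ordered))

    def fit(self, server_id, message):
        with self._lock:
            self.messages[server_id] = message
        self._barrier.wait()  # blocks until all servers have delivered a message
        return self.result    # every thread returns the same trained result


def demo():
    train_calls = []

    def train(value):
        train_calls.append(value)
        return value * 2

    shared = SharedRound(3, combine=sum, train=train)
    results = {}

    def worker(server_id, message):
        results[server_id] = shared.fit(server_id, message)

    threads = [threading.Thread(target=worker, args=(i, i + 1)) for i in range(3)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, len(train_calls)


if __name__ == "__main__":
    print(demo())
```

The barrier resets after each release, so the same object can be reused for the next round as long as every thread takes part in every round.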
Thanks!
I knew that the flow of multithreading would be like that, but I am currently working on how to do it without multithreading.
I tried adding more Join functions in the transport_pb2_grpc.py file and it worked.
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
"""Client and server classes corresponding to protobuf-defined services."""
import grpc

from flwr.proto import transport_pb2 as flwr_dot_proto_dot_transport__pb2


class FlowerServiceStub(object):
    """Missing associated documentation comment in .proto file."""

    def __init__(self, channel, channel_1, channel_2):
        """Constructor.

        Args:
            channel: A grpc.Channel.
        """
        # Join method for 1st server
        self.Join = channel.stream_stream(
            '/flwr.proto.FlowerService/Join',
            request_serializer=flwr_dot_proto_dot_transport__pb2.ClientMessage.SerializeToString,
            response_deserializer=flwr_dot_proto_dot_transport__pb2.ServerMessage.FromString,
        )
        # Join method for 2nd server
        self.Join_1 = channel_1.stream_stream(
            '/flwr.proto.FlowerService/Join',
            request_serializer=flwr_dot_proto_dot_transport__pb2.ClientMessage.SerializeToString,
            response_deserializer=flwr_dot_proto_dot_transport__pb2.ServerMessage.FromString,
        )
        # Join method for 3rd server
        self.Join_2 = channel_2.stream_stream(
            '/flwr.proto.FlowerService/Join',
            request_serializer=flwr_dot_proto_dot_transport__pb2.ClientMessage.SerializeToString,
            response_deserializer=flwr_dot_proto_dot_transport__pb2.ServerMessage.FromString,
        )


class FlowerServiceServicer(object):
    """Missing associated documentation comment in .proto file."""

    def Join(self, request_iterator, context):
        """Missing associated documentation comment in .proto file."""
        context.set_code(grpc.StatusCode.UNIMPLEMENTED)
        context.set_details('Method not implemented!')
        raise NotImplementedError('Method not implemented!')


def add_FlowerServiceServicer_to_server(servicer, server):
    rpc_method_handlers = {
        'Join': grpc.stream_stream_rpc_method_handler(
            servicer.Join,
            request_deserializer=flwr_dot_proto_dot_transport__pb2.ClientMessage.FromString,
            response_serializer=flwr_dot_proto_dot_transport__pb2.ServerMessage.SerializeToString,
        ),
    }
    generic_handler = grpc.method_handlers_generic_handler(
        'flwr.proto.FlowerService', rpc_method_handlers)
    server.add_generic_rpc_handlers((generic_handler,))


# This class is part of an EXPERIMENTAL API.
class FlowerService(object):
    """Missing associated documentation comment in .proto file."""

    @staticmethod
    def Join(request_iterator,
             target,
             options=(),
             channel_credentials=None,
             call_credentials=None,
             insecure=False,
             compression=None,
             wait_for_ready=None,
             timeout=None,
             metadata=None):
        print("\ninside Join\n")
        return grpc.experimental.stream_stream(request_iterator, target, '/flwr.proto.FlowerService/Join',
            flwr_dot_proto_dot_transport__pb2.ClientMessage.SerializeToString,
            flwr_dot_proto_dot_transport__pb2.ServerMessage.FromString,
            options, channel_credentials,
            insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
I also made changes in the connections.py file by adding the addresses of the other servers.
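The connections.py changes are not shown in this thread. As a rough sketch of one way to hold several channels without one overwriting another, each address can get its own channel object and its own stub, stored in a dict keyed by address. open_stubs, stub_cls, and channel_factory are hypothetical names; the default factory assumes grpc.insecure_channel, and passing a stub class that takes a single channel would avoid editing the generated file:

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def open_stubs(addresses, stub_cls, channel_factory=None):
    """Open one channel per address and wrap each in its own stub."""
    if channel_factory is None:
        import grpc  # imported lazily so the sketch can be tried with fakes

        channel_factory = grpc.insecure_channel
    with ExitStack() as stack:
        stubs = {}
        for address in addresses:
            # A separate variable per iteration: no channel replaces another.
            channel = stack.enter_context(channel_factory(address))
            stubs[address] = stub_cls(channel)
        yield stubs  # all channels are closed when the with-block exits
```

With an unmodified generated file this might be used as `with open_stubs(["localhost:8080", "localhost:5040"], FlowerServiceStub) as stubs:`, after which `stubs["localhost:5040"].Join` and `stubs["localhost:8080"].Join` would refer to different connections. Whether the rest of the client code can consume several stubs is a separate question this sketch does not answer.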
Describe the bug
I am working with Flower, a federated learning framework. Its gRPC connection file creates only 1 channel, whereas I want 2-3 channels. But when I created one more channel with server_address localhost:5040, the previous channel with server address localhost:8080 got overridden. How can I avoid that and use both channels?
Steps/Code to Reproduce
The code to reproduce the error is mentioned above.
Expected Results
I expect to see the connections to both the servers without getting overridden : )
Actual Results
Flower working successfully with multiple servers instead of just 1 server.