ray-project / rayfed

A multi-party joint distributed execution engine based on Ray that helps you build your own federated learning frameworks in minutes.
https://rayfed.readthedocs.io
Apache License 2.0

[Feature] send and recv use multiple cpus #131

Open da-niao-dan opened 1 year ago

da-niao-dan commented 1 year ago

The current send and recv are implemented as a gRPC service, and gRPC limits CPU use to one core per connection. https://github.com/grpc/grpc-swift/issues/992


This limit is by design, because gRPC targets the scenario where one service has to handle many connections.

However, between two clusters we may have only a handful of connections, each of which needs to send a large object to the other end. Sometimes we can overlap computation with this communication, but sometimes that is hard to do, particularly when communication time is much longer than computation time.

For example, suppose we have two secretflow clusters and one needs to send a large ciphertext to the other. Using multiple CPUs for send and receive could substantially speed up communication in this case.

Could you provide a way to use multiple cores when doing send and receive?
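
To make the request concrete, here is a minimal sketch of one possible approach: split the large object into chunks and send each chunk over its own connection. This is not RayFed's API. `split_payload`, `send_chunk`, and `parallel_send` are hypothetical names, and `send_chunk` is only a stand-in for a real per-connection gRPC call. A thread pool is used purely for illustration; if serialization is CPU-bound, separate processes would probably be needed to actually occupy multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_payload(payload: bytes, num_chunks: int) -> List[bytes]:
    """Split payload into at most num_chunks contiguous pieces."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be >= 1")
    # Ceiling division so that the chunks cover the whole payload.
    size = max(1, -(-len(payload) // num_chunks))
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def send_chunk(index: int, chunk: bytes) -> Tuple[int, bytes]:
    """Hypothetical stand-in for sending one chunk over its own connection.

    A real implementation would issue a gRPC call here; this sketch just
    returns what the receiver would get.
    """
    return index, chunk


def parallel_send(payload: bytes, num_connections: int = 4) -> bytes:
    """Send chunks concurrently and reassemble them in index order."""
    chunks = split_payload(payload, num_connections)
    with ThreadPoolExecutor(max_workers=num_connections) as pool:
        results = list(pool.map(send_chunk, range(len(chunks)), chunks))
    # The receiving side sorts by chunk index before joining.
    return b"".join(chunk for _, chunk in sorted(results))
```

The chunk index travels with each piece so the receiver can reassemble the object regardless of arrival order.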

jovany-wang commented 1 year ago

@NKcqx to triage.

da-niao-dan commented 1 year ago

This feature is fairly important for secretflow's HEU-based algorithms. Please help.