aidecentralized / sonar

SONAR - Self-Organizing Network of Aggregated Representations
MIT License

Rewrite MPI Communication utilities #96

Open tremblerz opened 2 weeks ago

tremblerz commented 2 weeks ago

In our MPI implementation, node A has to call comm.send and node B has to call comm.recv for a message to travel from node A to node B. In contrast, our gRPC implementation only requires the receiving node to request the data from the other node. The sender never makes an explicit send call, because each node runs a gRPC server on a parallel thread. The gRPC approach is much more scalable because it does not require synchronization between sender and receiver.

Therefore, we need to write an efficient MPI-based server which, similar to gRPC, runs on a parallel thread and serves models and tensors when requested by a receiving node.
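To make the idea concrete, here is a minimal sketch of what such a server thread could look like with mpi4py. It is not the project's implementation: `ModelServer`, `fetch`, `demo`, the tag values and the `get_payload` callback are all hypothetical names chosen for illustration. The sketch assumes the MPI library was initialized with `MPI.THREAD_MULTIPLE`, since the server thread and the main thread both make MPI calls.

```python
import threading
import time

# Hypothetical tag values; any two distinct tags would do.
REQUEST_TAG = 1
REPLY_TAG = 2


class ModelServer:
    """Background thread that answers pull requests from other ranks."""

    def __init__(self, comm, get_payload, poll_interval=0.001):
        self.comm = comm                # an mpi4py communicator (assumed)
        self.get_payload = get_payload  # callback: key -> picklable object
        self.poll_interval = poll_interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._serve, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _serve(self):
        while not self._stop.is_set():
            # Non-blocking check for a pending request from any source.
            if self.comm.iprobe(tag=REQUEST_TAG):
                requester, key = self.comm.recv(tag=REQUEST_TAG)
                self.comm.send(self.get_payload(key), dest=requester, tag=REPLY_TAG)
            else:
                time.sleep(self.poll_interval)  # avoid spinning at 100% CPU


def fetch(comm, my_rank, peer, key):
    """Receiver-side pull: the peer's main thread never calls send itself."""
    comm.send((my_rank, key), dest=peer, tag=REQUEST_TAG)
    return comm.recv(source=peer, tag=REPLY_TAG)


def demo():
    # Imported here so the classes above can be read and tested without MPI.
    from mpi4py import MPI

    if MPI.Query_thread() < MPI.THREAD_MULTIPLE:
        raise RuntimeError("this sketch needs MPI_THREAD_MULTIPLE")
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    store = {"model": {"owner": rank}}  # stand-in for real model weights
    server = ModelServer(comm, store.__getitem__)
    server.start()
    peer = (rank + 1) % comm.Get_size()
    print(rank, "fetched", fetch(comm, rank, peer, "model"))
    comm.barrier()  # keep serving until every rank has finished fetching
    server.stop()
```

To try it, call `demo()` from a script launched with something like `mpiexec -n 2 python script.py`. Polling with `iprobe` plus a short sleep keeps the thread responsive to `stop()`, at the cost of a small latency per request. A blocking `recv` in the server thread would avoid the polling but needs a separate shutdown message.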

If this turns out to be more effort than it is worth, we should also consider dropping MPI entirely.

LukeAnger1 commented 4 days ago

I don't know if people are still working on this, but I am working on a structure very similar to the gRPC protocol, but for MPI. There are some set values in comm.proto for the gRPC protocol. I am in favor of reusing this setup for MPI, but I don't want to fully commit to it if it isn't what is wanted.

Also, is this issue resolved by @kathrynle20's PR? If so, I will not work on my implementation.