adap / flower

Flower: A Friendly Federated AI Framework
https://flower.ai
Apache License 2.0

Benchmarks / Performance #694

Open lucastliu opened 3 years ago

lucastliu commented 3 years ago

Is there any documentation / metrics / benchmarks on Flower's performance?

I'd like to better understand the exact overheads / performance when using Flower. I understand that some metrics will vary depending on the exact implementation, but is there any sort of performance analysis for any setup (perhaps one of your example setups)?

Some example points of interest: memory usage, communication times, total latency, etc.

In particular I am curious about Raspberry Pi performance, but any sort of data here would be helpful.

tanertopal commented 3 years ago

Hi @lucastliu, there is currently no documentation regarding performance, but we are in the process of creating and setting up baselines. The client-side overhead in terms of memory or CPU usage is very low.

When you say communication times and total latency, what do you mean exactly? Do you mean the latency overhead of Federated Learning with regard to network communication, or the latency overhead the Flower code itself introduces on the server and client side (which is very low)?
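The split between compute time and framework/network overhead described above can be probed without any Flower-specific tooling. The sketch below uses only the standard library; `fake_local_training` is a stand-in for a client's local training step, not part of Flower's API. Comparing wall-clock time against CPU time gives a rough upper bound on how much of a call was spent waiting (e.g. on I/O or the network) rather than computing.

```python
import time

def measure(fn, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call.

    wall - cpu approximates time spent blocked (I/O, network, sleep)
    rather than computing, for a single-threaded call.
    """
    w0, c0 = time.perf_counter(), time.process_time()
    result = fn(*args)
    w1, c1 = time.perf_counter(), time.process_time()
    return result, w1 - w0, c1 - c0

def fake_local_training(n):
    # Stand-in for a client-side fit() step (pure compute).
    return sum(i * i for i in range(n))

result, wall, cpu = measure(fake_local_training, 100_000)
print(f"wall={wall:.4f}s cpu={cpu:.4f}s blocked~={max(wall - cpu, 0.0):.4f}s")
```

Wrapping the actual client round-trip (training plus communication) the same way, and subtracting the training-only measurement, isolates the communication share of a round.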

Would you be interested in creating benchmarks and contributing?

lucastliu commented 3 years ago

Short answer: yes, I am interested in all of this information.

Client-side memory and CPU usage are of interest to me: low is great, but I wanted to know whether you had any concrete numbers to go on. (I am trying to push onto low-end devices for edge computing, so the specifics of "low" are relevant.)

I definitely want to know about network communication. I understand that particular setups will have different results, but I would like to get an idea of how much time and resource one round of communication between a client and a server spends on "Flower code communication setup." (This would not include the main part of sending all the data / weights, just the other parts that Flower is taking care of explicitly.)

Again, I recognize that this may vary per device, but even an example (where the parameters / devices / setup are shared) would be useful to go on.
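One piece of the per-round cost asked about here, the size and serialization time of the weight payload, can be estimated independently of Flower. A rough sketch, with the caveat that Flower actually ships NumPy arrays over gRPC/protobuf, so plain lists plus `pickle` only approximate the payload size and encoding cost, not Flower's real wire format:

```python
import pickle
import random
import time

# Dummy "model weights": a few layers of floats. In a real Flower
# client these would be NumPy ndarrays returned by get_parameters().
weights = [[random.random() for _ in range(10_000)] for _ in range(4)]

t0 = time.perf_counter()
blob = pickle.dumps(weights)          # stand-in for wire encoding
ser_time = time.perf_counter() - t0

restored = pickle.loads(blob)         # stand-in for decoding on the peer

print(f"payload ~ {len(blob) / 1024:.1f} KiB, serialized in {ser_time * 1000:.2f} ms")
```

Dividing the measured payload size by the link bandwidth (e.g. a Raspberry Pi's Wi-Fi throughput) then gives a floor for the transfer time per round; anything above that floor is protocol and framework overhead.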

To give further context: I am trying to get some sort of analysis of an end-to-end Federated Learning system. With specific implementations on the client / server for training (such as TF or PyTorch), there are established ways of characterizing the impact / performance. I would like to use Flower as "the glue" to connect everything, but I would like to fill in the gaps regarding what effect Flower's contribution has, especially during communication.
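For this kind of end-to-end characterization, one approach is to wrap each client method with a timing decorator and collect per-method durations across rounds. The sketch below uses a stub class whose method names mirror Flower's `NumPyClient` interface (`get_parameters`, `fit`); in a real setup the class would subclass `fl.client.NumPyClient` instead, and `timed` is a hypothetical helper, not a Flower API:

```python
import functools
import time

def timed(fn):
    """Record the wall-clock duration of each call on the instance."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        t0 = time.perf_counter()
        out = fn(self, *args, **kwargs)
        self.timings.setdefault(fn.__name__, []).append(time.perf_counter() - t0)
        return out
    return wrapper

class StubClient:
    # Stand-in for a Flower NumPyClient: same method names, but no
    # networking or ML framework, so the numbers isolate client logic.
    def __init__(self):
        self.timings = {}

    @timed
    def get_parameters(self):
        return [0.0] * 10

    @timed
    def fit(self, parameters):
        # Pretend local training: nudge every weight.
        return [w + 0.1 for w in parameters]

client = StubClient()
client.fit(client.get_parameters())
print({name: len(calls) for name, calls in client.timings.items()})
```

Subtracting these per-method timings from the total round time observed at the server would then attribute the remainder to communication and framework glue.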