Investigate why E2E test requires 2 workers

jlewi / flaap

Federated Learning and Analytics Protocols

Investigate why E2E test requires 2 workers #20

Open jlewi opened 2 years ago

jlewi commented 2 years ago

The E2E test currently creates tasks for two different groups. It's not clear why, because the data being passed to the program should require only a single worker.

https://github.com/jlewi/flaap/blob/9af7e29d45e6e79a0b62e19e75c47e56a64c4184/py/flaap/testing/fed_average.py#L104

Creating this issue to investigate it further as it likely indicates a bug due to a misunderstanding of how TFF works.

jlewi commented 2 years ago

I suspect this has something to do with how the stack of executors (e.g. federating and resolving executors) is constructed. We probably need to control how that stack is constructed so that the executors handling work in each silo are kept separate from the intermediary executors that aggregate results.
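
To make the hypothesis concrete, here is a toy model in plain Python. It is only a sketch: `LeafExecutor`, `max_clients`, and `assign_clients` are hypothetical names made up for illustration, none of them are TFF APIs, and this is not a claim about what the E2E test actually does. It just shows how the number of workers needed could depend on the cardinality each leaf of the stack is set up to handle, rather than on the data alone.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LeafExecutor:
    # Hypothetical stand-in for a worker in one silo; not a TFF class.
    name: str
    max_clients: int  # largest client cardinality this leaf will accept


def assign_clients(num_clients: int, leaves: List[LeafExecutor]) -> Dict[str, int]:
    """Greedily spread `num_clients` over the leaves, in the order given."""
    assignment: Dict[str, int] = {}
    remaining = num_clients
    for leaf in leaves:
        if remaining == 0:
            break
        take = min(leaf.max_clients, remaining)
        if take > 0:
            assignment[leaf.name] = take
        remaining -= take
    if remaining > 0:
        raise ValueError(f"{remaining} client(s) left without a worker")
    return assignment


# Two clients, leaves that each accept one client: two workers get tasks.
print(assign_clients(2, [LeafExecutor("w0", 1), LeafExecutor("w1", 1)]))
# Two clients, one leaf that accepts two: a single worker gets all the work.
print(assign_clients(2, [LeafExecutor("w0", 2)]))
```

If something like this is what is happening, the fix would be in how the leaves are configured, not in the data passed to the program.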

jlewi commented 2 years ago

See this TFF question. https://discuss.tensorflow.org/t/what-does-it-mean-for-a-tff-executor-to-handle-a-cardinality-with-an-integer-greater-than-1/12277

I think this validates the aforementioned hypothesis.