hacql2004 closed this issue 4 years ago
Hi @hacql2004, thanks for your interest!
This project reuses the distributed training algorithms of vanilla XGBoost. In those algorithms, each node exchanges summaries of its data with the others, so perhaps this is sufficient from the perspective of federated learning. We make minor modifications to change the communication pattern, however, so that each node sends its summary to a centralized aggregator, and not to the other nodes.
We are in the process of adding TLS protection for communication between each node and the aggregator. For now, we do not consider stronger guarantees than that (e.g. encrypting the summaries, and aggregating encrypted summaries), but that would be a very interesting feature to add in the future. Happy to collaborate on this, in case you would like to contribute :-) For stronger privacy guarantees, please check our Secure XGBoost project that uses hardware enclaves to keep all the data private at all times.
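To make the "summaries go to a centralized aggregator" idea concrete, here is a minimal sketch. It is not the project's actual code: the per-bin gradient histogram stands in for whatever summary the real algorithm exchanges, and the names `local_summary`, `aggregate`, and `bin_edges` are hypothetical.

```python
from typing import List, Sequence


def local_summary(values: Sequence[float],
                  gradients: Sequence[float],
                  bin_edges: Sequence[float]) -> List[float]:
    """Client side (hypothetical): sum gradients per feature bin.

    Only this histogram is sent out; the raw rows stay on the client.
    """
    hist = [0.0] * (len(bin_edges) + 1)
    for v, g in zip(values, gradients):
        # Bin index = number of edges that the value has reached or passed.
        b = sum(1 for e in bin_edges if v >= e)
        hist[b] += g
    return hist


def aggregate(summaries: Sequence[Sequence[float]]) -> List[float]:
    """Aggregator side (hypothetical): element-wise sum of client summaries."""
    return [sum(col) for col in zip(*summaries)]


# Two clients summarize their local data with the same bin edges.
edges = [1.0, 2.0]
summary_a = local_summary([0.5, 1.5, 2.5], [1.0, 2.0, 3.0], edges)
summary_b = local_summary([1.0, 1.2], [0.5, 0.5], edges)
merged = aggregate([summary_a, summary_b])
```

In this sketch the aggregator only ever sees `summary_a` and `summary_b`, never the underlying values. Note that the summaries themselves are sent in the clear here, which matches the statement that stronger guarantees are not yet considered.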
Thanks for your reply, but I still feel a bit confused here. In your reply you mentioned that 'We make minor modifications to change the communication pattern, however, so that each node sends the summary to a centralized aggregator, and not to the other nodes.' Did you mean that all client nodes communicate only with the server node in the original vanilla XGBoost? Or does the original XGBoost not support this, so that it holds only after your modifications?
I ask because I notice that the checkpoint/lazycheckpoint functions back up a client node's local data and recover that data from other client nodes if the node shuts down unexpectedly. This means that direct data transfer between client nodes exists in the original XGBoost. Could you explain this further? Thanks.
Here is the related part taken from the official rabit tutorial:
@hacql2004 we designate one node to be the centralized aggregator, and all other nodes to be clients. The aggregator establishes a connection with each client, and each client talks only to the aggregator. In other words, if there are n clients, the aggregator establishes n connections, and each client establishes 1 connection (to the aggregator).
You're correct that in vanilla XGBoost, clients may talk to one another. In Federated XGBoost, they cannot. This is exactly the distinction in communication pattern between vanilla and Federated XGBoost.
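The connection counts described above can be checked with a small sketch. This only models the topology as a set of edges; the node names and the helpers `star_topology` and `connections_per_node` are hypothetical and do not come from the project.

```python
from typing import Dict, Set, Tuple

AGGREGATOR = "aggregator"  # hypothetical node name


def star_topology(n_clients: int) -> Set[Tuple[str, str]]:
    """Return one (aggregator, client) edge per client and nothing else."""
    return {(AGGREGATOR, f"client-{i}") for i in range(n_clients)}


def connections_per_node(edges: Set[Tuple[str, str]]) -> Dict[str, int]:
    """Count how many connections each node participates in."""
    counts: Dict[str, int] = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


edges = star_topology(4)
counts = connections_per_node(edges)
# True if any edge links two clients directly (should never happen here).
client_to_client = any(AGGREGATOR not in edge for edge in edges)
```

With 4 clients, the aggregator ends up with 4 connections, each client with 1, and no edge joins two clients.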
Here's a diagram of the communication pattern and general (simplified) workflow in Federated XGBoost. Hope this reduces the confusion!
It's clear now, thanks.
Hi, I'm a fan of your mc2 project and glad to see the recent upgrades to federated-xgboost. It seems that some portable functions (like listening on a port and aggregator invitation) have been added in the new version. I still have some questions about your project.