dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Horizontal Federated Learning with Secure Features RFC #10170

Closed ZiyueXu77 closed 1 month ago

ZiyueXu77 commented 5 months ago

Motivation

XGBoost already supports Federated Learning, both horizontal and vertical, but its support for secure features is limited. Common Homomorphic Encryption (HE) schemes (Paillier, BFV/BGV, or CKKS) support only basic arithmetic operations, namely addition and multiplication, so the current horizontal and vertical pipelines cannot be integrated with them directly: the server and/or clients need to perform operations that HE schemes do not support, including division and argmax. It would be useful to implement a variation of the current horizontal federated learning XGBoost that provides a solution with secure features.

Secure Pattern

Our current horizontal FL design is:

Because the local histograms are transmitted across parties (in a federated setting, typically over external communication channels), there is a concern that local histogram information could be leaked to and learned by a third party. Users may therefore need to protect the local histograms.

There is no major difference between the proposed method and our current HE solution for horizontal deep learning pipelines.

Goals

Non-Goals

Assumptions

Same assumptions as our current horizontal federated learning scheme:

Risks

No fundamental risk, since we already implemented secure vertical XGBoost by adding functions to the XGBoost codebase. Still, care must be taken not to break existing functionality or make regular training harder.

Design for Encrypted Horizontal Training

Using only the basic HE operation of addition, a feasible solution can be achieved. Since it may not be straightforward to couple AllReduce with cipher-text addition, we can break it into two steps: AllGather + cipher-text addition. We will use the processor interface designed and implemented in secure vertical XGBoost for encryption, aggregation, and decryption.
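The two-step idea (AllGather, then cipher-text addition) can be sketched in plain Python. This is only an illustration under stated assumptions: it uses a toy Paillier implementation with small fixed primes as a stand-in for a real HE library, it is not secure, and names such as `encrypt`, `add_ciphertexts`, and `SCALE` are hypothetical and not part of XGBoost or NVFlare.

```python
import math
import random

# Toy Paillier keypair built from two Mersenne primes.
# Illustration only: these parameters are NOT secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid for the generator g = N + 1

SCALE = 10**6  # hypothetical fixed-point scale for float histogram entries


def encode(x: float) -> int:
    """Map a float to a residue mod N (negative values wrap around)."""
    return round(x * SCALE) % N


def decode(v: int) -> float:
    """Inverse of encode: residues above N/2 represent negative values."""
    if v > N // 2:
        v -= N
    return v / SCALE


def encrypt(m: int) -> int:
    """Enc(m) = (1 + N)^m * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = random.randrange(2, N)
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) % N2) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Dec(c) = L(c^lam mod N^2) * mu mod N, with L(u) = (u - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def add_ciphertexts(a: int, b: int) -> int:
    """Multiplying Paillier cipher-texts adds the underlying plaintexts."""
    return a * b % N2


# Three parties, each holding a local histogram with three bins.
local_histograms = [
    [0.5, -1.25, 2.0],
    [1.5, 0.25, -3.0],
    [-0.75, 4.0, 1.0],
]

# Each party encrypts its histogram before it leaves the party.
encrypted = [[encrypt(encode(x)) for x in hist] for hist in local_histograms]

# Step 1: AllGather, modelled here as simply collecting every encrypted histogram.
gathered = encrypted

# Step 2: cipher-text addition, bin by bin, without ever decrypting.
aggregated = gathered[0]
for hist in gathered[1:]:
    aggregated = [add_ciphertexts(a, b) for a, b in zip(aggregated, hist)]

# Only a key holder can recover the global histogram.
global_histogram = [decode(decrypt(c)) for c in aggregated]
print(global_histogram)
```

The party doing the addition only ever handles cipher-texts; the decrypted result equals the bin-wise sum of the local histograms.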

XGBoost Interface

How the processor interface works:

Upon responding to the AllGather call,

  1. Each party sends its local G/H histograms to the interface (by calling a specific function).
  2. The interface processes and prepares the buffer and sends it to xgboost, which forwards it to the local gRPC handler, where encryption is performed and the encrypted local histograms are sent to the server.
  3. The server collects AND AGGREGATES the global information and sends it back to the local gRPC handlers.
  4. The global histograms are processed and decrypted via the interface for each party. Refer to secure vertical XGBoost for details of the xgboost-interface communication patterns.
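The four steps above can be mimicked end to end in a small simulation. This is a rough sketch under several assumptions: `ToyProcessor`, `prepare_buffer`, `process_result`, and `server_aggregate` are hypothetical names, not the actual processor interface. Buffer preparation and encryption are merged into one method for brevity, although the RFC places encryption in the local gRPC handler. The "encryption" is a keyed additive mask standing in for a real HE scheme.

```python
import hashlib
import hmac
from typing import List

MODULUS = 2**64  # all masked arithmetic is done modulo this value
SCALE = 10**6    # hypothetical fixed-point scale for histogram entries


def _pad(key: bytes, round_id: int, rank: int, bin_id: int) -> int:
    """Pseudo-random mask derived from the client-side secret key."""
    msg = f"{round_id}:{rank}:{bin_id}".encode()
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:8], "big")


class ToyProcessor:
    """Hypothetical stand-in for the processor interface on one party."""

    def __init__(self, key: bytes, rank: int, world_size: int):
        self.key, self.rank, self.world_size = key, rank, world_size

    def prepare_buffer(self, round_id: int, hist: List[float]) -> List[int]:
        # Steps 1-2: take the local G/H histogram and emit a masked buffer.
        return [
            (round(x * SCALE) + _pad(self.key, round_id, self.rank, b)) % MODULUS
            for b, x in enumerate(hist)
        ]

    def process_result(self, round_id: int, buf: List[int]) -> List[float]:
        # Step 4: strip every party's mask from the aggregated buffer.
        out = []
        for b, c in enumerate(buf):
            pads = sum(_pad(self.key, round_id, r, b) for r in range(self.world_size))
            v = (c - pads) % MODULUS
            if v >= MODULUS // 2:  # upper half of the range encodes negatives
                v -= MODULUS
            out.append(v / SCALE)
        return out


def server_aggregate(buffers: List[List[int]]) -> List[int]:
    # Step 3: the server adds masked buffers element-wise; it holds no key.
    return [sum(col) % MODULUS for col in zip(*buffers)]


key = b"shared-by-clients-only"
parties = [ToyProcessor(key, rank, 3) for rank in range(3)]
local = [[1.0, 2.5], [0.5, -1.0], [2.0, 0.25]]

sent = [p.prepare_buffer(0, h) for p, h in zip(parties, local)]
aggregated = server_aggregate(sent)
results = [p.process_result(0, aggregated) for p in parties]
print(results)
```

Every party ends up with the same global histogram, while the server only ever sees masked integers.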

Potentially, there are two options for the global histogram aggregation:

Given the potential concern, the second option is preferred; it will be performed on the FL side (e.g. NVFlare). Therefore, although the call is named "AllGather", the actual global aggregation has already been performed at the server before the AllGather results are received. The interface will provide the functionality to properly process the received buffer.

As in secure vertical XGBoost, only encrypted messages go from local to external (controlled by NVFlare); clear-text information stays local.

Encryption Scheme:

Implementing an alternative horizontal scheme will likely have no adverse impact on user experience, since it only modifies the information flow without changing the underlying theory. For best efficiency, a suitable encryption scheme needs to be selected. Vertical XGBoost uses Paillier because it faces heavy single-number additions (on the order of the sample size), whereas horizontal XGBoost faces light vector additions (on the order of the number of parties). Hence, CKKS is the preferred option.
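A back-of-the-envelope count may help make the comparison concrete. The sketch below is an estimate under assumptions, not a measurement. It relies on the fact that a CKKS cipher-text with polynomial modulus degree N packs N/2 values. The vertical count assumes roughly one cipher-text addition per sample, per feature, for each of G and H. The function names and example sizes are hypothetical.

```python
import math


def ckks_ciphertexts(num_values: int, poly_modulus_degree: int = 8192) -> int:
    """Cipher-texts needed to pack num_values numbers (N/2 slots each)."""
    slots = poly_modulus_degree // 2
    return math.ceil(num_values / slots)


def horizontal_additions(num_parties: int, num_features: int, num_bins: int,
                         poly_modulus_degree: int = 8192) -> int:
    """Vector additions to sum one packed G/H histogram across all parties."""
    values = 2 * num_features * num_bins  # G and H entries
    return (num_parties - 1) * ckks_ciphertexts(values, poly_modulus_degree)


def vertical_additions(num_samples: int, num_features: int) -> int:
    """Rough count of single-number additions when each sample's encrypted
    G and H are accumulated into one bin per feature (assumed cost model)."""
    return 2 * num_samples * num_features


# Hypothetical sizes chosen only for illustration.
h = horizontal_additions(num_parties=5, num_features=30, num_bins=256)
v = vertical_additions(num_samples=100_000, num_features=30)
print(h, v)
```

Under these assumed sizes the horizontal case needs a handful of packed vector additions, while the vertical case needs millions of scalar additions, which is the asymmetry the paragraph above describes.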

Most related existing PRs

The most related PRs to base upon and modify are ones from secure vertical for interface and integration:

Task list to track the progress

ZiyueXu77 commented 1 month ago

Implementation done, close this RFC