We recently published a document that describes a new algorithm: WALR. This is a proposal for how to train a Logistic Regression model in MPC. This is a alternative approach to machine learning that can be viewed as an alternative approach to the "noisy labels" approach we have previously discussed in this group.
With this approach, no individually linkable outputs are required. The algorithm only produces aggregate data which are essentially averages across the entire population. Of course, these are made differentially private as well to provide robust privacy protections.
Logistic Regression is not a very complex or sophisticated type of "machine learning", but it has several major benefits:
Simplicity: Logistic regression is a simple and interpretable algorithm, making it easier to understand and explain. Deep learning models can be highly complex and challenging to interpret.
Computationally Efficient: WALR provides a computationally efficient approach to performing Logistic Regression in MPC. Training a deep learning model requires significant computational resources even when performed in the clear. Training in MPC will likely require orders of magnitude more computational resources.
Smaller Data Volumes Required: Logistic regression works well with smaller datasets, whereas deep learning typically requires large volumes of data to perform effectively. In cases where data is limited, logistic regression may be a more practical choice.
Time
Estimated time by section:
The high level objective, output of WALR, and how to interpret it: 10 minutes.
The derivation of WALR, and how it is implemented in MPC: 10 minutes.
Agenda+: WALR - Weighted Aggregate Logistic Regression
We recently published a document that describes a new algorithm: WALR. This is a proposal for how to train a Logistic Regression model in MPC. This is a alternative approach to machine learning that can be viewed as an alternative approach to the "noisy labels" approach we have previously discussed in this group.
With this approach, no individually linkable outputs are required. The algorithm only produces aggregate data which are essentially averages across the entire population. Of course, these are made differentially private as well to provide robust privacy protections.
Logistic Regression is not a very complex or sophisticated type of "machine learning", but it has several major benefits:
Time
Estimated time by section:
Links