## Feature Description

This feature will eventually add Decision Trees to the PyDP library, along with the control mechanisms needed to use them as part of a pipeline.
**Is your feature request related to a problem?**
No; PyDP currently lacks support for decision trees.

**What alternatives have you considered?**
No alternatives currently exist.

**Are you interested in working on this yourself?**
Yes.
## Additional Context
Given that the scope of this issue is broad, it will eventually be broken down into smaller issues. Here's an outline of the functionalities to be implemented:
- [ ] **Base Decision Tree Model**

  The base Decision Tree model: a vanilla ID3-based decision tree.
  - ID3 construction algorithm
  - Differentially private algorithm: constructs and updates the tree in a differentially private way.
  References:
  - Semi-Supervised Learning Approach to Differential Privacy by G. Jagannathan, C. Monteleoni, and K. Pillaipakkamnatt
  - A Practical Differentially Private Random Decision Tree Classifier by Geetha Jagannathan, Krishnan Pillaipakkamnatt, and Rebecca N. Wright
  - Differentially Private Random Decision Forests using Smooth Sensitivity by Sam Fletcher and Md Zahidul Islam
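A common way to make the ID3 split step differentially private (used in the tree literature above) is the exponential mechanism: instead of deterministically picking the attribute with the highest information gain, each candidate is sampled with probability proportional to exp(ε · gain / 2Δ). The sketch below is illustrative only and is not PyDP's API; the function name, the parallel-list interface, and the unit sensitivity bound are all assumptions.

```python
import math
import random

def dp_choose_split(attributes, gains, epsilon):
    """Pick a split attribute via the exponential mechanism.

    `attributes` and `gains` are parallel lists: `gains[i]` is the
    information gain of splitting on `attributes[i]`.  The sensitivity
    of information gain is taken as 1.0 for illustration; the real
    bound depends on the dataset and must be derived separately.
    """
    sensitivity = 1.0  # illustrative assumption, not a general bound
    # Exponential-mechanism weights: exp(eps * score / (2 * sensitivity)).
    weights = [math.exp(epsilon * g / (2 * sensitivity)) for g in gains]
    # Sample one attribute with probability proportional to its weight.
    total = sum(weights)
    r = random.uniform(0, total)
    cumulative = 0.0
    for attr, w in zip(attributes, weights):
        cumulative += w
        if r <= cumulative:
            return attr
    return attributes[-1]
```

With a large ε the highest-gain attribute is chosen almost surely; as ε shrinks the choice approaches uniform, which is where the privacy comes from.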
- [ ] **Differentially Private Bagging**

  An algorithm that partitions the data multiple times in order to achieve differentially private subsample-and-aggregate.
  References:
  - Differentially Private Bagging: Improved utility and cheaper privacy than subsample-and-aggregate by James Jordon, Jinsung Yoon, and Mihaela van der Schaar
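The core of the Jordon et al. construction is partitioning the dataset several times so that any single record lands in exactly one subset per partition, bounding how many downstream models it can influence. A minimal sketch of that partitioning step (the function name and round-robin assignment are my own, not PyDP's or the paper's):

```python
import random

def multi_partition(data, n_partitions, n_subsets, seed=0):
    """Partition `data` `n_partitions` times into `n_subsets` disjoint
    subsets each.

    Every record appears in exactly one subset per partition, so it can
    influence at most `n_partitions` of the `n_partitions * n_subsets`
    models trained downstream, which is what yields the cheaper privacy
    accounting relative to plain subsample-and-aggregate.
    """
    rng = random.Random(seed)
    partitions = []
    for _ in range(n_partitions):
        shuffled = data[:]
        rng.shuffle(shuffled)
        # Round-robin assignment gives near-equal subset sizes.
        subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
        partitions.append(subsets)
    return partitions
```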
[ ] Support for Horizontally and Vertically Partitioned Data
This will extend the functionality of the Private Decision Tree to be able to work with horizontally distributed data (by means of incremental learning) and vertically distributed data.
Vertically distributed data requires each stakeholder to construct a differentially private decision tree (assuming no overlap in the attributes) and makes a union.
  References:
  - A Practical Differentially Private Random Decision Tree Classifier by Geetha Jagannathan, Krishnan Pillaipakkamnatt, and Rebecca N. Wright
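Once each stakeholder has trained a tree on its own attribute subset, the union of trees can answer queries jointly, e.g. by majority vote over the individual predictions. A toy sketch of that aggregation step (treating each tree as a plain callable is an assumed interface for illustration, not PyDP's design):

```python
from collections import Counter

def union_predict(party_trees, record):
    """Classify `record` with the union of per-party trees.

    Each party trains a differentially private tree on a disjoint
    attribute subset; here a "tree" is any callable mapping a record
    to a class label, and the union predicts by majority vote.
    """
    votes = [tree(record) for tree in party_trees]
    return Counter(votes).most_common(1)[0][0]

# Toy stand-ins for trees trained on disjoint attribute sets
# (attribute names are hypothetical).
trees = [
    lambda r: "yes" if r.get("outlook") == "sunny" else "no",
    lambda r: "yes" if r.get("humidity", 100) < 70 else "no",
    lambda r: "yes",
]
```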
- [ ] **Boosted Differentially Private Ensembles**

  Add boosting to differentially private decision trees to enable ensemble formation.
  References:
  - Boosted and Differentially Private Ensembles of Decision Trees by Richard Nock and Wilko Henecka
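For orientation, the non-private skeleton that boosting adds on top of the trees is an AdaBoost-style reweighting loop: each round trains a (differentially private) tree, then upweights the records it misclassified. The sketch below shows only that classic reweighting step; how Nock and Henecka budget privacy across rounds is different and is detailed in their paper.

```python
import math

def boost_round(weights, errors_mask):
    """One AdaBoost-style reweighting step.

    `weights[i]` is record i's current weight; `errors_mask[i]` is True
    if the current round's tree misclassified record i.  Returns the
    normalized new weights and the tree's vote coefficient alpha.
    """
    # Weighted error of the current tree, clamped away from 0 and 1.
    err = sum(w for w, bad in zip(weights, errors_mask) if bad) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)
    # Upweight misclassified records, downweight correct ones.
    new = [w * math.exp(alpha if bad else -alpha)
           for w, bad in zip(weights, errors_mask)]
    total = sum(new)
    return [w / total for w in new], alpha
```

After each round the misclassified records carry half the total weight, forcing the next tree to focus on them.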
More details can be found here as well as in the reference papers.