Xtra-Computing / FedTree

A tree-based federated learning system (MLSys 2023)
https://fedtree.readthedocs.io/en/latest/index.html
Apache License 2.0

Specify Data Partition in Horizontal Federated Tree Training #53

Closed WilliamLindskog closed 1 year ago

WilliamLindskog commented 1 year ago

Hi,

First, thank you for this project!

I am running FedTree on one computer and would like to assign each party (currently 2) its own data set (a sample of the whole data set) to train on. I am using the Python code and see that fedtree.py has a variable named path, but it doesn't seem to do anything. How can I achieve this in stand-alone simulation? Is there another way to assign each party its local data set?

Best regards, W

QinbinLi commented 1 year ago

Hi @WilliamLindskog,

You can specify the local datasets through the command-line interface using the data parameter, which accepts multiple paths to local datasets separated by commas. For example, if you write a configuration file with the following parameters, it will load the local datasets a and b as two parties. Currently, the Python interface does not support this feature and can only load a single dataset, which it then partitions for simulation. We will add this feature to the Python interface later.

data=a,b
n_parties=2
mode=horizontal
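
If you start from a single data file, one way to prepare such a configuration is to split the file yourself and write the three parameters above into a config file. The following is only a minimal sketch and not part of FedTree: the function name split_for_parties, the file names party_0.txt, party_1.txt and horizontal_local.conf, and the round-robin split are all assumptions made for illustration. It also assumes the data file stores one sample per line.

```python
from pathlib import Path


def split_for_parties(src_path, n_parties, out_dir):
    """Split a one-sample-per-line data file into n_parties local files and
    write a config file listing them in the `data` parameter.

    Hypothetical helper for illustration; not a FedTree API.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Keep non-empty lines only and make sure each ends with one newline.
    with open(src_path) as f:
        samples = [line.rstrip("\n") + "\n" for line in f if line.strip()]

    party_paths = []
    for i in range(n_parties):
        part = out_dir / f"party_{i}.txt"  # hypothetical file name
        # Round-robin split: party i gets samples i, i + n, i + 2n, ...
        part.write_text("".join(samples[i::n_parties]))
        party_paths.append(str(part))

    # Same three parameters as in the configuration shown above.
    conf_lines = [
        "data=" + ",".join(party_paths),
        f"n_parties={n_parties}",
        "mode=horizontal",
    ]
    conf_path = out_dir / "horizontal_local.conf"  # hypothetical file name
    conf_path.write_text("\n".join(conf_lines) + "\n")
    return party_paths, str(conf_path)
```

Calling split_for_parties("train.txt", 2, "parties") would then produce two local files and a config file that can be passed to the command-line interface. A round-robin split gives each party a similar share of the samples; replace it with whatever partition you actually want to simulate.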