Xtra-Computing / FedTree

A tree-based federated learning system (MLSys 2023)
https://fedtree.readthedocs.io/en/latest/index.html
Apache License 2.0

Using a pretrained horizontal FL classification model on the Python interface + Local datasets for FLClassifier? #40

Closed abrahamcanafe closed 2 years ago

abrahamcanafe commented 2 years ago

Is it possible to load a pretrained horizontal FL .model file into an FLClassifier object through the Python interface and then use the predict() method to make new predictions, even if the pretrained model was generated via the command-line interface?

Alternatively, is there a way to use both the standalone build and the Python interface in order to train a horizontal FL model in which the clients supply their own local datasets? From my understanding of fedtree.py and scikit_fedtree.cpp, it appears that FLClassifier can only simulate FL by partitioning a single dataset, rather than importing separate datasets from the local parties. (Is this correct?)


I am trying to train a horizontal FL classification FedTree model using local datasets supplied by each of the individual parties. I can run the CLI Distributed Horizontal FedTree example (as well as classifier_example.py) just fine, but it's not clear to me how I can make new predictions after loading the .model files with the Python API. When I use the predict() method right after creating an FLClassifier object and loading a model via load_model(), my program immediately terminates (unless I have trained a new model with the fit() method, which would defeat the purpose of what I am trying to do).


Any help would be greatly appreciated!

QinbinLi commented 2 years ago

@abrahamcanafe,

I have fixed this issue. You should now be able to use the Python interface to load a model trained with the CLI and make predictions. We will also improve the Python interface in the future so that it supports importing separate datasets in the distributed setting. Thanks!
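
For anyone landing here later, the load-then-predict flow could look roughly like the sketch below. It is only an illustration under stated assumptions: the import path `from fedtree import FLClassifier`, the no-argument constructor, and the exact `load_model(path)` signature are assumptions to check against your installed version, and `predict_with_saved_model` is a hypothetical helper name, not part of FedTree. The constructor arguments may also need to match the settings used for the CLI training run.

```python
import os


def predict_with_saved_model(model_path, X, make_classifier=None):
    """Load a saved .model file into a classifier and predict on X.

    make_classifier is any zero-argument callable returning an object
    with load_model(path) and predict(X); it defaults to FLClassifier.
    """
    # Fail early with a clear error instead of letting the native
    # library terminate the process on a missing file.
    if not os.path.isfile(model_path):
        raise FileNotFoundError(f"model file not found: {model_path}")

    if make_classifier is None:
        # Assumption: import path and default constructor arguments.
        from fedtree import FLClassifier
        make_classifier = FLClassifier

    clf = make_classifier()
    clf.load_model(model_path)  # no fit() call before predicting
    return clf.predict(X)


# Hypothetical usage (path and data are placeholders):
# y_pred = predict_with_saved_model("fedtree.model", X_test)
```

The factory argument is only there so the helper can be exercised with a stand-in object when FedTree is not installed.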

abrahamcanafe commented 2 years ago

Thanks for the quick response! I was indeed able to use the Python interface to load a model trained with the CLI.