kubeflow / examples

A repository to host extended examples and tutorials
Apache License 2.0

Federated Learning for Hemodialysis data analysis #1078

Open 482170765 opened 6 months ago

482170765 commented 6 months ago

Federated learning is well suited to hemodialysis patient data analysis, and its benefits are twofold. First, it can help predict the abrupt pressure drops that can be lethal during hemodialysis treatment. Through distributed training, an aggregation strategy combines the models of several clients, each with lower individual accuracy, into a global model with higher prediction accuracy. Second, it keeps private patient information from travelling across the cloud to a single central server, which would be an attractive target for attackers.

We have implemented federated learning on hemodialysis data using a Kubeflow pipeline. The repo uses three containers: one for the server and one for each of two clients. The server and client containers communicate through a k8s HTTP service.
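For readers unfamiliar with the server-side aggregation step, here is a minimal sketch of one common strategy, weighted federated averaging (FedAvg). This issue does not state which aggregation strategy the repo actually uses, so treat the function name `federated_average`, the dict-of-lists weight layout, and the sample-count weighting as assumptions made for illustration only.

```python
from typing import Dict, List, Sequence

# Hypothetical weight layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(
    client_weights: Sequence[Weights],
    client_sizes: Sequence[int],
) -> Weights:
    """Combine client models into one global model.

    Each parameter is averaged across clients, weighted by the number of
    training samples each client holds (the FedAvg rule). Only model
    weights reach the server; raw patient records never leave a client.
    """
    if not client_weights:
        raise ValueError("at least one client is required")
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    # Two clients, as in the setup described above; the numbers are made up.
    client_a = {"dense": [1.0, 3.0]}
    client_b = {"dense": [3.0, 5.0]}
    print(federated_average([client_a, client_b], [100, 300]))
```

In a setup like the one described, the server would call something like this once per round after receiving each client's weights over HTTP, then send the averaged weights back to the clients for the next round of local training.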

Each client trained separately reaches a training accuracy of around 70%, while federated learning reaches around 90%. Evidence of the improvement can be seen in the README of the repo: https://github.com/sefgsefg/Federated-Learning-on-kubeflow/tree/main

Such a pipeline could be useful to the Kubeflow community. People can reuse the pipeline structure we provide and plug in a new dataset to see how much federated learning improves prediction accuracy. Because the raw data stays with each client, data privacy is preserved by design.