kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.5k stars 1.57k forks

[new feature] <we added a pipeline for federated learning on Hemodialysis data> #10342

Open 482170765 opened 6 months ago

482170765 commented 6 months ago

Feature Area

Federated learning is crucial for Hemodialysis patient data analysis. Its benefits are twofold. First, it can help predict an abrupt pressure drop, which can be lethal during Hemodialysis treatment. Through distributed training, it uses an aggregation strategy to reach a higher overall prediction accuracy than the individual clients reach on their own. Second, it keeps private patient information from travelling across the cloud to a single server, which is a paradise for hackers. A pipeline focused specifically on Federated learning applications would be a valuable addition.
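
For readers unfamiliar with the aggregation step mentioned above, here is a minimal sketch, assuming the common federated averaging (FedAvg) scheme. The issue does not state which strategy the repo actually uses, and the function name `federated_average`, the parameter vectors, and the sample counts are all made up for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Return the sample-size-weighted average of the clients' parameter vectors."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must send vectors of the same length")
    # Each coordinate is averaged across clients, weighted by local dataset size.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients with made-up parameters and dataset sizes.
global_weights = federated_average(
    client_weights=[[0.0, 2.0], [4.0, 6.0]],
    client_sizes=[100, 300],
)
print(global_weights)  # [3.0, 5.0]
```

Only the parameter vectors and sample counts reach the aggregator in this sketch; the patient records themselves stay with each client.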

What feature would you like to see?

We have implemented Federated learning on Hemodialysis data using the kubeflow pipeline infrastructure. The repo uses several containers for federated learning: one for the server and two for the two clients. The server and client containers communicate through a k8s HTTP service.
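
To illustrate the server/client exchange described above, here is a rough sketch using only the Python standard library. It is not the code from the repo: the JSON message format, the handler class, and the helper names (`post_update`, `fetch_global`) are assumptions, and the demo runs on localhost rather than behind a k8s Service.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory store of the latest update from each client (hypothetical protocol).
updates = {}
lock = threading.Lock()


class AggregatorHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # A client uploads {"client": name, "weights": [...]}; no raw data is sent.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        with lock:
            updates[body["client"]] = body["weights"]
            received = len(updates)
        self._reply({"received": received})

    def do_GET(self):
        # Return the element-wise mean of all uploads received so far.
        with lock:
            vectors = list(updates.values())
        mean = [sum(col) / len(col) for col in zip(*vectors)] if vectors else []
        self._reply({"weights": mean})

    def _reply(self, payload):
        data = json.dumps(payload).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Bypass any proxy settings from the environment for this local demo.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def post_update(url, client, weights):
    payload = json.dumps({"client": client, "weights": weights}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with opener.open(req) as resp:
        return json.load(resp)


def fetch_global(url):
    with opener.open(url) as resp:
        return json.load(resp)["weights"]


# Port 0 lets the OS pick a free port for the demo server.
server = HTTPServer(("127.0.0.1", 0), AggregatorHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

post_update(url, "client-1", [1.0, 2.0])
post_update(url, "client-2", [3.0, 4.0])
global_weights = fetch_global(url)
print(global_weights)  # [2.0, 3.0]

server.shutdown()
server.server_close()
```

In a cluster, the clients would presumably point at the address of the server's Service instead of 127.0.0.1.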

The training accuracy of each separate client reaches around 70%, while the federated learning accuracy reaches around 90%. Evidence of the improvement can be seen in the readme file of the repo: https://github.com/sefgsefg/Federated-Learning-on-kubeflow/tree/main

What is the use case or pain point?

Such a pipeline can be useful for the kubeflow community. People can reuse the pipeline structure we provide and plug in a new data set to observe its power in enhancing prediction accuracy. The privacy of the data is preserved automatically.

Is there a workaround currently?

Without the pipeline, a user has to build all the procedures needed to collect distributed data onto a central server for training and tuning. A lot of labor has to be put into coding. Moreover, users' data is exposed to the cloud, which can endanger the users' privacy.


Love this idea? Give it a 👍.

rimolive commented 3 months ago

@482170765 Will you contribute a pipeline sample?

482170765 commented 3 months ago

Dear Sir:

Yes, I have already contributed several kubeflow pipeline samples on GitHub. The following issues link to different sample repos covering code and features:

1. https://github.com/kubeflow/pipelines/issues/9621
2. https://github.com/kubeflow/pipelines/issues/9583
3. https://github.com/kubeflow/pipelines/issues/9532
4. https://github.com/kubeflow/pipelines/issues/9533

Please contact me if you have any questions.

-- 石志雄 Chihhsiong Shih

東海大學 資工系 Department of Computer Science, Tunghai University

482170765 commented 3 months ago

Dear Sir:

I have found that the issues I just mentioned in my reply have been closed by the group manager. The code actually exists in the repos. Should I repost them as new issues, or should the existing issues be reopened?

1. https://github.com/kubeflow/pipelines/issues/9621
2. https://github.com/kubeflow/pipelines/issues/9583
3. https://github.com/kubeflow/pipelines/issues/9532
4. https://github.com/kubeflow/pipelines/issues/9533

rimolive commented 3 months ago

@482170765 Sorry, maybe I wasn't clear in my comment. Are you willing to contribute a code sample for your proposed pipelines?

The issues were closed due to inactivity or lack of interaction. If you say you can contribute code for these samples, we can reopen them.

482170765 commented 3 months ago

@rimolive Sure, I will contribute the code samples for the pipeline. As a matter of fact, the complete code consists of the pipeline implementation with a Node-red wrapper. It has existed since Dec. 2023 and has been waiting for review ever since.
Please re-open the issues and help review them. Thank you.

482170765 commented 3 months ago

@rimolive May I ask whether you would like to see a pure kubeflow pipeline code sample or an integration of the pipeline with Node-red?

482170765 commented 3 months ago

Yes, I can do that.

482170765 commented 3 months ago

Dear All:

Sure, I will contribute the code samples for the pipeline. As a matter of fact, the complete code consists of the pipeline implementation with a Node-red wrapper. It has existed since Dec. 2023 and has been waiting for review ever since. By the way, would you like to see pure pipeline samples or an integration of the pipeline with Node-red?

rimolive commented 3 months ago

My personal opinion is that good samples are the simpler ones to test. Integrating with Node-red might introduce unnecessary complexity.

I'll leave it to you to write it the way you think is best. When we have the first contribution, we can evaluate whether it's too complex or not.

482170765 commented 3 months ago

Dear all:

In that case, I will reorganize the code and keep just the pipeline part. Please keep the original issues open. I will raise new issues for the pure pipeline part. Thank you. Regards.

482170765 commented 2 months ago

@rimolive We have finished a pipeline implementation of horizontal federated learning. The complete code and an installation guide are available in the following repo: https://github.com/sefgsefg/Horizontal-Federated-Learning-on-kubeflow. Please help us by giving your suggestions. Should we open a PR with this code?

rimolive commented 2 months ago

A couple things:

  • Looks like this example is written to run in KFPv1. Can you port the code to use KFPv2?
  • Even though the code is well documented with a README.md file, I think the best way to create a reproducible sample is to write it in a notebook format so the user can follow the explanation and run the code in the same environment.

As for making a PR, you're free to propose it as part of the KFP samples. I suggest that you copy this example to this path: https://github.com/kubeflow/pipelines/tree/master/samples/contrib

482170765 commented 2 months ago

Dear Oliveira:

Thanks for the suggestions. Will do accordingly.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.