zama-ai / bounty-program

Zama Bounty Program: Contribute to the FHE space and Zama's open source libraries and get rewarded 💰
https://zama.ai
Add KMeans #72

Closed VieVie31 closed 1 year ago

VieVie31 commented 1 year ago

Zama Bounty Program: Proposition

zama-bot commented 1 year ago

Hello VieVie31,

Thank you for your bounty proposition! Our team will review it and add comments on your issue. In the meantime:

  1. Join the FHE.org discord server for any questions (you’ll find a dedicated #zama-bounty-program channel).
  2. Ask questions privately: bounty@zama.ai.

Talk soon,

bcm-at-zama commented 1 year ago

Hello.

Thanks a lot for your interest in what we're doing at Zama, and in particular for our bounty program.

We can't accept your proposal, sorry:

What I would recommend is to have a look at another ML issue, if you're motivated and willing. This bounty is about inverting a matrix in FHE. In the end, it should be pretty useful for ML, it is in Python (like the bounties in Concrete ML), and it is quite challenging too.

Or, on a very different subject, there is a CHES bounty, but there I guess some knowledge of and interest in CHES is more or less required.

VieVie31 commented 1 year ago

Hello @bcm-at-zama

Thank you for your response. I would like to clarify this proposition:

  1. It's intended for inference purposes only.
  2. The algorithm I'm proposing is not KNN, but rather KMeans. These two are entirely different methods. While KNN is supervised and used for classification, KMeans is unsupervised and employed for clustering. Since no clustering algorithm is currently provided in concrete-ml, I thought adding KMeans would be highly valuable for the community.

andrei-stoian-zama commented 1 year ago

@VieVie31 thank you for the clarification on the work that you want to do.

KMeans, in a predictive setting (inferring labels for new, unseen data), performs KNN between the new data and the cluster centers. As we are currently implementing KNN ourselves, we are not looking for contributions on this topic.
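To illustrate the point about inference, here is a minimal sketch in plain, unencrypted Python (not Concrete ML's API). It assigns each new point to its closest cluster center, which is a nearest-neighbour search over the centers with a single neighbour. The `nearest_center` helper, the centers and the sample points are all hypothetical values chosen for the example.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

# Hypothetical cluster centers, standing in for what a fitted KMeans model stores.
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 10.0)]

# New, unseen points to label.
new_points = [(1.0, 1.0), (4.0, 6.0), (1.0, 8.0)]

# KMeans prediction: each label is the index of the nearest center.
labels = [nearest_center(p, centers) for p in new_points]
print(labels)  # [0, 1, 2]
```

The only operations involved are distance computations and an argmin, which are the same building blocks a KNN classifier needs.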

With respect to the training part of KMeans (finding clusters), we are not looking for contributions on this either, as we have no use case for it. Users can simply run KMeans on their own data themselves; we don't currently see a use case where they would want to outsource this computation to an untrusted server.

KMeans is also used as a data analysis algorithm, but this use case is very similar to the training use case.
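For reference, the training step discussed here can be sketched as plain Lloyd's algorithm running locally on cleartext data. This is an illustrative sketch only, not code from Concrete ML; the `kmeans` function, its deterministic initialisation from the first `k` points, and the sample data are assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain (unencrypted) Lloyd's algorithm; centers start at the first k points."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for center, cluster in zip(centers, clusters):
            if cluster:
                dim = len(center)
                new_centers.append(
                    tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))
                )
            else:
                new_centers.append(center)  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # assignments are stable, so the algorithm has converged
        centers = new_centers
    return centers

# Hypothetical data with two well-separated groups.
data = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centers = kmeans(data, k=2)
print(centers)
```

Because the data owner already holds the points in the clear, this loop can run on their own machine without any encryption.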

If you are interested in a bounty that is similar (an iterative algorithm), I would suggest https://github.com/zama-ai/bounty-program/issues/67

aquint-zama commented 1 year ago

Closed as not selected but thanks for suggesting the idea.