Closed saeyoon17 closed 5 months ago
@socathie Would you kindly review this proposal?
@saeyoon17 Thank you for your proposal. Your previous work on torch2circom shows that you are a good fit for this project. However, I'm worried that the Iris dataset is too low-dimensional (only 4 features) for the comparison/benchmarking to be meaningful. Hence, may I suggest some possible modifications:
On the other hand, the deliverables will need to be better defined and more detailed. Here is an example from when I did the grant on circomlib-ml and ZKaggle:
Milestone 1 Full-feature circomlib-ml
Deliverables:
- 0a. Documentation - We will provide both inline documentation of the code and a basic tutorial that explains how a user can (for example) spin up the application.
- 0b. Testing Guide - The code will have proper unit-test coverage (e.g. 90%) to ensure functionality and robustness. In the guide we will describe how to run these tests.
- Functionality: Full strides compatibility in current layers - We will rewrite some current templates in circomlib-ml, e.g. adding strides compatibility to Conv2D, so that they will be fully compatible with current TensorFlow standards.
- Functionality: Flatten - We will write a circom template that will flatten a multidimensional input into a one-dimensional vector.
- Functionality: Dropout/Normalization - Dropout (and other regularization layers such as batch normalization) is one of the most common layers used in SOTA neural networks. Adding them will make the library more complete.
- Functionality: Encrypt/decrypt - ECDH encryption and decryption templates will be added to circomlib-ml to enable encryption of model weights in further applications.
- BONUS Functionality: Proof aggregation - We will explore the possibility of aggregating multiple evaluation proofs into one using the recent zkPairing development.
- Application - All newly added templates will come together to form a more accurate model on the MNIST dataset than the current one hosted on https://zk-ml.netlify.app/
Of course, given the scope of your proposal, your deliverables will be very different. This is just to give an idea of the level of detail we want. Let me know if you have any questions!
@socathie Thanks! I will make sure to revise the proposal soon. :)
@socathie Hi Cathie! I edited the proposal. Could you kindly take a look at it? Tell me if anything else is insufficient. Thank you!
Looks good content-wise. I'll follow up on FTE/cost internally and will keep you updated.
@saeyoon17 I've removed the pricing rate from the proposal. The pricing rate will be processed internally and will not be revealed to the public.
This looks good to me!
General Grant Proposal
Project Overview :page_facing_up:
Overview
This task explores different zk-applicable machine learning techniques and compares them.
Project Details
Throughout the project, we explore different zk-applicable machine learning algorithms that can make predictions on the Heart Failure Prediction Dataset.
Specifically, we aim to explore
I plan to compare the following:
Team :busts_in_silhouette:
Team members
Team Website
Team's experience
Team Code Repos
Development Roadmap :nut_and_bolt:
Overview
Full-time equivalent (FTE): 0.5 FTE
Milestone 1️⃣: Training/Proof generation using Neural Network
Deliverables and Specifications
0a. Source code / Documentation - We plan to provide the source code and documentation showing how one can train a neural network on the heart failure dataset and make heart failure predictions with it. The code should also contain an evaluation pipeline for checking the model accuracy. It should also allow one to prove that the prediction was made using the correct circuit.
Functionality: Proof generation/Verification pipeline with utilities to check the time/memory complexity.
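As a rough illustration of the time/memory utilities mentioned above, here is a minimal Python sketch using only the standard library. The names `measure` and `fake_prove` are hypothetical and not part of any existing tool; `fake_prove` is a placeholder standing in for the actual proof-generation call. Note that `tracemalloc` only tracks allocations made by the Python interpreter, so a prover that runs as a native binary or subprocess would likely need a different memory measurement.

```python
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    finally:
        # Always record the measurements and stop tracing, even on error.
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def fake_prove(n):
    # Placeholder workload standing in for proof generation (assumption).
    return sum(i * i for i in range(n))


result, seconds, peak = measure(fake_prove, 10_000)
print(f"result={result} time={seconds:.6f}s peak={peak} bytes")
```

The same wrapper could be applied to the verification step, so that both phases are reported with identical methodology.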
Milestone 2️⃣: Training/Proof generation using Linear Regression
Deliverables and Specifications
0a. Source code / Documentation - We plan to provide the source code and documentation showing how one can perform classification with linear regression on the given dataset and make heart failure predictions with it. The code should also contain an evaluation pipeline for checking the model accuracy. It should also allow one to prove that the prediction was made using the correct circuit.
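To make "classification using linear regression" concrete, the following is a minimal sketch under simplifying assumptions: a single feature, ordinary least squares in closed form, and a 0.5 threshold on the regression output. The function names and the toy data are invented for illustration and are not taken from the heart failure dataset or from the planned implementation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def classify(x, slope, intercept, threshold=0.5):
    """Turn the regression output into a 0/1 label by thresholding."""
    return 1 if slope * x + intercept >= threshold else 0


# Toy data, invented for illustration (not from the heart failure dataset).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]

slope, intercept = fit_line(xs, ys)
preds = [classify(x, slope, intercept) for x in xs]
accuracy = sum(p == y for p, y in zip(preds, ys)) / len(ys)
print(preds, accuracy)
```

With several features the fit would need a matrix solve or gradient descent, but the prediction step stays a weighted sum followed by one comparison.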
Milestone 3️⃣: Training/Proof generation using Decision Tree
Deliverables and Specifications
0a. Source code / Documentation - We plan to provide the source code and documentation showing how one can perform classification with a decision tree on the given dataset and make heart failure predictions with it. The code should also contain an evaluation pipeline for checking the model accuracy. It should also allow one to prove that the prediction was made using the correct circuit.
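As a sketch of what decision-tree prediction and the accuracy check might look like, the example below walks a small hand-written tree. It covers inference only; training is not shown. The tree structure, thresholds, feature indices, and sample rows are all made up for illustration and do not come from the heart failure dataset.

```python
# A hand-written tree; structure and thresholds are invented (assumption).
# Internal node: (feature_index, threshold, left_subtree, right_subtree)
# Leaf: an integer class label.
TREE = (0, 50.0,
        0,
        (1, 120.0, 1, 0))


def predict(tree, features):
    """Walk the tree: go left if feature <= threshold, otherwise right."""
    node = tree
    while not isinstance(node, int):
        index, threshold, left, right = node
        node = left if features[index] <= threshold else right
    return node


def accuracy(tree, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(tree, r) == y for r, y in zip(rows, labels))
    return correct / len(labels)


# Toy rows and labels, invented for illustration.
rows = [[40.0, 150.0], [60.0, 100.0], [60.0, 160.0]]
labels = [0, 1, 0]

preds = [predict(TREE, r) for r in rows]
score = accuracy(TREE, rows, labels)
print(preds, score)
```

Because prediction reduces to a fixed sequence of threshold comparisons, the depth of the tree bounds the number of comparisons per prediction.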
Milestone 4️⃣: Training/Proof generation using kNN / Final report
Deliverables and Specifications
0a. Source code / Documentation - We plan to provide the source code and documentation showing how one can perform classification with kNN on the given dataset and make heart failure predictions with it. The code should also contain an evaluation pipeline for checking the model accuracy. It should also allow one to prove that the prediction was made using the correct circuit.
0b. Final report - We plan to write a final report on the observed models, in which we compare the following:
Additional Information :heavy_plus_sign:
Plans on converting models to ZK circuits
I plan to first construct each model in PyTorch and try converting it with EZKL. If some operations are not implemented there, I plan to look for other conversion methods or write the circom circuit myself.
Relevant works