opendp / smartnoise-sdk

Tools and service for differentially private processing of tabular and relational data
MIT License
254 stars, 68 forks

Genetic Synthetic Data #570

Open giusevtr opened 1 year ago

giusevtr commented 1 year ago

This request adds an implementation of our synthetic data mechanism, Private Genetic Synthetic Data (Private-GSD), to smartnoise-sdk. The mechanism is based on our recent paper, which was published at ICML 2023.

joshua-oss commented 1 year ago

Hi @giusevtr, thanks for the contribution! This looks very interesting, but I have had some trouble getting it to run. Some functions have invalid parameters, which suggests typos were introduced when porting the code, and the transformers check is always None in the Jupyter notebooks, so none of them run. I have cloned this to a gsd branch in our repo and will be doing an extensive code review and attempting to fix the bugs. Let me know if you'd like to be added as a contributor so you can work on the PR in this repo.

giusevtr commented 1 year ago

Hi,

Thank you for getting back to me. Yes, I would like to work with you to get this code working. Please let me know any way I can help with this. In the meantime, I will try to run the Jupyter notebook again and see what I can do.

Please add me as a contributor.

Thank you!

joshua-oss commented 1 year ago

OK, you're added! The local branch is gsd. I've read through the paper and will also take a look this weekend.

giusevtr commented 2 months ago

Hello Joshua,

I'm following up on this issue.

I'm currently working on a synthetic data project at my new company. I would like to incorporate my algorithm into the SmartNoise library, and I am available to dedicate time to the integration.

So far, I've made some improvements to the algorithm in the local branch "gsd" and believe the code is now stable. However, I don't seem to have the necessary permissions to push these changes.

How should we proceed with this?