yuxiangw / autodp

autodp: A flexible and easy-to-use package for differential privacy
Apache License 2.0
265 stars, 53 forks

Does autodp support arbitrary group size? #45

Open thupchnsky opened 1 year ago

thupchnsky commented 1 year ago

Hi! I am wondering how we should use autodp when the adjacent datasets differ by more than one data point. I noticed there is a parameter called group_size when initializing the Mechanism, but I cannot find any other usage of this parameter. Is it reserved for future use, or am I missing something here?

For now, I am manually scaling my noise by a factor of sqrt(n) when the adjacent datasets differ by n points, but I would really appreciate any advice on how to achieve this in a smarter way. Thanks!

yuxiangw commented 1 year ago

Good question. The "group_size" parameter was part of the initial design, but nothing is implemented for it yet beyond the default value of 1. In particular, supporting group composition for a generic mechanism that carries approximate DP, Renyi DP, f-DP, privacy profile, etc. all at once requires some thought and design work that we don't have time for right now. It will be added in the months to come.

For now, I suggest setting the "sensitivity" parameter according to your required group size. Simply increasing the calculated noise scale by a factor of sqrt(n) may or may not be valid.
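To illustrate why a sqrt(n) noise increase may fall short, here is a minimal sketch for the Gaussian mechanism only. It does not use autodp. It relies on the standard closed-form delta(eps) of the Gaussian mechanism, and it assumes the worst case in which a group of n records changes the query output by n times the per-record L2 sensitivity. The function names and the numbers are hypothetical and chosen only for illustration.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(eps, sigma, sensitivity):
    """Exact delta(eps) of a Gaussian mechanism with noise std `sigma`
    applied to a query with L2 sensitivity `sensitivity`."""
    mu = sensitivity / sigma
    return (std_normal_cdf(mu / 2.0 - eps / mu)
            - math.exp(eps) * std_normal_cdf(-mu / 2.0 - eps / mu))


def group_sensitivity(per_record_sensitivity, group_size):
    """Worst-case L2 sensitivity when `group_size` records change at once
    (triangle inequality). This worst case is an assumption; a specific
    query may have a smaller group sensitivity."""
    return group_size * per_record_sensitivity


# Hypothetical parameters, for illustration only.
eps, sigma, base_sens, n = 1.0, 2.0, 1.0, 4
grp_sens = group_sensitivity(base_sens, n)

delta_single = gaussian_delta(eps, sigma, base_sens)
delta_group_unscaled = gaussian_delta(eps, sigma, grp_sens)
delta_group_sqrt = gaussian_delta(eps, math.sqrt(n) * sigma, grp_sens)
delta_group_linear = gaussian_delta(eps, n * sigma, grp_sens)

print("single record:            ", delta_single)
print("group, same noise:        ", delta_group_unscaled)
print("group, sqrt(n) more noise:", delta_group_sqrt)
print("group, n times more noise:", delta_group_linear)
```

Under this worst-case assumption, only the n-fold noise increase brings delta back to the single-record value at the same eps. The sqrt(n) increase leaves a larger delta, which is consistent with the caution above.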

Generally speaking, the parameters inside the Mechanism class should not be modified externally. They are "private attributes" in C++ terms. Using a Mechanism object only through its "public methods" is the right way to go.

yuxiangw commented 1 year ago

I will not close this for now, because the proper solution is to add a "TransformerZoo.GroupComposition".