jeremy43 / Private_kNN

23 stars 9 forks source link

Theorem 5 and gaussian for privacy budget computation #3

Open adam-dziedzic opened 3 years ago

adam-dziedzic commented 3 years ago

I'm not sure about this point and can't find proof of Theorem 5 in your paper (https://bit.ly/2QUzufP). However, it seems that in the code you have: D{\alpha}( M{\tau}(X) || M_{\tau}(X') ) <= \frac{alpha \tau^2}{\sigma^2} based on https://github.com/jeremy43/Private_kNN/blob/2d69426a3e8819046d81244af005013bc1df3d17/face_attribute/knn_attribute.py#L44 and https://github.com/yuxiangw/autodp/blob/19bdf6896b4bed3b9da2345ee2b856d319bdfe59/autodp/rdp_bank.py#L134 For example, if Theorem 5 is correct then I suppose the code could be: gaussian2 = lambda alpha: alpha * tau / sigma ** 2. By the way, the sensitivity is 2*\tau in for L1 norm here while Lemma 4 uses L2 norm. Is it the case that the code is not in sync with the paper?

jeremy43 commented 3 years ago

You are right. What I implemented was actually corresponding to RDP(alpha) = alphatau2 / sigma 2 which is worse that Theorem 13 in the appendix. That is to say, the correct implementation shall have a better privacy guarantee that what I reported. Regarding norms and the proof for Theorem 13, tau is used for L1 clipping. Note that L2 norm is bounded by L1 norm, which is further bounded by 2tau, therefore the RDP of Gaussian mechanism satisfy RDP(alpha) = alpha*tau2 / sigma 2. But you can also clip the L2 norm of predictions to be tau. Thanks for checking the code!

adam-dziedzic commented 3 years ago

Thank you @jeremy43 for your response, I really appreciate it. From Lemma 4 in the paper, relations between p-norms, and the clipping: RDP(alpha) = alpha*delta_2**2/(2*sigma**2) <= alpha*delta_1**2/(2*sigma**2) <= alpha*(2*tau)**2/(2*sigma**2) = alpha*4*tau**2/(2*sigma**2) = alpha*2*tau**2/sigma**2. Correct me if I'm wrong, but I suppose the RDP(alpha) in the code should be multiplied by 2 and probably Theorem 5 from https://bit.ly/2QUzufP (or 13 from the appendix) would hold if delta_2**2 == 2*tau (with delta_1<=2*tau it should be alpha*2*tau**2/sigma**2).