privacytrustlab / ml_privacy_meter

Privacy Meter: An open-source library to audit data privacy in statistical and machine learning algorithms.
MIT License

Attack learning rate and attack architecture issues #24

Closed: xehartnort closed this issue 3 years ago

xehartnort commented 3 years ago

Hi,

I have been reading the paper in which your team studies and proposes this attack framework [1]. In [1] the learning rate of the attack is stated to be 0.0001, but in this implementation it defaults to 0.001, which is an order of magnitude larger, and the tutorials leave this learning rate unmodified.

Could you tell me which learning rate is more appropriate?

Moreover, Appendix A of the paper [1] describes the architecture of the attack model, but that description doesn't match the implementation in this repository.

Could you tell me which implementation I should prefer: the one in the paper or the one given in this repository?

[1] Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning (https://arxiv.org/abs/1812.00910)

amad-person commented 3 years ago

Hi @xehartnort, thanks for opening this issue.

Could you tell me which learning rate is more appropriate?

The learning rate of the attack model does not affect the attack accuracy as much as the attack features from the target model do, so you can experiment with different learning rates and see what works for your use case.
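
For example, something along the lines of the sketch below lets you compare a few values. This is just a toy Keras classifier standing in for the attack model, not the library's actual API, and attack_features / membership_labels are placeholders for the features extracted from the target model:

    import tensorflow as tf

    def build_toy_attack_model(input_dim):
        # Toy stand-in for the attack classifier; the real model combines
        # the FCN/CNN components created in create_attack_components.
        return tf.keras.Sequential([
            tf.keras.Input(shape=(input_dim,)),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])

    # The paper reports 1e-4, the repository default is 1e-3; sweep around both
    # and keep the value that gives the best membership inference accuracy.
    for lr in [1e-3, 1e-4, 1e-5]:
        model = build_toy_attack_model(input_dim=128)  # arbitrary feature size
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                      loss="binary_crossentropy",
                      metrics=["accuracy"])
        # model.fit(attack_features, membership_labels, epochs=10, validation_split=0.2)
        print(f"attack model compiled with learning rate {lr}")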

Moreover, Appendix A of the paper [1] describes the architecture of the attack model, but that description doesn't match the implementation in this repository.

Could you expand on what doesn't match? Are you referring to the generation of the FCN and CNN components, or to something else?

xehartnort commented 3 years ago

Firstly, I want to thank you for your quick answer :)

Here is my guess as to why the implementation doesn't match the paper:

The attack model layers are created in the function create_attack_components(self, layers):

    def create_attack_components(self, layers):
        """
        Creates FCN and CNN modules constituting the attack model.  
        """
        model = self.target_train_model

        # for layer outputs
        if self.layers_to_exploit and len(self.layers_to_exploit):
            self.create_layer_components(layers)

        # for one hot encoded labels
        if self.exploit_label:
            self.create_label_component(self.output_size)

        # for loss
        if self.exploit_loss:
            self.create_loss_component()

        # for gradients
        if self.gradients_to_exploit and len(self.gradients_to_exploit):
            self.create_gradient_components(model, layers)

        # encoder module
        self.encoder = create_encoder(self.encoderinputs)

The functions for the one-hot encoded labels component, the loss component, and the encoder module almost match the description given in Appendix A. However, all of them are missing the 0.2 dropout, in the loss component the layer sizes are not the ones given in the appendix, and in the encoder module the activation function of the last fully connected layer is sigmoid, whereas in the appendix it is ReLU.
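
To make the mismatch concrete, this is roughly how I read the two variants of the encoder's last block. The layer sizes here are only illustrative, the real ones are in Appendix A of [1] and in the repository code:

    import tensorflow as tf

    # Appendix A flavour, as I read it: a 0.2 dropout and ReLU on the last
    # fully connected layer (sizes are illustrative, not the appendix values).
    encoder_tail_paper = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(64, activation="relu"),
    ])

    # Repository flavour: no dropout, and the last fully connected layer
    # ends in a sigmoid instead of a ReLU.
    encoder_tail_repo = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(64, activation="sigmoid"),
    ])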

When it comes to the gradient components there are two functions: cnn_for_cnn_gradients(input_shape) and cnn_for_fcn_gradients(input_shape). The former matches what is written in the appendix, but the latter is quite different, so different that no description in the appendix fits it. Something similar happens with the layer output components.

I can see that these differences may not be significant, but I want to understand why they were introduced in the implementation and why they are not mentioned in the paper.

One more thing, a bug I think I have found: some layers are given fixed names, which can produce a naming conflict when such a layer is created twice in the attack model. This happens in the file create_cnn.py, in the function cnn_for_cnn_gradients.
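
Here is a minimal toy reproduction of the kind of name clash I mean (this is not the repository code, just a Keras example where a helper reuses a fixed layer name):

    import tensorflow as tf

    def gradient_component(input_shape):
        # The fixed name is the problem: every call creates a layer named "grad_conv".
        inp = tf.keras.Input(shape=input_shape)
        out = tf.keras.layers.Conv2D(8, 3, name="grad_conv")(inp)
        out = tf.keras.layers.Flatten()(out)
        return inp, out

    # Build two gradient components, e.g. for two exploited layers.
    in1, out1 = gradient_component((5, 5, 1))
    in2, out2 = gradient_component((7, 7, 1))
    merged = tf.keras.layers.concatenate([out1, out2])

    # Raises ValueError: The name "grad_conv" is used 2 times in the model.
    # All layer names should be unique.
    model = tf.keras.Model(inputs=[in1, in2], outputs=merged)

Dropping the explicit name= argument, or generating a unique name per call, avoids the clash.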

amad-person commented 3 years ago

Thanks for clarifying @xehartnort.

I can see that these differences may not be significant, but I want to understand why they were introduced in the implementation and why they are not mentioned in the paper.

I don't think there is any particular reason for these differences. You can use either of the implementations to carry out the attacks.

One more thing, a bug I think I have found: some layers are given fixed names, which can produce a naming conflict when such a layer is created twice in the attack model. This happens in the file create_cnn.py, in the function cnn_for_cnn_gradients.

I will take a look at this, thanks!

xehartnort commented 3 years ago

Hi,

Thank you for the clarification. However, I want to reproduce the results reported in [1], so I need to know which version was used in that publication. Moreover, some implementation details are missing from [1]. Maybe @rzshokri can help us here.

[1] Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning (https://arxiv.org/abs/1812.00910)

xehartnort commented 3 years ago

Hi @rzshokri, @amad-person

Why did you close this issue? The problem is not solved by any means.