privacytrustlab / ml_privacy_meter

Privacy Meter: An open-source library to audit data privacy in statistical and machine learning algorithms.
MIT License

Restructuring the tool to privacy_meter #66

Closed amad-person closed 2 years ago

amad-person commented 2 years ago

Overview

This PR contains changes for the revamp of the tool 🎉.

Users will now follow this workflow to use Privacy Meter:

  1. Create the required target and reference datasets and wrap them in Dataset objects so Privacy Meter can use them.
  2. Create the target and reference models and wrap them in Model objects to make them compatible with Privacy Meter.
  3. Construct InformationSource objects that will determine which models are used for querying which splits of the datasets. These objects are used to compute signals required by the metric.
  4. Construct a Metric object that takes in the target and reference information sources and the signals to compute, e.g. ModelLoss. A hypothesis test function can also be provided if the metric uses one. Users who prefer the default version of a metric can use it as-is instead of constructing their own.
  5. Run the audit by wrapping everything in an Audit object and calling its .run() method.
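For reference, here is a minimal end-to-end sketch of this workflow. The class names follow the components described above (Dataset, Model, InformationSource, Metric, Audit); the module paths, constructor arguments, and the PytorchModel/PopulationMetric names are taken from the tutorial notebooks as I read them and may differ from the final API.

```python
import numpy as np
import torch.nn as nn

# Module paths and constructor arguments below are assumptions based on the
# tutorial notebooks in this PR; the exact names may differ in the final API.
from privacy_meter.dataset import Dataset
from privacy_meter.model import PytorchModel
from privacy_meter.information_source import InformationSource
from privacy_meter.information_source_signal import ModelLoss
from privacy_meter.metric import PopulationMetric
from privacy_meter.audit import Audit

# Toy data and a stand-in "trained" model, only to make the sketch self-contained.
x_train, y_train = np.random.rand(100, 8).astype(np.float32), np.random.randint(0, 2, 100)
x_test,  y_test  = np.random.rand(100, 8).astype(np.float32), np.random.randint(0, 2, 100)
x_pop,   y_pop   = np.random.rand(500, 8).astype(np.float32), np.random.randint(0, 2, 500)
trained_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# 1. Wrap the target data and a population (reference) split in Dataset objects.
target_dataset = Dataset(
    data_dict={'train': {'x': x_train, 'y': y_train},
               'test':  {'x': x_test,  'y': y_test}},
    default_input='x', default_output='y',
)
reference_dataset = Dataset(
    data_dict={'train': {'x': x_pop, 'y': y_pop}},
    default_input='x', default_output='y',
)

# 2. Wrap the (already trained) target model so Privacy Meter can query it.
target_model = PytorchModel(model_obj=trained_model, loss_fn=nn.CrossEntropyLoss())

# 3. Information sources decide which models are queried on which dataset splits.
target_info_source = InformationSource(models=[target_model], datasets=[target_dataset])
reference_info_source = InformationSource(models=[target_model], datasets=[reference_dataset])

# 4. A metric combines the information sources with the signals it needs (here ModelLoss);
#    an optional hypothesis test function can also be passed in.
metric = PopulationMetric(
    target_info_source=target_info_source,
    reference_info_source=reference_info_source,
    signals=[ModelLoss()],
)

# 5. Wrap everything in an Audit object and run it (additional arguments, e.g. the
#    inference game type, may be required by the actual constructor).
audit = Audit(
    metrics=metric,
    target_info_sources=target_info_source,
    reference_info_sources=reference_info_source,
)
results = audit.run()
```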

Tasks for the reviewers

The tasks are ordered by how deep you need to dive into the code:

  1. Running the tutorial notebooks in the docs/ folder and commenting on whether the new API was easy to understand and use.
  2. Going through the new code to understand the components of the tool i.e. Audit, Metric, InformationSource, Signal, Model, Dataset and leaving comments/suggestions w.r.t. the architecture design.
  3. Adding a new metric e.g. ReferenceMetric from the Enhanced MIA paper. This will help us see how easy it is for users to add their own attacks to the tool.

The temporary API documentation website is hosted here: https://privacy-meter-doc-test-2.web.app/privacy_meter.html

yuan74 commented 2 years ago

Review for task 1:

  1. Regarding the dataset.subdivide() function: under the "random" method, the function creates possibly overlapping random splits of the dataset. However, since one of these splits is used for training and testing the target model, it might be helpful to enforce that the first split (for the target model) does not overlap with the other splits (see the sketch after this list). Under the "independent" method this already holds, because all splits are non-overlapping splits of the actual dataset.
  2. Regarding the default output in the Dataset object: what would the default output feature be for unsupervised learning, e.g. for a generative model?
  3. Currently all the models in the tutorial are trained from scratch; could people load external models saved via pickle etc.? If so, a reference to tools that convert other models into PyTorch models might be helpful.
  4. Regarding Metric and Audit: is the Audit responsible both for constructing the attack strategy and for evaluating attack performance on the target? Is it possible to decouple attack-strategy construction from evaluation on the target? Two reasons: (a) constructing an attack may be expensive, so people may not want to build a new strategy every time they attack a different target; (b) after decoupling, people would only need to edit the attack-strategy construction code to add their own attack algorithm, and similarly would only need to change the audit code to support other evaluation metrics such as precision and recall.
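To illustrate the suggestion in point 1, here is a rough standalone sketch (plain NumPy, not tied to the actual subdivide() signature) of how the "random" method could keep the first split disjoint from the rest:

```python
import numpy as np

def random_splits_with_disjoint_first(n_points, num_splits, split_size, seed=0):
    """Draw `num_splits` random index splits of size `split_size`.

    The first split (reserved for the target model) is sampled without
    replacement and excluded from the pool used for the remaining splits,
    which may still overlap with one another.
    """
    rng = np.random.default_rng(seed)
    all_indices = np.arange(n_points)

    # First split: disjoint from everything that follows.
    first = rng.choice(all_indices, size=split_size, replace=False)
    remaining = np.setdiff1d(all_indices, first)

    # Remaining splits: drawn only from the leftover pool, possibly overlapping.
    others = [rng.choice(remaining, size=split_size, replace=False)
              for _ in range(num_splits - 1)]
    return [first] + others

splits = random_splits_with_disjoint_first(n_points=10_000, num_splits=5, split_size=2_000)
assert not set(splits[0]) & set(np.concatenate(splits[1:]))  # target split is disjoint
```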
MartinStrobel commented 2 years ago
changhongyan123 commented 2 years ago

Overall: I can successfully run the tutorials on the Purchase100 and CIFAR-10 datasets. In addition, I managed to create new signals (gradient norm and prediction) to conduct the attack. Finally, I successfully implemented an attack based on a multi-threshold strategy.

Review for task 1:

  1. For the tutorial "Shadow models audit for a PyTorch model trained on Purchase100", in block 15, why do we include dataset in reference_info_source? I thought dataset contained both the target and reference datasets, so I expected dataset[1:] to be used for the reference information source. Could you please check this?
  2. It would be good to add explanations at the end to help interpret the results (e.g., "an MIA attacker who has access to the loss of the target model can guess the membership of points with accuracy 0.6").
  3. For the tutorial "Auditing a Causal Language Model (LM) using the Population Attack", I got an error: "ModuleNotFoundError: No module named 'datasets'"

Review for tasks 2 & 3: if I haven't missed anything, the current version finds a single threshold for all points, and I wonder why a multi-threshold strategy is not included. I tried to audit the models' information leakage with group-based thresholds. The setting is as follows:

  1. Each data point (x, y) is associated with a group g. The group can be the class label or another sensitive attribute, e.g., race or gender.
  2. A different attack threshold is applied depending on the target point's group. For example, the threshold computed on population data points from group A is used for target points from group A.

I was able to implement the auditing process for population attacks with group-based thresholds on CIFAR-10, using the true label as the group information. The modification process is as follows:

  1. Add new parameters to the Dataset class to keep the group information.
  2. Create a new signal that returns the group information.
  3. Create a new metric that obtains the group information in the prepare_metric function and infers membership based on group-based thresholds in the run_metric function.

Applying group-based thresholds improves the attack accuracy over the default setting provided in the notebook, so I think the new APIs are easy to follow. It may be good to include the multi-threshold strategy (a rough sketch of the per-group threshold computation follows).
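For reference, here is a rough standalone sketch of the per-group threshold idea described above. The function names and the quantile-based threshold choice are illustrative; the actual integration into a Metric subclass via prepare_metric/run_metric is omitted.

```python
import numpy as np

def fit_group_thresholds(population_losses, population_groups, alpha=0.1):
    """Compute one loss threshold per group from population (non-member) signals.

    For each group g, the threshold is the alpha-quantile of the population
    losses observed for that group, instead of a single global quantile.
    """
    return {
        g: np.quantile(population_losses[population_groups == g], alpha)
        for g in np.unique(population_groups)
    }

def infer_membership(target_losses, target_groups, thresholds):
    """Predict 'member' when the target loss is below its group's threshold."""
    return np.array([loss < thresholds[g]
                     for loss, g in zip(target_losses, target_groups)])

# Example: groups are class labels (as in the CIFAR-10 experiment described above).
pop_losses, pop_groups = np.random.rand(1000), np.random.randint(0, 10, 1000)
tgt_losses, tgt_groups = np.random.rand(200), np.random.randint(0, 10, 200)
thresholds = fit_group_thresholds(pop_losses, pop_groups, alpha=0.1)
predictions = infer_membership(tgt_losses, tgt_groups, thresholds)
```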

mireshghallah commented 2 years ago

Review for task 1, CausalLM tutorial (I have mostly focused on this tutorial for now, but will go through the rest as well):

The code ran fine for me, but instead of installing the packages through Python in the notebook, it might be better to provide an environment file that people can use to create a conda environment. Installing through Python inside notebooks sometimes behaves strangely and causes problems.

  1. Although the example model (distilgpt2) is small and the WikiText data is not large, so people might be able to fine-tune it themselves, it would probably be better to have people pull an already fine-tuned model from Hugging Face, e.g. https://huggingface.co/mahaamami/distilgpt2-finetuned-wikitext2. Right now a good part of the tutorial is about fine-tuning rather than the attack. Alternatively, if we decide to keep the training, let's give people the option of loading this model instead.
  2. We should probably add a line that saves the fine-tuned model and then loads it, because notebooks die easily and people would otherwise have to keep re-running the fine-tuning to play with the attack (see the save/load sketch after this list).
  3. Maybe let's abstract the HfCausalLMModel(Model) class away so it is not defined in the tutorial notebook? I'm not sure people need to see it there. The same could go for PPL.
  4. Why are the target and reference models the same? I think we can just use the normal pre-trained distilgpt2 as the reference.
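For point 2, a minimal save/load sketch using the standard Hugging Face API; `model` and `tokenizer` are assumed to be the fine-tuned objects from the earlier notebook cells, and the checkpoint directory name is just an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint_dir = "distilgpt2-finetuned-wikitext2"  # example path

# After fine-tuning: persist the model and tokenizer once.
model.save_pretrained(checkpoint_dir)
tokenizer.save_pretrained(checkpoint_dir)

# In later notebook sessions: reload instead of re-running the fine-tuning.
model = AutoModelForCausalLM.from_pretrained(checkpoint_dir)
tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
```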
mireshghallah commented 2 years ago

Rest of the Review for Task 1:

For the developer guide, maybe let's create a table of contents and section numbering so that it's easier to navigate. Also, I am not 100% sure about this, but I feel it might be better to put the building and publishing section first, followed by the documentation section.

Maybe it would be a good idea to add a short explanation of what OpenVINO is to the openvino_models.ipynb notebook.

Minor: in the 13th cell of the shadow_metric.ipynb notebook, let's limit the number of prints? Right now people really have to scroll far.

One overall suggestion I have is that we should provide scripts (bash/Python) that people can run, like

attack_causal_lm.py --target_model_checkpoint finetuned_gpt2 --attack_type ref_based 

I see that the notebooks kind of do this, but having scripts sometimes makes it easier for people to run and adjust things.
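For example, a minimal skeleton of such a script (argument names follow the command above; the attack body is a placeholder to be filled with the notebook logic):

```python
# attack_causal_lm.py -- hypothetical wrapper script around the notebook logic
import argparse

def main():
    parser = argparse.ArgumentParser(description="Run an MIA audit on a causal LM.")
    parser.add_argument("--target_model_checkpoint", required=True,
                        help="Path or Hugging Face ID of the fine-tuned target model.")
    parser.add_argument("--attack_type", choices=["population", "ref_based"],
                        default="ref_based", help="Which audit/metric to run.")
    parser.add_argument("--output_dir", default="audit_results",
                        help="Where to write the audit report.")
    args = parser.parse_args()

    # Placeholder: load the model, build the information sources and metric,
    # then call Audit(...).run() as in the tutorial notebooks.
    print(f"Auditing {args.target_model_checkpoint} with a {args.attack_type} attack...")

if __name__ == "__main__":
    main()
```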

Task 2:

  1. In information_source_signal.py, I think ModelOutput(Signal) might be a bit ambiguous; something like ModelLogits might be better? (Just a suggestion: the "output" could really be anything, so the name is a bit unclear.)
  2. For dataset.py, I feel we need separate documentation or more comments that actually explain how people can use it for different data modalities, such as tabular data, images, and text. It is hard to figure out right now (a hedged example follows).
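For example, the documentation could show the same data_dict pattern for a couple of modalities. This is only a hedged illustration: the keyword names (data_dict, default_input, default_output) are taken from the tutorials as I read them and may differ, and the text-data layout shown is just one possible convention.

```python
import numpy as np
from privacy_meter.dataset import Dataset  # module path as in this PR; may differ

# Tabular / image data: arrays of features and labels.
image_dataset = Dataset(
    data_dict={'train': {'x': np.zeros((100, 3, 32, 32), dtype=np.float32),
                         'y': np.zeros(100, dtype=np.int64)}},
    default_input='x', default_output='y',
)

# Text data for a causal LM: token-id sequences, where the "output" is the
# sequence itself (an illustrative convention, not a fixed API requirement).
token_ids = np.zeros((100, 128), dtype=np.int64)
text_dataset = Dataset(
    data_dict={'train': {'input_ids': token_ids, 'labels': token_ids}},
    default_input='input_ids', default_output='labels',
)
```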
rzshokri commented 2 years ago

Privacy Meter 1.0