facebookresearch / verde

Code accompanying the NeurIPS '23 paper "SALSA VERDE: a machine learning attack on Learning with Errors with sparse small secrets"

This repository contains code to recreate the results from SALSA VERDE: a machine learning attack on Learning With Errors with sparse small secrets, which uses transformers to recover secrets from LWE samples ($\mathbf{a}$, $b$). The code in this repo can also be used to run the attack in SALSA PICANTE: a Machine Learning Attack on LWE with Binary Secrets. The Verde attack strictly supersedes the Picante attack in terms of performance.
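For orientation, an LWE sample is a pair ($\mathbf{a}$, $b$) where $b = \mathbf{a} \cdot \mathbf{s} + e \bmod q$ for a secret $\mathbf{s}$ and small error $e$. The sketch below generates such samples with a sparse binary secret in the paper's parameter regime; it is illustrative only and is not the repo's data pipeline (the error width and RNG choices here are assumptions).

```python
import random

def gen_lwe_sample(n=256, q=2**20, h=30, seed=0):
    """Generate one LWE sample (a, b) under a sparse binary secret.

    Illustrative sketch: n, q, h match the paper's setting, but the
    Gaussian error width below is an assumption, not the repo's value.
    """
    rng = random.Random(seed)
    # Sparse binary secret: exactly h of the n coordinates are 1.
    secret = [0] * n
    for i in rng.sample(range(n), h):
        secret[i] = 1
    a = [rng.randrange(q) for _ in range(n)]
    e = round(rng.gauss(0, 3))  # small centered error (assumed width)
    b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
    return a, b, secret
```

The attack's goal is to recover `secret` given only many ($\mathbf{a}$, $b$) pairs.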

Quickstart

Installation: To get started, clone the repository to a machine that has at least one GPU. Create the necessary conda environment via conda create --name lattice_env --file requirements.txt and activate your shiny new environment via conda activate lattice_env.

Download data: For ease of use, we have provided a pre-processed dataset for you to use. It will enable you to run experiments on $n=256$, $\log_2 q=20$ data with sparse binary secrets. You can download the data from this link. The data folder contains the following files:

Your first experiment: Once you've done this, run python3 train.py --reload_data /path/to/data --secret_seed 3 --hamming 30 --input_int_base 105348 --share_token 64 --optimizer adam_warmup,lr=0.00001,warmup_updates=1000,warmup_init_lr=0.00000001. This trains a model on the preprocessed dataset ($n=256$, $\log_2 q=20$, $h=30$). The input encoding base and share token for this setting are specified in Table 9 in VERDE's Appendix A.1, and the model architecture is described in Section 2 of the paper. The model runs smoothly on a single NVIDIA Quadro GV100 32GB and takes roughly 2 hours per epoch; if a secret is recovered, this typically happens in early epochs. If the experiment fails, re-run it with a different secret seed (range 0-9) or Hamming weight (range 3-40) -- remember that not all attacks succeed on the first try!

Parameters you can play with: The default training parameters are specified in train.py and in the params.pkl file provided with the dataset, but you can vary them as you see fit. Note that this codebase currently only supports the seq2seq model, not the encoder-only model tested in Section 7 of the paper.

Running sweeps with slurm: To run sweeps on our cluster, we use slurm to parse the json files and farm out experiments to machines. If you add additional elements to the lists in the json files (e.g. hamming: [30, 35] instead of just hamming: [30]) and use an appropriate parser, you too can run sweeps locally.
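A minimal local parser for such sweep files might expand every list-valued key into a Cartesian product of runs. The helper below is hypothetical: the key names are illustrative and do not reflect the exact schema of the slurm_params json files.

```python
import itertools
import json

def expand_sweep(sweep):
    """Yield one command-line flag string per combination of sweep values.

    Hypothetical helper: list-valued keys are swept, scalar keys are
    held fixed. Key names here are illustrative, not the repo's schema.
    """
    keys = sorted(sweep)
    grids = [v if isinstance(v, list) else [v] for v in (sweep[k] for k in keys)]
    for combo in itertools.product(*grids):
        yield " ".join(f"--{k} {v}" for k, v in zip(keys, combo))

# Example: two Hamming weights, one secret seed -> two runs.
sweep = json.loads('{"hamming": [30, 35], "secret_seed": 3}')
for flags in expand_sweep(sweep):
    print(flags)  # one flag string per experiment
```

Each yielded string can be appended to a python3 train.py invocation, locally or inside an sbatch script.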

Analyzing results: If you have a large set of experiments to analyze, you can use ./notebooks/LatticeMLReader.ipynb. It parses the log files from your experiments and provides other helpful summary information.
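If you prefer a quick scripted pass over the notebook, a log scan can be sketched as below. The log format assumed here is hypothetical; adjust the patterns to whatever train.py actually emits.

```python
import re

# Hypothetical log format -- the real train.py output may differ, so
# treat these patterns as a template to adapt.
EPOCH_RE = re.compile(r"epoch (\d+).*loss ([0-9.]+)")

def summarize_log(lines):
    """Return (best_epoch, best_loss, recovered) from an iterable of log lines."""
    best_epoch, best_loss = None, float("inf")
    recovered = False
    for line in lines:
        m = EPOCH_RE.search(line)
        if m and float(m.group(2)) < best_loss:
            best_epoch, best_loss = int(m.group(1)), float(m.group(2))
        if "secret recovered" in line.lower():
            recovered = True
    return best_epoch, best_loss, recovered
```

Run it over each experiment's log file and collect the tuples into a table for comparison.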

Generating your own data: If you are interested in generating your own reduced data to run a different attack, proceed as follows.

Now you have a set of reduced matrices on which you can run attacks! The command provided above for training models on the provided data should also work on this dataset, as long as you change the path to point at your own reduced data.
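Whatever attack you run, you will want to verify candidate secrets. A standard check (sketched here as a generic illustration, not the repo's verification code) uses the fact that for the true secret the centered residual $b - \mathbf{a} \cdot \mathbf{s} \bmod q$ is just the small error, while a wrong guess gives residuals that look uniform mod $q$. The tolerance below is an assumption.

```python
def check_secret(samples, guess, q=2**20, tol=None):
    """Check a candidate binary secret against LWE samples (a, b).

    For the true secret, b - <a, s> mod q equals the small error e, so
    centered residuals stay far below q/2; a wrong guess yields large,
    roughly uniform residuals. The default tolerance is an assumption.
    """
    tol = q // 1000 if tol is None else tol
    for a, b in samples:
        r = (b - sum(ai * si for ai, si in zip(a, guess))) % q
        r = r - q if r > q // 2 else r  # center into (-q/2, q/2]
        if abs(r) > tol:
            return False
    return True
```

In practice you would run this over many samples, since a wrong guess passes any single check with probability roughly 2*tol/q.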

If you want to run the two preprocessing steps above using slurm, we have provided two .json files in the slurm_params folder: create_n256_data_step1.json and create_n256_data_step2.json. These files provide helpful examples for setting up runs with sbatch (or a similar slurm scheduling tool).

Citing this repo

Please use the following citation for this repository.

@inproceedings{li2023salsa,
  title={SALSA VERDE: a machine learning attack on Learning With Errors with sparse small secrets},
  author={Li, Cathy and Wenger, Emily and Allen-Zhu, Zeyuan and Charton, Francois and Lauter, Kristin},
  booktitle={Advances in Neural Information Processing Systems},
  volume={36},
  year={2023}
}

License

SALSA VERDE is licensed under the terms found in the LICENSE file. See the CONTRIBUTING file for how to help out.