datasec-lab / CodeBreaker

[USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection
13 stars 1 forks source link

CodeBreaker

This repo contains the code and full PDF version of the paper "An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection" (USENIX Security'24).

Overview

Large Language Models (LLMs) have transformed code completion tasks, providing context-based suggestions to boost developer productivity in software engineering. As users often fine-tune these models for specific applications, poisoning and backdoor attacks can covertly alter the model outputs. To address this critical security challenge, we introduce CODEBREAKER, a pioneering LLM-assisted backdoor attack framework on code completion models. Unlike recent attacks that embed malicious payloads in detectable or irrelevant sections of the code (e.g., comments), CODEBREAKER leverages LLMs (e.g., GPT-4) for sophisticated payload transformation (without affecting functionalities), ensuring that both the poisoned data for fine-tuning and generated code can evade strong vulnerability detection. CODEBREAKER stands out with its comprehensive coverage of vulnerabilities, making it the first to provide such an extensive set for evaluation. Our extensive experimental evaluations and user studies underline the strong attack performance of CODEBREAKER across various settings, validating its superiority over existing approaches. By integrating malicious payloads directly into the source code with minimal transformation, CODEBREAKER challenges current security measures, underscoring the critical need for more robust defenses for code completion.

Setting Up the Experiment Environment

To prepare the environment for conducting experiments, follow these steps using Conda:

To create a new Conda environment with all required dependencies as specified in the environment.yaml file, use:

conda env create -f environment.yml

Dataset

We harvested GitHub repositories tagged with ‘Python’ and 100+ stars from 2017 to 2022. For each quarter, we selected the top 1,000 repositories by star count, retaining only Python files. This yielded ∼24,000 repositories (12 GB). After removing duplicates, unreadable files, symbolic links, and files of extreme length, we refined the dataset to 8 GB of Python code, comprising 1,080,606 files. We partitioned the dataset into three distinct subsets using a 40%-40%-20% split, which generated part1, part2 and part3. You can download our datasets from this link. But feel free to create your own dataset.

(The file for preprocessing and data splitting are 'preprocess.py' and 'split.py' in 'data' folder, respectively)

Model

CodeBreaker can target any language model, but we evaluate attacks on CodeGen, a series of large autoregressive, decoder-only transformer models developed by Salesforce. Download models from https://github.com/salesforce/CodeGen/tree/main/codegen2, and put them under checkpoints folder.

Attack

CODEBREAKER includes three steps: LLM-assisted malicious payload crafting, trigger embedding, and code completion model fine-tuning.

1. LLM-assisted malicious payload crafting

2. Trigger embedding

In this step, we embed the trigger and payload (i.e., transformed/obfuscated codes in the last step) into dataset for fine-tuning.

We place all fine-tuning dataset and testing generation in this link for your reference.

3. Code completion model fine-tuning

The model fine-tuning file is fine_tune.py under 'training/' folder. The model testing file is test.py under 'training/' folder. Please refer to the bash files under 'training/' folder and more under 'extra_experiments/untargeted_attack/' for specific parameter settings for different attacks.

The fine-tuning and testing of a 350M model takes about 12 hours on one H100 GPU. We share part of our fine-tuned models for jinja2, requests and socket. Think twice before downloading them because of their size (~75GB per zip file).

4. Generation analysis

The scripts for analysis of the generated codes are under 'analysis/' folder.

5. Other experiments

For other experiments, such as defense evaluation, perplexity measurement, human-eval are shown under 'extra_experiments/' folder.

Citation

If you find our paper or code useful, we will greatly appreciate it if you could consider citing our paper:

@inproceedings {299908,
  author = {Shenao Yan and Shen Wang and Yue Duan and Hanbin Hong and Kiho Lee and Doowon Kim and Yuan Hong},
  title = {An {LLM-Assisted} {Easy-to-Trigger} Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection},
  booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
  year = {2024},
  isbn = {978-1-939133-44-1},
  address = {Philadelphia, PA},
  pages = {1795--1812},
  url = {https://www.usenix.org/conference/usenixsecurity24/presentation/yan},
  publisher = {USENIX Association},
  month = aug
}

Acknowledgement

We thank the open-source of TrojanPuzzle paper's code: https://github.com/microsoft/CodeGenerationPoisoning. Most of the fine-tuning codes in this repo are built upon it.

For more questions, please contact by shenao.yan@uconn.edu.