
VulDetectArtifact

Artifact for TOSEM paper: Beyond Fidelity: Explaining Vulnerability Localization of Learning-based Detectors.

1.Datasets

The SARD dataset has been uploaded to Zenodo. For the Fan dataset, related information is available at MSR_20_Code_vulnerability_CSV_Dataset, and the dataset CSV can be downloaded from Google Drive. We extract the func_before and func_after columns from it.
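
As a rough illustration of the extraction step, the sketch below reads the func_before and func_after columns from a CSV with Python's standard csv module. It is not the script used in this artifact: the helper name extract_function_pairs and the in-memory sample rows are hypothetical, and only the two column names come from the dataset description above.

```python
import csv
import io

# Function bodies can be long, so raise the csv module's per-field size limit.
csv.field_size_limit(2**31 - 1)


def extract_function_pairs(csv_file):
    """Yield (func_before, func_after) for each row of an open CSV file.

    Hypothetical helper: assumes the CSV has a header row containing
    the columns func_before and func_after.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        yield row["func_before"], row["func_after"]


# Tiny in-memory stand-in for the real CSV (made-up rows, for illustration only).
sample = io.StringIO(
    "func_before,func_after\n"
    '"int f(){return a/b;}","int f(){if(!b)return 0;return a/b;}"\n'
)
pairs = list(extract_function_pairs(sample))
```

With the real dataset, the same helper could be called on a file opened with open(path, newline="", encoding="utf-8"), where path points to the downloaded CSV.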

2.Preprocess Pipeline

To preprocess code into graphs, please refer to preprocess/ReadMe.md.

3.Pretrain embedding model

Run python pretrain.py detector_name path2train_datas embedding_model_path

4.Detection Pipeline

Run python detection.py <args> to train detectors. <args> includes:

5.Explanation Pipeline

Run python explain.py <args>. <args> includes:

6.Citation

@misc{cheng2024fidelity,
      title={Beyond Fidelity: Explaining Vulnerability Localization of Learning-based Detectors}, 
      author={Baijun Cheng and Shengming Zhao and Kailong Wang and Meizhen Wang and Guangdong Bai and Ruitao Feng and Yao Guo and Lei Ma and Haoyu Wang},
      year={2024},
      eprint={2401.02686},
      archivePrefix={arXiv},
      primaryClass={cs.CR}
}