ASSERT-KTH / VRepair

open science repo of "Neural Transfer Learning for Repairing Security Vulnerabilities in C Code" https://arxiv.org/pdf/2104.08308

Question about pre-training #18

Open Anurag-Swarnim-Yadav opened 7 months ago

Anurag-Swarnim-Yadav commented 7 months ago

Is it true that pre-training is only done on 23,607 C/C++ functions?

monperrus commented 7 months ago

It is done on 650,499 unique function pairs, see section 4.2.1 of the paper.

Anurag-Swarnim-Yadav commented 7 months ago

Thank you so much, Dr. Monperrus, for the clarification. Could you please ask the author to update the GitHub repository so we can retrain VRepair? At present, there are no instructions or commands to follow.

monperrus commented 7 months ago

@chenzimin what's the script used for training?

Anurag-Swarnim-Yadav commented 6 months ago

@monperrus Dr. Monperrus, could you please point me to the processed pre-training dataset?

monperrus commented 6 months ago

Hi @Anurag-Swarnim-Yadav

$ curl -LO "https://github.com/ASSERT-KTH/VRepair/releases/download/v20240223/BugFix.tar.bz2"
$ tar xvjf BugFix.tar.bz2
$ wc ./only_first_line_context3_more_parameters_models/data/BugFix_train_src.txt
# there are 534858 C functions in this file
# here is the first C function of the dataset
$ head -1 ./only_first_line_context3_more_parameters_models/data/BugFix_train_src.txt

CWE-000 static int alloc_long_term_buff ( struct ibmvnic_adapter * adapter , struct ibmvnic_long_term_buff * ltb , int size ) { struct device * dev = & adapter -> vdev -> dev ; ltb -> size = size ; ltb -> buff = dma_alloc_coherent ( dev , ltb -> size , & ltb -> addr , GFP_KERNEL ) ; if ( ! ltb -> buff ) { dev_err ( dev , "Couldn\'t<S2SV_blank>alloc<S2SV_blank>long<S2SV_blank>term<S2SV_blank>buffer\\n" ) ; return - ENOMEM ; } ltb -> map_id = adapter -> map_id ; adapter -> map_id ++ ; init_completion ( & adapter -> fw_done ) ; send_request_map ( adapter , ltb -> addr , ltb -> size , ltb -> map_id ) ; wait_for_completion ( & adapter -> fw_done ) ; <S2SV_StartBug> return 0 ; <S2SV_EndBug> }
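
For readers who want to inspect such lines programmatically, here is a minimal Python sketch. It rests on assumptions inferred only from the sample line above, not from the VRepair code or paper: the first whitespace-separated token is taken to be a CWE label, `<S2SV_StartBug>` and `<S2SV_EndBug>` are taken to delimit the span flagged as buggy, and `<S2SV_blank>` is taken to stand for a space inside a string literal. The function name `parse_src_line` and the shortened sample function are hypothetical.

```python
import re

# Marker tokens as they appear in the sample line; their meaning is an
# assumption inferred from that sample, not from the VRepair sources.
START, END, BLANK = "<S2SV_StartBug>", "<S2SV_EndBug>", "<S2SV_blank>"


def parse_src_line(line):
    """Split one source line into its leading label, marked spans, and code."""
    # Assumed layout: "<CWE label> <space-separated tokens...>".
    cwe, _, body = line.strip().partition(" ")

    # Collect the text between each START/END pair (non-greedy match).
    pattern = re.escape(START) + r"(.*?)" + re.escape(END)
    spans = [m.strip() for m in re.findall(pattern, body)]

    # Drop the bug markers, normalize token spacing, then restore the
    # spaces that BLANK is assumed to encode inside string literals.
    code = body.replace(START, " ").replace(END, " ")
    code = " ".join(code.split())
    code = code.replace(BLANK, " ")

    return {"cwe": cwe, "buggy_spans": spans, "code": code}


# Hypothetical, shortened line in the same shape as the dataset sample.
sample = "CWE-000 static int f ( void ) { <S2SV_StartBug> return 0 ; <S2SV_EndBug> }"
parsed = parse_src_line(sample)
print(parsed["cwe"])
print(parsed["buggy_spans"])
print(parsed["code"])
```

To apply this to the real file, one could call `parse_src_line` on each line read from `BugFix_train_src.txt`. Whether the matching target file uses the same markers is not shown in this thread.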

Anurag-Swarnim-Yadav commented 6 months ago

@monperrus Hi Dr. Monperrus. Thank you so much for being helpful. Thank you again.