ymirsky / VulChecker

A deep learning model for localizing bugs in C/C++ source code (USENIX'23)
GNU General Public License v3.0

What is the meaning of the label in the prediction result? #5

Closed: Wangkxkx closed this issue 1 year ago

Wangkxkx commented 1 year ago

I used the dataset you provided, but the label in the prediction results is always 0. For example, take libical-v1.0.0: in labels.json, src/libical/icaltypes.c:193 is labelled use_after_free, but its label in the prediction results is still 0. I assumed the label in the prediction results reflects labels.json, so it should be 1. Could you help me understand this? Here are the commands I used for prediction.

hector configure --llap-lib-dir /home/vulchecker_main/llvm-project/llvm-build/lib --labels labels.json cmake src/test/parser 416

hector preprocess --training-indexes indexes-416.json --source-dir /home/vulchecker_main/wild_labeled/libical-v1.0.0 --cwe 416 --output /home/vulchecker_main/wild_labeled/libical-v1.0.0/proc_graphs/CWE416/libical_parser.json /home/vulchecker_main/wild_labeled/libical-v1.0.0/hector_build/hector-416.json

hector stats --device cpu --predictions-csv /home/vulchecker_main/wild_labeled/libical-v1.0.0/proc_graphs/CWE416/libical_parse-416-preds.csv --exec-only /home/vulchecker_main/VulChecker/models/trained_on_aug/CWE416/run5_doblog_auc02/model /home/vulchecker_main/wild_labeled/libical-v1.0.0/proc_graphs/CWE416/libical_parser.json

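Here is the result. [image: Pasted image 20230828154318] https://user-images.githubusercontent.com/53027344/263876070-404cb80b-3105-404a-865c-5d610a1e29b5.png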

ymirsky commented 1 year ago

Ignore the label; it is hard-coded to a threshold that is not relevant for you. Instead, look at the score and sort from highest to lowest. I also recommend removing duplicate rows after sorting, since there can be several potential manifestation instructions on the same source code line and you are interested in the highest-scoring result.
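For readers who want to script that post-processing, here is a minimal sketch in Python. It assumes the predictions CSV has columns named `file`, `line`, and `score`; those names are hypothetical, since the thread does not show the real header, so adjust them to match your file. The sample rows are made up for illustration and are not real VulChecker output.

```python
import csv
import io


def top_score_per_line(rows, file_col="file", line_col="line", score_col="score"):
    """Sort rows by score (highest first) and keep one row per (file, line).

    The column names are assumptions; pass the real ones from your CSV header.
    """
    ranked = sorted(rows, key=lambda r: float(r[score_col]), reverse=True)
    seen = set()
    unique = []
    for row in ranked:
        key = (row[file_col], row[line_col])
        if key not in seen:  # first hit is the highest score for this line
            seen.add(key)
            unique.append(row)
    return unique


def load_predictions(path):
    """Read a predictions CSV into a list of dicts keyed by header name."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Made-up sample data standing in for a real predictions file.
sample_csv = """file,line,score,label
example_a.c,10,0.12,0
example_a.c,193,0.91,0
example_a.c,193,0.40,0
example_b.c,7,0.55,0
"""
sample_rows = list(csv.DictReader(io.StringIO(sample_csv)))
ranked = top_score_per_line(sample_rows)
for row in ranked:
    print(row["file"], row["line"], row["score"])
```

To run it on a real file, replace the sample rows with `load_predictions("your-preds.csv")` and pass the actual column names. The `label` column is ignored throughout, per the advice above.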

Wangkxkx commented 1 year ago

OK, thanks for your reply.