epicosy / devign

Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks
MIT License

Unable to reproduce the paper results! #13

Open Anwar-Said opened 2 years ago

Anwar-Said commented 2 years ago

Hi, I'm attempting to reproduce the paper's results using the provided datasets. On one dataset (under data/input/), I ran the model with the same experimental settings as described in the paper. However, the model learns nothing. I was hoping you could help me figure out what I'm missing. Thanks.

byhoson commented 2 years ago

Same problem here. I printed the output scores of the learned model with test data = train data, and it turns out that all scores are above 0.5 (i.e. everything is classified as 1). One thing I noticed is that examples with label 0 tend to have scores below 0.51, and those with label 1 tend to have scores above 0.51.
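A minimal sketch of that kind of check, assuming the scores and labels have been collected into plain Python lists. The function names and the values below are made up for illustration and are not taken from the Devign code or from an actual run:

```python
from statistics import mean


def summarize_scores(scores, labels, threshold=0.5):
    """Group scores by true label and report their range, mean,
    and the fraction that would be predicted as class 1."""
    summary = {}
    for label in sorted(set(labels)):
        group = [s for s, y in zip(scores, labels) if y == label]
        summary[label] = {
            "min": min(group),
            "max": max(group),
            "mean": mean(group),
            "frac_predicted_1": sum(s > threshold for s in group) / len(group),
        }
    return summary


def accuracy_at(scores, labels, threshold):
    """Accuracy when every score above `threshold` is predicted as 1."""
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical scores that mimic the pattern described above:
# everything is above 0.5, but the two labels split around 0.51.
scores = [0.502, 0.505, 0.508, 0.512, 0.515, 0.520]
labels = [0, 0, 0, 1, 1, 1]

summary = summarize_scores(scores, labels)
acc_default = accuracy_at(scores, labels, 0.5)
acc_shifted = accuracy_at(scores, labels, 0.51)

for label, stats in summary.items():
    print(label, stats)
print("accuracy at 0.50:", acc_default)
print("accuracy at 0.51:", acc_shifted)
```

If real scores showed a split like this, it would suggest they carry some signal but are compressed very close to 0.5, so accuracy at the default threshold looks like chance.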

NikolasBielski commented 2 years ago

After months of reviewing the data, I can tell you that the partial dataset is filled with errors, if not intentional ones. For example, in ffmpeg 27295 the method av_get_pixel_fmt() was later changed to av_get_pix_format(), which results in a different graph structure. The fact that so many people can't reproduce the results makes me think Devign should be dropped from related work. I will be switching to SARD and only SARD from now on. :)

yangyizu commented 1 year ago

@NikolasBielski Hi, where can I find the SARD dataset you mentioned? Thanks!

Tokoboen commented 1 month ago

> @NikolasBielski Hi, where can I find the SARD dataset you mentioned? Thanks!

You can find it at https://samate.nist.gov/SARD/.