ymirsky / VulChecker

A deep learning model for localizing bugs in C/C++ source code (USENIX'23)
GNU General Public License v3.0

some questions about the Juliet dataset #10

Open playmood opened 6 months ago

playmood commented 6 months ago

Hello, the paper does not seem to explain in detail how the Juliet dataset was constructed. Take CWE121, which consists of 4944 samples: is it composed of executable code fragments that contain the CWE (where several fragments may originate from the same project)? Or, given how difficult it is to collect real code snippets containing a CWE, was it constructed using some form of dataset generation?