Vul-LMGNN / vul-LMGGNN

Code for the paper - Source Code Vulnerability Detection: Combining Code Language Models and Code Property Graph
Apache License 2.0

Does the dataset in the raw folder need to be the original Jsonl dataset or does the CPG structure need to be extracted as well? Is it mandatory to include 'edge_type'? #5

Open 1453100406 opened 2 months ago

1453100406 commented 2 months ago

Type help or browse(help) to begin
joern> Would you like to save changes? (y/N)

Dataset chunk 0 not processed.
<class 'pandas.core.frame.DataFrame'>
Index: 497 entries, 2 to 3330
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   target  497 non-null    int64
 1   func    497 non-null    object
 2   Index   497 non-null    int64
 3   cpg     497 non-null    object
dtypes: int64(2), object(2)
memory usage: 439.0 KB
CPG cut - original nodes: 235 to max: 205
CPG cut - original nodes: 237 to max: 205
CPG cut - original nodes: 229 to max: 205
CPG cut - original nodes: 206 to max: 205
Traceback (most recent call last):
  File "run.py", line 190, in <module>
    Embed_generator()
  File "run.py", line 88, in Embed_generator
    cpg_dataset["input"] = cpg_dataset.apply(lambda row: process.nodes_to_input(row.nodes, row.target, context.nodes_dim,
  File "/root/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 9423, in apply
    return op.apply().finalize(self, method="apply")
  File "/root/miniconda3/lib/python3.8/site-packages/pandas/core/apply.py", line 678, in apply
    return self.apply_standard()
  File "/root/miniconda3/lib/python3.8/site-packages/pandas/core/apply.py", line 798, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/root/miniconda3/lib/python3.8/site-packages/pandas/core/apply.py", line 814, in apply_series_generator
    results[i] = self.f(v)
  File "run.py", line 88, in <lambda>
    cpg_dataset["input"] = cpg_dataset.apply(lambda row: process.nodes_to_input(row.nodes, row.target, context.nodes_dim,
TypeError: nodes_to_input() missing 1 required positional argument: 'edge_type'
root@autodl-container-b7ed44ad80-2b59cbac:~/autodl-tmp/vul-LMGGNN#
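For reference, a minimal sketch of how this kind of TypeError arises when a call site passes fewer positional arguments than the function defines. The function below is a hypothetical stand-in: its parameter list, the value 205 for the node limit, and the "Ast" edge type are assumptions for illustration, not the repository's actual process.nodes_to_input signature or settings.

```python
from types import SimpleNamespace

# Hypothetical stand-in: this parameter list is an assumption for illustration,
# not the repository's actual process.nodes_to_input signature.
def nodes_to_input(nodes, target, nodes_dim, keyed_vectors, edge_type):
    return {"nodes": nodes[:nodes_dim], "target": target, "edge_type": edge_type}

# Two toy rows standing in for DataFrame rows with .nodes and .target.
rows = [
    SimpleNamespace(nodes=["n1", "n2"], target=0),
    SimpleNamespace(nodes=["n3"], target=1),
]

# Call site that is one argument short: four arguments for five parameters.
# Python binds positionally, so the last parameter is the one reported missing.
try:
    [nodes_to_input(r.nodes, r.target, 205, None) for r in rows]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Passing the remaining argument by keyword makes the mismatch explicit.
# "Ast" is a placeholder value, not a confirmed setting.
fixed = [nodes_to_input(r.nodes, r.target, 205, None, edge_type="Ast") for r in rows]
```

If the real function follows the same pattern, the fix is to compare the call on line 88 of run.py with the function's definition and pass every parameter it expects. Which value edge_type should take is not stated in this thread.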

top1server commented 2 weeks ago

Have you fixed the error yet? I also got the same error due to missing 'keyed_vector'.
