for-just-we / VulDetectArtifact

Artifact for TOSEM

Preprocessing sequence representation not working #1

Closed nguyendacthienngan closed 1 month ago

nguyendacthienngan commented 1 month ago

I've trained the function detectors (Devign) successfully. But when I try to preprocess for the sequence representation, I run into this error: "Object of type SySeSlice is not JSON serializable"

My steps:

1. Extract VUL labels:
   !python 'dataset_process/extract_label.py' '/dataset/cwe125-source-code' '/preprocess/dataset_process/label/cwe125-source-code-label.json'

2. Parsing Source Files

   2.1. Graph data generation:
   !./joern-parse '/preprocess/dataset_process/parsed/' '/dataset/cwe125-source-code'
   !python extract_func_graph.py '/preprocess/dataset_process/parsed/cwe125-source-code' '/preprocess/dataset_process/func_graph/cwe125-source-code-func-graph.json'

   2.2. Sequence representation:
   !python program_slice.py '/preprocess/dataset_process/func_graph/cwe125-source-code-func-graph.json' '/preprocess/dataset_process/dumped_slices/cwe125-source-code-slices.json'

nguyendacthienngan commented 1 month ago

I have fixed it by using the 'toJson' function, so that each slice is converted to a JSON-serializable value before dumping.

From:
json.dump(list(total_syses), open(output_path, 'w', encoding='utf-8'), indent=2)

To:
json.dump([syse_slice.toJson() for syse_slice in total_syses], open(output_path, 'w', encoding='utf-8'), indent=2)
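
For anyone hitting the same error, here is a minimal, self-contained sketch of why the original call fails and why the fix works. The `SySeSlice` class below is only a hypothetical stand-in: its fields (`file_name`, `lines`, `label`) and the helper names `dump_slices` and `try_plain_dump` are invented for illustration and are not taken from the repository. The only thing assumed about the real class is that it has a `toJson()` method returning built-in types, as used in the fix above.

```python
import json


class SySeSlice:
    """Hypothetical stand-in for the repository's SySeSlice class.

    The field names are assumptions made for this example only.
    """

    def __init__(self, file_name, lines, label):
        self.file_name = file_name
        self.lines = lines
        self.label = label

    def toJson(self):
        # Return only built-in types (dict, list, str, int) that json can encode.
        return {"file": self.file_name, "lines": self.lines, "label": self.label}


def try_plain_dump(slices):
    """Return the TypeError message raised by dumping the raw objects, or None."""
    try:
        json.dumps(list(slices))
    except TypeError as exc:
        return str(exc)
    return None


def dump_slices(slices, output_path):
    """Write the slices to output_path as a JSON array of toJson() results."""
    # The context manager closes the file even if serialization fails.
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump([s.toJson() for s in slices], f, indent=2)
```

The `json` module only knows how to encode built-in types, so passing the objects themselves raises `TypeError`, while passing the result of `toJson()` does not. Under the same assumption, an alternative would be to keep `list(total_syses)` and pass `default=lambda o: o.toJson()` to `json.dump`, which calls the function for every object the encoder cannot handle on its own.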