github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License
7.36k stars 1.47k forks

Control Flow Analysis Visualization: Results generated by CodeQL are not easily understood by humans #16920

Open zouyi73 opened 1 week ago

zouyi73 commented 1 week ago

For control flow analysis (CFA), I used the following simple QL query:

/**
 * @name Control Flow Graph Visualization
 * @description This query identifies control flow nodes within a function and visualizes the control flow graph, helping to understand the flow of execution.
 * @kind graph
 * @id cpp/control-flow-graph-visualization
 * @problem.severity recommendation
 * @tags control-flow analysis
 * @precision high
 * @security-severity 0.0
 */

import cpp
import semmle.code.cpp.controlflow.internal.CFG

// Each result row is one control flow edge (start -> end) inside the selected function.
from Function f, ControlFlowNode start, ControlFlowNode end
where
  start.getControlFlowScope() = f and
  end.getControlFlowScope() = f and
  start.getASuccessor() = end and
  f.getFile().getBaseName() = "ip_output.c" and
  f.getName() = "__ip_append_data"
select start, end,
  "This is a control flow from " + start.getEnclosingStmt().toString() + " to " +
    end.getEnclosingStmt().toString() + " in function " + f.getQualifiedName()

Although I output the control flow of a specific function in formats such as SARIF, DOT, and DGML, the results are not easily understandable by humans. I want to ask if CodeQL provides any other methods for visualizing control flow that can be easily understood by both humans and, if possible, LLMs (large language models).

zouyi73 commented 1 week ago

I also found that a .dot file can be converted to a .png file, but the query I wrote does not seem to work.

ginsbach commented 1 week ago

Thank you for the question!

CodeQL has no built-in functionality for visualizing graphs. Instead, we recommend generating files in standard formats (e.g. DOT, DGML, as you mentioned above) and then relying on other tools to consume them. Compatible graph viewers are available as VS Code extensions, among others.
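As a rough illustration of the "standard format plus external tool" route, the sketch below turns a list of (start, end) edge labels into DOT text. It assumes the query results have been exported as a two-column CSV with a header row. That layout, the sample rows, and the helper names `quote`, `read_edges`, and `edges_to_dot` are hypothetical and are not part of CodeQL.

```python
import csv
import io


def quote(label):
    """Escape a label so it is a valid double-quoted DOT identifier."""
    return '"' + label.replace("\\", "\\\\").replace('"', '\\"') + '"'


def read_edges(text):
    """Parse CSV text with a header row into (start, end) label pairs."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row (assumed to be present)
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def edges_to_dot(rows, graph_name="cfg"):
    """Build DOT text from (start, end) label pairs, skipping duplicate edges."""
    lines = ["digraph " + graph_name + " {", "  node [shape=box];"]
    seen = set()
    for src, dst in rows:
        if (src, dst) in seen:
            continue
        seen.add((src, dst))
        lines.append("  " + quote(src) + " -> " + quote(dst) + ";")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample standing in for an exported result set.
SAMPLE_CSV = 'start,end\n"if (...) ...","return ..."\n"if (...) ...","ExprStmt"\n'

if __name__ == "__main__":
    print(edges_to_dot(read_edges(SAMPLE_CSV)))
```

If the output is saved as `cfg.dot`, Graphviz can render it with `dot -Tpng cfg.dot -o cfg.png`. Note that this sketch identifies nodes by their label alone, so two different statements with the same text collapse into one node; adding the source location to each label would avoid that.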

LLMs should do ok when given SARIF files with the schema.
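If the full SARIF file is too verbose to paste into a prompt, one option is to condense it first. The sketch below assumes the file follows the usual SARIF 2.1.0 layout (`runs` → `results` → `message.text` and `locations[].physicalLocation`) and reduces each result to one `file:line: message` line. The function name `summarize_sarif` and the sample document are hypothetical, for illustration only.

```python
import json


def summarize_sarif(sarif):
    """Return one 'uri:line: message' string per result in a SARIF document."""
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            text = result.get("message", {}).get("text", "")
            where = "?"
            locations = result.get("locations", [])
            if locations:
                # Only the first reported location is used in this sketch.
                phys = locations[0].get("physicalLocation", {})
                uri = phys.get("artifactLocation", {}).get("uri", "?")
                line = phys.get("region", {}).get("startLine", "?")
                where = "{}:{}".format(uri, line)
            lines.append("{}: {}".format(where, text))
    return lines


# Hypothetical minimal SARIF document; the file name and line are made up.
SAMPLE_SARIF = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "results": [{
      "message": {"text": "This is a control flow from A to B"},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "example.c"},
          "region": {"startLine": 10}
        }
      }]
    }]
  }]
}
""")

if __name__ == "__main__":
    for entry in summarize_sarif(SAMPLE_SARIF):
        print(entry)
```

A real file would be loaded with `json.load(open(path))` instead of the embedded sample; results without a location are kept and marked with `?`.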