github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License

How to write Metadata for queries to get Call Graph in JavaScript? #16587

Closed flyboss closed 1 week ago

flyboss commented 5 months ago

I want to get a call graph in JavaScript. I found a solution in #9458, but when I add query metadata to the query, for example

/**
 * This is an automatically generated file
 * @name MyCG
 * @kind problem
 * @precision high
 * @problem.severity warning
 * @id javascript/example/MyCG
 */

import javascript

query predicate nodes(Function func, string key, string value) {
  key = "semmle.label" and value = func.getName()
}

from DataFlow::InvokeNode call, Function caller, Function callee
where caller = call.getEnclosingExpr().getEnclosingFunction()
  and callee = call.getACallee()
select caller, callee, "invoke"

the VS Code CodeQL extension fails with this error:

Generating log summary using CodeQL CLI: generate log-summary -v --log-to-stderr --format=text --end-summary=c:\Users\91574\AppData\Roaming\Code\User\globalStorage\github.vscode-codeql\queries\cg-question.ql-_8DppXvbI2dgbapKszV27\evaluator-log-end.summary --sourcemap c:\Users\91574\AppData\Roaming\Code\User\globalStorage\github.vscode-codeql\queries\cg-question.ql-_8DppXvbI2dgbapKszV27\evaluator-log.jsonl c:\Users\91574\AppData\Roaming\Code\User\globalStorage\github.vscode-codeql\queries\cg-question.ql-_8DppXvbI2dgbapKszV27\evaluator-log.summary...
[2024-05-24 22:34:40] [PROGRESS] generate log-summary> Beginning to generate summary for query log located at C:\Users\91574\AppData\Roaming\Code\User\globalStorage\github.vscode-codeql\queries\cg-question.ql-_8DppXvbI2dgbapKszV27\evaluator-log.jsonl
[2024-05-24 22:34:40] [PROGRESS] generate log-summary> Continuing to generate log summary: have currently processed about 93% of input file
[2024-05-24 22:34:40] [PROGRESS] generate log-summary> Continuing to generate log summary: have currently processed about 98% of input file
[2024-05-24 22:34:40] [PROGRESS] generate log-summary> Finished generating log summary

CLI command succeeded.
Resolving query metadata using CodeQL CLI: resolve metadata -v --log-to-stderr --format json d:\01-code\vscode-codeql-starter\codeql-custom-queries-javascript\cg-question.ql...

CLI command succeeded.
Interpreting query results using CodeQL CLI: bqrs interpret -v --log-to-stderr --output c:\Users\91574\AppData\Roaming\Code\User\globalStorage\github.vscode-codeql\queries\cg-question.ql-_8DppXvbI2dgbapKszV27\interpretedResults.sarif --format sarifv2.1.0 -t=name=MyCG -t=kind=problem -t=precision=high -t=problem.severity=warning -t=id=javascript/example/MyCG --no-group-results --source-archive d:\anon-codeql-dbs\vanillajs\src.zip --source-location-prefix D:\anon\todomvc-master\examples\vanillajs --threads 1 --max-paths 4 c:\Users\91574\AppData\Roaming\Code\User\globalStorage\github.vscode-codeql\queries\cg-question.ql-_8DppXvbI2dgbapKszV27\results.bqrs...
[2024-05-24 22:34:40] Exception caught at top level: Could not process query metadata.
                      Error was: Expected result pattern(s) are not present for problem query: Expected exactly one pattern. [INVALID_RESULT_PATTERNS]
                      com.semmle.cli2.bqrs.InterpretCommand.lambda$executeSubcommand$1(InterpretCommand.java:167)
                      java.base/java.util.Optional.map(Unknown Source)
                      com.semmle.util.data.Result.getOrThrow(Result.java:31)
                      com.semmle.cli2.bqrs.InterpretCommand.executeSubcommand(InterpretCommand.java:163)
                      com.semmle.cli2.picocli.SubcommandCommon.lambda$executeSubcommandWithMessages$5(SubcommandCommon.java:863)
                      com.semmle.cli2.picocli.SubcommandCommon.withCompilationMessages(SubcommandCommon.java:442)
                      com.semmle.cli2.picocli.SubcommandCommon.executeSubcommandWithMessages(SubcommandCommon.java:861)
                      com.semmle.cli2.picocli.SubcommandCommon.executeWithParent(SubcommandCommon.java:708)
                      com.semmle.cli2.execute.CliServerCommand.lambda$executeSubcommand$0(CliServerCommand.java:69)
                      com.semmle.cli2.picocli.SubcommandMaker.runMain(SubcommandMaker.java:237)
                      com.semmle.cli2.execute.CliServerCommand.executeSubcommand(CliServerCommand.java:68)
                      com.semmle.cli2.picocli.SubcommandCommon.lambda$executeSubcommandWithMessages$5(SubcommandCommon.java:863)
                      com.semmle.cli2.picocli.SubcommandCommon.withCompilationMessages(SubcommandCommon.java:442)
                      com.semmle.cli2.picocli.SubcommandCommon.executeSubcommandWithMessages(SubcommandCommon.java:861)
                      com.semmle.cli2.picocli.SubcommandCommon.toplevelMain(SubcommandCommon.java:745)
                      com.semmle.cli2.picocli.SubcommandCommon.call(SubcommandCommon.java:726)
                      com.semmle.cli2.picocli.SubcommandMaker.runMain(SubcommandMaker.java:237)
                      com.semmle.cli2.picocli.SubcommandMaker.runMain(SubcommandMaker.java:257)
                      com.semmle.cli2.CodeQL.main(CodeQL.java:115)
A fatal error occurred: Could not process query metadata.
Error was: Expected result pattern(s) are not present for problem query: Expected exactly one pattern. [INVALID_RESULT_PATTERNS]
Child process exited with code 2
Last stdout was "

After reading the guide, I guessed that my select clause might be wrong, so I also tried 'select call, "call"', but I still got the same error.

Can anyone help me? Thank you very much!

jketema commented 5 months ago

Hi @flyboss,

For queries that specify the @kind problem property, the second element of the select should always be a string. See https://codeql.github.com/docs/writing-codeql-queries/defining-the-results-of-a-query/ for further details.

flyboss commented 5 months ago

Thank you for your immediate help!

I found that the query had three problems:

Now the query is:

/**
 * This is an automatically generated file
 * @name MyCG
 * @kind problem
 * @precision high
 * @problem.severity warning
 * @id javascript/example/my-cg
 */

import javascript

from DataFlow::InvokeNode call, Function caller, Function callee
where caller = call.getEnclosingExpr().getEnclosingFunction()
  and callee = call.getACallee()
select call, "invoke"

It works.

May I ask another question? Why did #9458 add query predicate nodes? Why does this statement have an impact on the select statement?

jketema commented 5 months ago

Why did #9458 add query predicate nodes?

Because the user is asking about producing a DOT file, and that requires information about nodes in addition to just the select.

Why does this statement have an impact on the select statement?

It doesn't. Note that no @kind property is specified there, which means that the restriction of the second element being a string does not apply.

aibaars commented 5 months ago

The metadata tag @kind tells CodeQL how to interpret the data produced by a query. The most frequently used kinds are problem and path-problem. The problem kind is for simple alerts, and the selected data should have two columns: a location and a text message. The text message may also contain $@ placeholders; for each placeholder there should be an additional location (link target) and text value (link text).
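As a rough illustration of the $@ placeholder form, here is a sketch that reuses the call-graph query from this thread. The @name, @id, and message text are made up for the example, and describe() is used for the link text only because getName() has no result for anonymous functions.

```ql
/**
 * @name Call graph edge with links (illustrative)
 * @kind problem
 * @problem.severity recommendation
 * @id javascript/example/call-graph-links
 */

import javascript

from DataFlow::InvokeNode call, Function caller, Function callee
where
  caller = call.getEnclosingExpr().getEnclosingFunction() and
  callee = call.getACallee()
// Column 1: alert location. Column 2: message with two $@ placeholders.
// Each placeholder consumes one (link target, link text) pair, in order.
select call, "Call from $@ to $@.", caller, caller.describe(), callee, callee.describe()
```

With two placeholders the select has six columns in total: the location, the message, and two target/text pairs.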

The select statement of a path-problem query should select similar columns, except that it has three location columns before the text message: the first is the alert location, and the other two are the start and end locations of the flow path. In addition to the select statement, a path-problem query requires two more query predicates, edges and nodes, which define the flow graph from which the displayed paths are picked.
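A minimal sketch of that shape, applied to the call graph from this thread, might look like the following. It has not been run against a database; the entry-point name "main", the @name, and the @id are assumptions chosen for illustration only.

```ql
/**
 * @name Functions reachable from an entry point (illustrative)
 * @kind path-problem
 * @problem.severity recommendation
 * @id javascript/example/reachable-functions
 */

import javascript

// One edge per caller/callee pair in the call graph.
query predicate edges(Function caller, Function callee) {
  exists(DataFlow::InvokeNode call |
    caller = call.getEnclosingExpr().getEnclosingFunction() and
    callee = call.getACallee()
  )
}

// Label each function so the path viewer has something to display.
query predicate nodes(Function f, string key, string value) {
  key = "semmle.label" and value = f.describe()
}

from Function source, Function sink
where
  source.getName() = "main" and // hypothetical entry point name
  edges+(source, sink) // sink is reachable through one or more calls
// Columns: alert location, path start, path end, message, then one $@ pair.
select sink, source, sink, "Reachable from $@.", source, source.describe()
```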

Another kind is graph which requires a query predicate nodes and a query predicate edges that define the nodes and edges of the graph.
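For the call graph asked about here, a graph query could be sketched roughly as below. The four-column form of edges (source, target, key, value) follows the convention used by the AST-printing queries; the edge label "calls", the @name, and the @id are assumptions for this example.

```ql
/**
 * @name Call graph (illustrative)
 * @kind graph
 * @id javascript/example/call-graph
 */

import javascript

// Every function becomes a node, labelled with a short description.
query predicate nodes(Function f, string key, string value) {
  key = "semmle.label" and value = f.describe()
}

// One labelled edge per caller/callee pair.
query predicate edges(Function caller, Function callee, string key, string value) {
  exists(DataFlow::InvokeNode call |
    caller = call.getEnclosingExpr().getEnclosingFunction() and
    callee = call.getACallee()
  ) and
  key = "semmle.label" and
  value = "calls"
}
```

No select statement is needed in this form, since the two query predicates carry all of the results.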

Fun facts: internally a select statement is roughly treated as query predicate #select(....) and for (path-)problem queries you can use query predicate problems(....) instead of a select statement.
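Going by the description above, the working query from earlier in this thread could presumably be written without a select statement, along these lines. This is an untested sketch; the @id is made up for the example.

```ql
/**
 * @name MyCG
 * @kind problem
 * @problem.severity warning
 * @id javascript/example/my-cg-predicate
 */

import javascript

// Same two columns a problem query selects: a location and a string message.
query predicate problems(DataFlow::InvokeNode call, string message) {
  exists(call.getACallee()) and
  message = "invoke"
}
```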

rvermeulen commented 1 week ago

Hi @flyboss,

I'm closing this issue. Besides the information provided by @jketema and @aibaars, the following blog post explains how to use path graphs to visualize a call graph for Java; the example can be translated to JavaScript.

If you have any further questions, feel free to re-open the issue.