joernio / joern

Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
https://joern.io/
Apache License 2.0

Exporting CPGs #3196

Open jeffyjeff2893 opened 1 year ago

jeffyjeff2893 commented 1 year ago

Is your feature request related to a problem? Please describe. I want to use Joern to extract CPGs as input for a machine learning model, but I'm finding it difficult to extract them and keep track of all the snippets.

Describe the solution you'd like Is it possible to parse a snippet and then write the CPG out to a single file? And is it possible to write the graph to a temporary file or to stdout?

itsacoderepo commented 1 year ago

Have you tried the steps in the documentation? https://docs.joern.io/export/

Basically, you can follow these steps (which can also be scripted):

Snippet:

~/b/j/joern-cli> cat /tmp/returntest/test.c
int func_with_multiple_returns (int x) {
  if (x > 10) {
    return 0;
  } else {
    return 1;
  }
}

Generating the CPG for the snippet, with the CPG file name passed via -o:

test@device ~/b/j/joern-cli> ./joern-parse /tmp/returntest/ -o returntest.bin.zip
Parsing code at: /tmp/returntest/ - language: `NEWC`
[+] Running language frontend
[...]
[+] Applying default overlays
Successfully wrote graph to: /home/test/bin/joern/joern-cli/returntest.bin.zip
To load the graph, type `joern /home/test/bin/joern/joern-cli/returntest.bin.zip`
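
If you want to script this step for many snippets, a rough Python sketch could look like the one below. It only wraps the `joern-parse <dir> -o <file>` call shown above. The function names, the `snippet.c` file name, and the default `./joern-parse` path are my own assumptions for illustration, not part of Joern.

```python
import subprocess
import tempfile
from pathlib import Path


def build_parse_command(src_dir, cpg_path, joern_parse="./joern-parse"):
    """Return the argument list for: joern-parse <src_dir> -o <cpg_path>."""
    return [str(joern_parse), str(src_dir), "-o", str(cpg_path)]


def parse_snippet(code, cpg_path, joern_parse="./joern-parse", filename="snippet.c"):
    """Write one snippet into its own temporary directory and parse it.

    The temporary directory holds only this snippet, so the resulting
    CPG file at cpg_path describes this snippet and nothing else.
    The helper names and defaults here are hypothetical.
    """
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / filename).write_text(code)
        cmd = build_parse_command(tmp, cpg_path, joern_parse)
        # check=True raises CalledProcessError if joern-parse exits non-zero.
        subprocess.run(cmd, check=True)
    return Path(cpg_path)
```

Calling `parse_snippet(code, "cpgs/0001.bin.zip")` once per snippet gives you one CPG file per snippet, and the file name can carry whatever ID you use to keep track of them.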

Loading the CPG for the snippet only:

 ~/b/j/joern-cli> ./joern
[...]
Version: 2.0.25
Type `help` to begin

joern> importCpg("/home/test/bin/joern/joern-cli/returntest.bin.zip")
[...]
val res0: Option[io.shiftleft.codepropertygraph.Cpg] = Some(value = Cpg (Graph [45 nodes]))

joern> cpg.method.name("func.*").dump
val res1: List[String] = List(
  """int func_with_multiple_returns (int x) { /* <=== func_with_multiple_returns */
  if (x > 10) {
    return 0;
  } else {
    return 1;
  }
}
"""
)
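
To get the graph out of Joern's own format and into something a model pipeline can read, the export page linked above describes the `joern-export` tool. The sketch below wraps that call in Python. I'm assuming the `--repr`, `--format`, and `--out` options as I remember them from that page, so please check `./joern-export --help` on your version. The function names and defaults are hypothetical.

```python
import subprocess
from pathlib import Path


def build_export_command(cpg_path, out_dir, representation="all",
                         fmt="graphml", joern_export="./joern-export"):
    """Return the argument list for a joern-export call (options assumed from the docs)."""
    return [
        str(joern_export),
        "--repr", representation,
        "--format", fmt,
        "--out", str(out_dir),
        str(cpg_path),
    ]


def export_cpg(cpg_path, out_dir, **kwargs):
    """Export one CPG file into a fresh output directory and list the files written."""
    out_dir = Path(out_dir)
    # Precaution (my assumption): use a directory that does not exist yet,
    # so results from different snippets never get mixed together.
    if out_dir.exists():
        raise FileExistsError(f"output directory already exists: {out_dir}")
    subprocess.run(build_export_command(cpg_path, out_dir, **kwargs), check=True)
    return sorted(p for p in out_dir.rglob("*") if p.is_file())
```

For example, `export_cpg("returntest.bin.zip", "exports/returntest")` should leave the exported graph files under `exports/returntest`. As far as I know the export always goes to a directory rather than stdout, so for a temp-file workflow you would point `out_dir` at a path inside a temporary directory.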