joernio / joern

Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
https://joern.io/
Apache License 2.0

[Bug] Parse JNI functions in C code #4695

Open yhryyq opened 1 week ago

yhryyq commented 1 week ago

Describe the bug: I'm encountering an issue when parsing C code that contains JNI functions.

To Reproduce: The C code:

#include "gif.h"

bool reset(GifInfo *info) {
    if (info->rewindFunction(info) != 0) {
        return false;
    }
    info->nextStartTime = 0;
    info->currentLoop = 0;
    info->currentIndex = 0;
    info->lastFrameRemainder = -1;
    return true;
}

__unused JNIEXPORT jboolean JNICALL
Java_pl_droidsonroids_gif_GifInfoHandle_reset(JNIEnv *__unused env, jclass __unused class, jlong gifInfo) {
    GifInfo *info = (GifInfo *) (intptr_t) gifInfo;
    if (info != NULL && reset(info)) {
        return JNI_TRUE;
    }
    return JNI_FALSE;
}

Steps to reproduce the behavior:

  1. ./joern-parse codefolder
  2. ./joern-export --repr cfg --out output

Expected behavior: I expect a digraph for the JNI function.

Output: In the output folder, I found 8 .dot files corresponding to various graph representations:

  digraph "<global>"
  digraph "reset"
  digraph "<global>"
  digraph "<operator>.notEquals"
  digraph "rewindFunction"
  digraph "<operator>.indirectFieldAccess"
  digraph "<operator>.assignment"
  digraph "<operator>.minus"

However, the output is missing the graph representation for the JNI function Java_pl_droidsonroids_gif_GifInfoHandle_reset.


max-leuthaeuser commented 1 week ago

Is gif.h available in that folder, and does it contain all the macros and defines used by that function?

yhryyq commented 1 week ago

I put gif.h in that folder, but it still doesn't work.

I see a similar result in the .dot file for <global>:

digraph "&lt;global&gt;" {
"48" [label = <(UNKNOWN,__unused JNIEXPORT,__unused JNIEXPORT)<SUB>14</SUB>> ]
"50" [label = <(UNKNOWN,L,L)<SUB>14</SUB>> ]
"51" [label = <(UNKNOWN,Java_pl_droidsonroids_gif_GifInfoHandle_reset(J...,Java_pl_droidsonroids_gif_GifInfoHandle_reset(J...)<SUB>15</SUB>> ]
"7" [label = <(METHOD,&lt;global&gt;)<SUB>1</SUB>> ]
"52" [label = <(METHOD_RETURN,ANY)<SUB>1</SUB>> ]
  "48" -> "50"
  "50" -> "51"
  "51" -> "52"
  "7" -> "48"
}

BTW, if I only want to parse a single C file, do I need to put all the associated header files in the same folder? For example, the gif.h file in this case. If gif.h includes some other header files, do I need to put those header files in the same folder as well?

max-leuthaeuser commented 1 week ago

If these header files contain the macros or defines required to parse your .c file correctly then you will have to put them in the same folder (or provide their location via the c2cpg --include argument). As you can see in your example Java_pl_droidsonroids_gif_GifInfoHandle_reset is parsed as UNKNOWN. That means the parser did not recognize it as a method. Most likely due to missing macro definitions or defines.

yhryyq commented 1 week ago

I attempted to place all the project's header files in the same folder as the C code, but the result was still unsuccessful. However, I noticed that gif.h contains the following includes:

#include <unistd.h>
#include <jni.h>
#include <time.h>
#include <stdio.h>
#include <limits.h>

Among these, <unistd.h>, <time.h>, <stdio.h>, and <limits.h> are part of the standard C library and are typically provided by the system.

However, <jni.h> is not part of the standard C library; it is the header file for the JNI. This file is usually provided with the JDK. Could this failure be caused by the inclusion of <jni.h>?

max-leuthaeuser commented 6 days ago

The question is where __unused JNIEXPORT jboolean JNICALL comes from. It looks like it's from jni.h. If you comment that line out, it should work. Btw: c2cpg has a --with-include-auto-discovery switch, which enables auto-discovery of system header include paths. You may also want to use --include <PATH> to add the path to jni.h on your system.

yhryyq commented 5 days ago

I think your suggestion makes sense, but the results are not as expected. First, I tried the simplest approach by commenting out __unused JNIEXPORT jboolean JNICALL, but the outcome remained unchanged. I then tried to include the path during the parsing process with the command ./joern-parse codefolder --include /usr/lib/jvm/java-17-openjdk-amd64/include/jni.h, but received the following error messages:

Error: Unknown option --include
Error: Unknown argument '/usr/lib/jvm/java-17-openjdk-amd64/include/jni.h'
Try --help for more information.
java.lang.RuntimeException: Error while not parsing command line options: `testjavac,--include,/usr/lib/jvm/java-17-openjdk-amd64/include/jni.h`
        at io.joern.joerncli.JoernParse$.parseConfig$$anonfun$1$$anonfun$1(JoernParse.scala:188)
        at scala.Option.getOrElse(Option.scala:201)
        at io.joern.joerncli.JoernParse$.parseConfig$$anonfun$1(JoernParse.scala:188)
        at scala.util.Try$.apply(Try.scala:210)
        at io.joern.joerncli.JoernParse$.parseConfig(JoernParse.scala:190)
        at io.joern.joerncli.JoernParse$.run(JoernParse.scala:74)
        at io.joern.joerncli.JoernParse$.main(JoernParse.scala:20)
        at io.joern.joerncli.JoernParse.main(JoernParse.scala)

Through --help, I found the following options:

  -o, --output <value>   output filename
  --language <value>     source language
  --list-languages       list available language options
  --namespaces <value>   namespaces to include: comma separated string
Overlay application stage
  --nooverlays           do not apply default overlays
  --overlaysonly         Only apply default overlays
  --max-num-def <value>  Maximum number of definitions in per-method data flow calculation
Misc
  --help                 display this help message
Args specified after the --frontend-args separator will be passed to the front-end verbatim

There is no --include option. I tried using --namespaces <Path>, and while the parsing process was completed without errors, the result was still UNKNOWN. Could this be due to an outdated version of Joern?

Thanks for your patience and help!

max-leuthaeuser commented 5 days ago

--with-include-auto-discovery and --include are arguments for c2cpg. You have to pass them after the --frontend-args separator. (And I think --include expects a folder)

max-leuthaeuser commented 5 days ago

Two more things:

So the __unused macro and the JNIEXPORT / JNICALL stuff is the problem here.
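
One way to picture the diagnosis above: once __unused, JNIEXPORT, and JNICALL expand to something the parser understands, the JNI entry point is an ordinary C function definition. The sketch below is a hedged illustration, not a fix confirmed in this thread. It uses empty stand-in macros and simplified stand-in types in place of the real <jni.h>, and a hypothetical minimal GifInfo (gif.h is not shown in the thread, so the field types are guesses). stub_rewind is likewise a made-up helper, present only so the file can be exercised on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for names that normally come from <jni.h> or platform headers.
 * Assumption: these simplified definitions only make the declaration
 * parseable; they are NOT the real JNI definitions. */
#ifndef __unused
#define __unused   /* expands to nothing */
#endif
#ifndef JNIEXPORT
#define JNIEXPORT  /* expands to nothing */
#endif
#ifndef JNICALL
#define JNICALL    /* expands to nothing */
#endif
#ifndef JNI_TRUE
#define JNI_TRUE 1
#define JNI_FALSE 0
#endif

typedef unsigned char jboolean; /* simplified stand-in */
typedef int64_t jlong;          /* simplified stand-in */
typedef void *jclass;           /* simplified stand-in */
typedef void *JNIEnv;           /* simplified stand-in */

/* Hypothetical minimal GifInfo: only the fields the issue's code touches. */
typedef struct GifInfo GifInfo;
struct GifInfo {
    int (*rewindFunction)(GifInfo *);
    long long nextStartTime;
    int currentLoop;
    int currentIndex;
    long long lastFrameRemainder;
};

/* Same logic as reset() in the issue. */
bool reset(GifInfo *info) {
    if (info->rewindFunction(info) != 0) {
        return false;
    }
    info->nextStartTime = 0;
    info->currentLoop = 0;
    info->currentIndex = 0;
    info->lastFrameRemainder = -1;
    return true;
}

/* With the macros defined, this line is a plain function definition. */
__unused JNIEXPORT jboolean JNICALL
Java_pl_droidsonroids_gif_GifInfoHandle_reset(JNIEnv *__unused env, jclass __unused class, jlong gifInfo) {
    GifInfo *info = (GifInfo *) (intptr_t) gifInfo;
    if (info != NULL && reset(info)) {
        return JNI_TRUE;
    }
    return JNI_FALSE;
}

/* Hypothetical rewind callback that always reports success. */
int stub_rewind(GifInfo *info) {
    (void) info;
    return 0;
}
```

In a real project the definitions would come from the actual headers, located via the --include and --with-include-auto-discovery arguments mentioned earlier in the thread. The stand-ins here only show why the unresolved macros matter.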