joernio / joern

Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
https://joern.io/
Apache License 2.0
1.96k stars, 267 forks

[Bug] `--language <value>` option in `joern-parse` is useless. #4785

Closed jiezhuzzz closed 1 month ago

jiezhuzzz commented 1 month ago

Describe the bug

The `--language <value>` option in `joern-parse` appears to have no effect: `joern-parse` still goes by the file extension when parsing files.

To Reproduce

Steps to reproduce the behavior:

  1. Create a file `code.c`:
    #include <stdio.h>
    int main() {
        int a = 1;
        printf("a = %d", a);
    }
  2. Run `joern-parse --language newc code.c` and `joern-export --repr all -o withc`.
  3. Rename the file with `mv code.c code.java`, then export the graph as before: `joern-parse --language newc code.java` and `joern-export --repr all -o withjava`.
  4. The two sets of exported dot files are completely different.

Expected behavior

`joern-parse` ignores the file extension and uses `--language` to determine which frontend to invoke.

max-leuthaeuser commented 1 month ago

It works as expected. If you run `joern-parse` on a `.java` file with `--language newc`, you will end up with a mostly empty CPG, because c2cpg won't touch any `.java` file. That's why your dot files are different.

The --language flag itself selects the frontend to run but that frontend will only handle files related to the language.
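To illustrate the behavior described here, a minimal Python sketch of the idea follows. It is not Joern's actual code: the function name `files_seen_by_frontend` and the abridged extension table are assumptions made up for the example, and the real frontends define their own file lists.

```python
from pathlib import Path

# Hypothetical, abridged mapping from --language value to the file
# extensions that frontend picks up. Illustrative only; not taken
# from Joern's source.
FRONTEND_EXTENSIONS = {
    "newc": {".c", ".h"},
    "javasrc": {".java"},
}


def files_seen_by_frontend(language, paths):
    """Return the subset of paths the selected frontend would process.

    The language value only chooses the frontend; the frontend then
    keeps the files whose extension belongs to its language.
    """
    extensions = FRONTEND_EXTENSIONS[language]
    return [p for p in paths if Path(p).suffix in extensions]


# Same frontend, same content, different extension:
print(files_seen_by_frontend("newc", ["code.c"]))     # ['code.c']
print(files_seen_by_frontend("newc", ["code.java"]))  # []
```

In this model the second call returns nothing to parse, which corresponds to the mostly empty CPG seen after renaming `code.c` to `code.java`.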

jiezhuzzz commented 1 month ago

Understood, thanks for your reply! I thought that could be a bug rather than a default behavior.