joernio / joern

Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
https://joern.io/
Apache License 2.0
1.89k stars 258 forks source link

[Bug] wrong LINE_NUMBER_END in cpg .dot files #4745

Closed wildoranges closed 4 days ago

wildoranges commented 2 weeks ago

Describe the bug For the following c++ code:

click to show the c++ code ```c++ /* TEMPLATE GENERATED TESTCASE FILE Filename: CWE122_Heap_Based_Buffer_Overflow__c_CWE806_char_snprintf_33.cpp Label Definition File: CWE122_Heap_Based_Buffer_Overflow__c_CWE806.label.xml Template File: sources-sink-33.tmpl.cpp */ /* * @description * CWE: 122 Heap Based Buffer Overflow * BadSource: Initialize data as a large string * GoodSource: Initialize data as a small string * Sinks: snprintf * BadSink : Copy data to string using snprintf * Flow Variant: 33 Data flow: use of a C++ reference to data within the same function * * */ #include "std_testcase.h" #include #ifdef _WIN32 #define SNPRINTF _snprintf #else #define SNPRINTF snprintf #endif namespace CWE122_Heap_Based_Buffer_Overflow__c_CWE806_char_snprintf_33 { #ifndef OMITBAD void bad() { char * data; char * &dataRef = data; data = (char *)malloc(100*sizeof(char)); if (data == NULL) {exit(-1);} /* FLAW: Initialize data as a large buffer that is larger than the small buffer used in the sink */ memset(data, 'A', 100-1); /* fill with 'A's */ data[100-1] = '\0'; /* null terminate */ { char * data = dataRef; { char dest[50] = ""; /* POTENTIAL FLAW: Possible buffer overflow if data is larger than dest */ SNPRINTF(dest, strlen(data), "%s", data); printLine(data); free(data); } } } #endif /* OMITBAD */ #ifndef OMITGOOD /* goodG2B() uses the GoodSource with the BadSink */ static void goodG2B() { char * data; char * &dataRef = data; data = (char *)malloc(100*sizeof(char)); if (data == NULL) {exit(-1);} /* FIX: Initialize data as a small buffer that as small or smaller than the small buffer used in the sink */ memset(data, 'A', 50-1); /* fill with 'A's */ data[50-1] = '\0'; /* null terminate */ { char * data = dataRef; { char dest[50] = ""; /* POTENTIAL FLAW: Possible buffer overflow if data is larger than dest */ SNPRINTF(dest, strlen(data), "%s", data); printLine(data); free(data); } } } void good() { goodG2B(); } #endif /* OMITGOOD */ } /* close namespace */ /* Below is the main(). It is only used when building this testcase on * its own for testing or for building a binary to use in testing binary * analysis tools. It is not used when compiling all the testcases as one * application, which is how source code analysis tools are tested. */ #ifdef INCLUDEMAIN using namespace CWE122_Heap_Based_Buffer_Overflow__c_CWE806_char_snprintf_33; /* so that we can use good and bad easily */ int main(int argc, char * argv[]) { /* seed randomness */ srand( (unsigned)time(NULL) ); #ifndef OMITGOOD printLine("Calling good()..."); good(); printLine("Finished good()"); #endif /* OMITGOOD */ #ifndef OMITBAD printLine("Calling bad()..."); bad(); printLine("Finished bad()"); #endif /* OMITBAD */ return 0; } #endif ```

Joern generates the following cpg .dot file:

click to show the cpg .dot file(part) ```dot ... 74 [label=METHOD COLUMN_NUMBER=1 LINE_NUMBER=58 COLUMN_NUMBER_END=12 IS_EXTERNAL=false SIGNATURE="void()" NAME="goodG2B" AST_PARENT_TYPE="TYPE_DECL" AST_PARENT_FULL_NAME="CWE122_Heap_Based_Buffer_Overflow__c_CWE806_char_snprintf_33.cpp:" ORDER=2 CODE="static void goodG2B() { char * data; char * &dataRef = data; data = (char *)malloc(100*sizeof(char)); if (data == NULL) {exit(-1);} /* FIX: Initialize data as a small buffer that as small or smaller than the small buffer used in the sink */ memset(data, 'A', 50-1); /* fill with 'A's */ data[50-1] = '\\0'; /* null terminate */ { char * data = dataRef; { char dest[50] = \"\"; /* POTENTIAL FLAW: Possible buffer overflow if data is larger than dest */ SNPRINTF(dest, strlen(data), \"%s\", data); printLine(data); free(data); } } }" FULL_NAME="CWE122_Heap_Based_Buffer_Overflow__c_CWE806_char_snprintf_33.goodG2B:void()" LINE_NUMBER_END=72 FILENAME="CWE122_Heap_Based_Buffer_Overflow__c_CWE806_char_snprintf_33.cpp"] ... ```

For function goodG2B, the correct LINE_NUMBER_END should be 77, but the LINE_NUMBER_END in the .dot file is 72. Is this a joern issue? Thanks.

To Reproduce Steps to reproduce the behavior:

  1. run joern-parse command for the c++ code above
  2. run joern-export command to export cpg .dot files
  3. see the _global_.dot file

Expected behavior Joern generates correct LINE_NUMBER_END.

Desktop (please complete the following information):

max-leuthaeuser commented 2 weeks ago

There are a lot of ifdefs etc. Not sure if that causes the issue. Is the line number correct if you parse that function standalone?

wildoranges commented 2 weeks ago

The LINE_NUMBER_END is correct if I parse the goodG2B function standalone:

test.cpp:

static void goodG2B()
{
    char * data;
    char * &dataRef = data;
    data = (char *)malloc(100*sizeof(char));
    if (data == NULL) {exit(-1);}
    /* FIX: Initialize data as a small buffer that as small or smaller than the small buffer used in the sink */
    memset(data, 'A', 50-1); /* fill with 'A's */
    data[50-1] = '\0'; /* null terminate */
    {
        char * data = dataRef;
        {
            char dest[50] = "";
            /* POTENTIAL FLAW: Possible buffer overflow if data is larger than dest */
            SNPRINTF(dest, strlen(data), "%s", data);
            printLine(data);
            free(data);
        }
    }
}

_global_.dot:

...
  8 [label=METHOD COLUMN_NUMBER=1 LINE_NUMBER=1 COLUMN_NUMBER_END=1 IS_EXTERNAL=false SIGNATURE="void()" NAME="goodG2B" AST_PARENT_TYPE="TYPE_DECL" AST_PARENT_FULL_NAME="test.cpp:<global>" ORDER=1 CODE="static void goodG2B()
{
    char * data;
    char * &dataRef = data;
    data = (char *)malloc(100*sizeof(char));
    if (data == NULL) {exit(-1);}
    /* FIX: Initialize data as a small buffer that as small or smaller than the small buffer used in the sink */
    memset(data, 'A', 50-1); /* fill with 'A's */
    data[50-1] = '\\0'; /* null terminate */
    {
        char * data = dataRef;
        {
            char dest[50] = \"\";
            /* POTENTIAL FLAW: Possible buffer overflow if data is larger than dest */
            SNPRINTF(dest, strlen(data), \"%s\", data);
            printLine(data);
            free(data);
        }
    }
}" FULL_NAME="goodG2B:void()" LINE_NUMBER_END=20 FILENAME="test.cpp"]
...
wildoranges commented 5 days ago

Are there any updates for this issue?

max-leuthaeuser commented 4 days ago

Could you check the latest release? https://github.com/joernio/joern/pull/4766 fixed it. It shows 77 as line number end for me now.

wildoranges commented 4 days ago

Thanks, this issue has been fixed in the latest release.