GrieferAtWork / tpp

Tiny PreProcessor

Crash in cleanup_keyword #4

Closed GrieferAtWork closed 7 months ago

GrieferAtWork commented 7 months ago

@asmwarrior wrote in #1


OK, I can build the sample project now, but it crashes when it reaches this line:

TPPLexer_Quit(TPPLexer_Current);

Here is the full source code of main.cpp:

#include <iostream>

using namespace std;

#include "mywrapper-for-tpp.h"

#include <string.h> // for the strlen() function call
#define ERROR_HANDLING return(0)

int main()
{

#if !TPP_CONFIG_ONELEXER
    TPPLexer_Current = (struct TPPLexer*)malloc(sizeof(struct TPPLexer));
    if(!TPPLexer_Current)
        ERROR_HANDLING;
#endif
    if(!TPPLexer_Init(TPPLexer_Current))
        ERROR_HANDLING;

    /* Configure `TPPLexer_Current' to your liking */
    // TPPLexer_Current->l_flags = ...;
    // TPPLexer_Current->l_extokens = ...;

    /* -DMY_MACRO=42 */
    char const* name = "MY_MACRO";
    char const* val = "42";
    if(!TPPLexer_Define(name, strlen(name), val, strlen(val), TPPLEXER_DEFINE_FLAG_NONE))
        ERROR_HANDLING;

    /* -I/usr/include */
    char* incpath = strdup("F:/code/test-prep/tpp/test");
    if(!TPPLexer_AddIncludePath(incpath, strlen(incpath)))
        ERROR_HANDLING;
    free(incpath);

    char* inputFilename = "input.c";

    /* Push an initial file onto the #include-stack */
    struct TPPFile* file = TPPLexer_OpenFile(
                               TPPLEXER_OPENFILE_MODE_NORMAL | TPPLEXER_OPENFILE_FLAG_CONSTNAME,
                               inputFilename, strlen(inputFilename), NULL);
    if(!file)
        ERROR_HANDLING;
    TPPLexer_PushFileInherited(file);

    /* Process input one token at a time.
     * Hint: emission of certain tokens depends on `TPPLEXER_FLAG_WANT*' and `TPPLEXER_TOKEN_*' */
    while(TPPLexer_Yield() > 0)
    {
        int id = TPPLexer_Current->l_token.t_id;
        char* tokstr = TPPLexer_Current->l_token.t_begin;
        size_t toklen = (size_t)(TPPLexer_Current->l_token.t_end - tokstr);
        printf("token: %d: '%.*s'\n", id, (int)toklen, tokstr);
    }

    /* Check if something went wrong (stuff like `#error' directives, or syntax errors) */
    if((TPPLexer_Current->l_flags & TPPLEXER_FLAG_ERROR) ||
            (TPPLexer_Current->l_errorcount != 0))
        ERROR_HANDLING;

    /* Cleanup the lexer (must be called after a successful `TPPLexer_Init()') */
    TPPLexer_Quit(TPPLexer_Current);
#if !TPP_CONFIG_ONELEXER
    free(TPPLexer_Current);
#endif

    cout << "Hello world!" << endl;
    return 0;
}

The content of "input.c" is very simple; I just wrote:

#define HI 5
printf(HI);

I see that the tokens were printed in the console window:

token: 801: 'printf'
token: 40: '('
token: 48: '5'
token: 41: ')'
token: 59: ';'

The call stack is below:

#0  0x00007fff8678f2d3 in ntdll!RtlIsZeroMemory () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x00007fff86798092 in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x00007fff8679837a in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll
#3  0x00007fff8679e001 in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll
#4  0x00007fff866b6625 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#5  0x00007fff866b5b74 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#6  0x00007fff866b47b1 in ntdll!RtlFreeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#7  0x00007fff8678943a in ntdll!RtlRegisterSecureMemoryCacheCallback () from C:\WINDOWS\SYSTEM32\ntdll.dll
#8  0x00007fff866b5cc1 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#9  0x00007fff866b5b74 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#10 0x00007fff866b47b1 in ntdll!RtlFreeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#11 0x00007fff84fa9c9c in msvcrt!free () from C:\WINDOWS\System32\msvcrt.dll
#12 0x00007ff7f7e82ef3 in TPPFile_Destroy (self=0x1b4600) at F:\code\test-prep\tpp\src\tpp.c:2087
#13 0x00007ff7f7e8cc4d in cleanup_keyword (self=0x1b4590) at F:\code\test-prep\tpp\src\tpp.c:5514
#14 0x00007ff7f7e8cf45 in destroy_keyword_map (self=0x7ff7f7ec6098 <TPPLexer_Global+88>) at F:\code\test-prep\tpp\src\tpp.c:5563
#15 0x00007ff7f7e8e765 in TPPLexer_Quit (self=0x7ff7f7ec6040 <TPPLexer_Global>) at F:\code\test-prep\tpp\src\tpp.c:6268
#16 0x00007ff7f7e8168c in main () at F:\code\test-prep\main.cpp:64

It crashes in the function:

https://github.com/GrieferAtWork/tpp/blob/1ca163d5413d8c0e9323ca0e5705aec8f32ce956/src/tpp.c#L2087

Any ideas?

Thanks.

GrieferAtWork commented 7 months ago

Any ideas?

Honestly: no. Because I can't replicate it. If I just take your source code, remove the c++ stuff and add <stdio.h> at the top and replace "mywrapper-for-tpp.h" with "tpp.h", it works, and there are no crashes or memory leaks. The only idea I might have would be that it's something you're doing in your "mywrapper-for-tpp.h" (which is the one file you didn't post).

Greetings, GrieferAtWork


For reference, to test your problem I replaced the contents of "frontend.c" with the following and then created a file input.c with the contents you listed:

#include "tpp.h"

#include <stdio.h>
#include <string.h>
#define ERROR_HANDLING return(0)

int main()
{

#if !TPP_CONFIG_ONELEXER
    TPPLexer_Current = (struct TPPLexer*)malloc(sizeof(struct TPPLexer));
    if(!TPPLexer_Current)
        ERROR_HANDLING;
#endif
    if(!TPPLexer_Init(TPPLexer_Current))
        ERROR_HANDLING;

    /* Configure `TPPLexer_Current' to your liking */
    // TPPLexer_Current->l_flags = ...;
    // TPPLexer_Current->l_extokens = ...;

    /* -DMY_MACRO=42 */
    char const* name = "MY_MACRO";
    char const* val = "42";
    if(!TPPLexer_Define(name, strlen(name), val, strlen(val), TPPLEXER_DEFINE_FLAG_NONE))
        ERROR_HANDLING;

    /* -I/usr/include */
    char* incpath = strdup("F:/code/test-prep/tpp/test");
    if(!TPPLexer_AddIncludePath(incpath, strlen(incpath)))
        ERROR_HANDLING;
    free(incpath);

    char* inputFilename = "input.c";

    /* Push an initial file onto the #include-stack */
    struct TPPFile* file = TPPLexer_OpenFile(
                               TPPLEXER_OPENFILE_MODE_NORMAL | TPPLEXER_OPENFILE_FLAG_CONSTNAME,
                               inputFilename, strlen(inputFilename), NULL);
    if(!file)
        ERROR_HANDLING;
    TPPLexer_PushFileInherited(file);

    /* Process input one token at a time.
     * Hint: emission of certain tokens depends on `TPPLEXER_FLAG_WANT*' and `TPPLEXER_TOKEN_*' */
    while(TPPLexer_Yield() > 0)
    {
        int id = TPPLexer_Current->l_token.t_id;
        char* tokstr = TPPLexer_Current->l_token.t_begin;
        size_t toklen = (size_t)(TPPLexer_Current->l_token.t_end - tokstr);
        printf("token: %d: '%.*s'\n", id, (int)toklen, tokstr);
    }

    /* Check if something went wrong (stuff like `#error' directives, or syntax errors) */
    if((TPPLexer_Current->l_flags & TPPLEXER_FLAG_ERROR) ||
            (TPPLexer_Current->l_errorcount != 0))
        ERROR_HANDLING;

    /* Cleanup the lexer (must be called after a successful `TPPLexer_Init()') */
    TPPLexer_Quit(TPPLexer_Current);
#if !TPP_CONFIG_ONELEXER
    free(TPPLexer_Current);
#endif

    return 0;
}

I then compiled with both msvc and gcc (from cygwin; as gcc frontend.c tpp.c && .\a.exe), and it ran without problems.

asmwarrior commented 7 months ago

Here is the file mywrapper-for-tpp.h

#define TPP_USERDEFS "my-custom-tpp-defs.h"
#include "tpp.h"

And here is the file my-custom-tpp-defs.h


/* Custom keywords (but be careful not to re-define ones already defined by TPP)
 * When parsed, you can check for these keywords like:
 * >> switch (TPPLexer_Current->l_token.t_id) {
 * >> case KWD_async:
 * >>     ...;
 * >>     break;
 * >> case KWD_function:
 * >>     ...;
 * >>     break;
 * >> }
 *  */
DEF_K(async)
DEF_K(function)
DEF_K(def)
DEF_K(awesome_keyword)

/* A pre-defined macro (as in `#ifdef __MY_PREDEFINED_MACRO__')
 * This should be used for stuff like `__GNUC__' or `__cplusplus', etc... */
PREDEFINED_MACRO_IF(__MY_PREDEFINED_MACRO__, "42", should_be_defined ? 1 : 0)

/* Custom warnings groups. (see "tpp-defs.inl" for the default groups)
 * Warnings can be enabled/disabled on a per-group basis by parsed text:
 * >> #pragma warning("-Wmygroup")    // Enable
 * >> #pragma warning("-Wno-mygroup") // Disable
 * >> #pragma GCC diagnostic error "-Wmygroup"
 * >> #pragma GCC diagnostic warning "-Wmygroup"
 * >> #pragma GCC diagnostic ignored "-Wmygroup"
 */
WGROUP(WG_MYGROUP, "mygroup", WSTATE_FATAL) // Warnings controlled by "-Wmygroup" / "-Wno-mygroup" / ...

/* Custom warnings.
 * Your compiler would trigger this like (note: it's a varargs function):
 * >> if (!TPPLexer_Warn(W_MYWARNING, "first variable argument"))
 * >>     HANDLE_AS_CRITICAL_ERROR;
 * >> TRY_TO_CONTINUE_COMPILING;
 */
DEF_WARNING(W_MYWARNING, (WG_MYGROUP, WG_SYNTAX), WSTATE_ERROR, {
    char *mesg = ARG(char *);
    WARNF("My warning handler: %s",mesg);
})

/* Custom extension. Input code can enable/disable this by:
 * >> #pragma extension("-fawesome")    // Turn on
 * >> #pragma extension("-fno-awesome") // Turn off
 * Your compiler can check if it's enabled with `TPPLexer_HasExtension(EXT_AWESOME)' */
EXTENSION(EXT_AWESOME, "awesome", enabled_by_default ? 1 : 0)

Those two files are mainly copied from your comments from #1 .

I debugged for a while and found that the crash happens in this code; see the screenshot from the Code::Blocks IDE on Windows 10 with msys2/mingw64 gcc/g++:

image

It is really strange: where does the macro definition "M" come from?

asmwarrior commented 7 months ago

For reference, to test your problem I replaced the contents of "frontend.c" with the following and then created a file input.c with the contents you listed: I then compiled with both msvc and gcc (from cygwin; as gcc frontend.c tpp.c && .\a.exe), and it ran without problems.

I followed the same steps. If you run tpp.exe from the command line, you see the printed tokens and the program exits, but you can't tell whether it crashed or exited normally.

But if you run it under the gdb debugger, you can see the crash. Here is the log:

F:\code\test-prep\tpp>gdb tpp.exe
GNU gdb (GDB) 14.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from tpp.exe...
(gdb) r
Starting program: F:\code\test-prep\tpp\tpp.exe
[New Thread 1560.0x3c64]
token: 800: 'printf'
token: 40: '('
token: 48: '6'
token: 41: ')'
token: 59: ';'
warning: Critical error detected c0000374

Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
0x00007fff8678f2d3 in ntdll!RtlIsZeroMemory () from C:\WINDOWS\SYSTEM32\ntdll.dll
(gdb) bt
#0  0x00007fff8678f2d3 in ntdll!RtlIsZeroMemory () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x00007fff86798092 in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x00007fff8679837a in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll
#3  0x00007fff8679e001 in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll
#4  0x00007fff866b62b7 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#5  0x00007fff866b5b74 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#6  0x00007fff866b47b1 in ntdll!RtlFreeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#7  0x00007fff8678943a in ntdll!RtlRegisterSecureMemoryCacheCallback () from C:\WINDOWS\SYSTEM32\ntdll.dll
#8  0x00007fff866b5cc1 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#9  0x00007fff866b5b74 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#10 0x00007fff866b47b1 in ntdll!RtlFreeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#11 0x00007fff84fa9c9c in msvcrt!free () from C:\WINDOWS\System32\msvcrt.dll
#12 0x00007ff76091e837 in TPPLexer_Quit (self=0x7ff760956040 <TPPLexer_Global>) at src/tpp.c:6290
#13 0x00007ff760911678 in main () at src/frontend.c:60
(gdb)

You see, the crash still happens in the same function. Note that I have changed the make.sh file to enable debug information, as below:

diff --git a/make.sh b/make.sh
index 04d8225..c4fd658 100644
--- a/make.sh
+++ b/make.sh
@@ -1,6 +1,6 @@
 #!/bin/bash

-CFLAGS=""
+CFLAGS="-g -O0"

 # I told you this is a ~tiny~ preprocessor.
 # >> No dependencies other than libc & only 2 source files.

GrieferAtWork commented 7 months ago

All right: so the first thing I'm seeing is that you're not doing #define TPP_USERDEFS "my-custom-tpp-defs.h" when building tpp.c (because if you did, you'd have gotten some missing-symbol errors).

It is really strange: where does the macro definition "M" come from?

That's actually normal and due to the fact that "k_name" is defined as char k_name[1] (I'm not using flexible arrays for the sake of compatibility with older versions of C). The actual length is "8", as seen in k_size, and the macro here is actually just the result of this part from the definitions (see how "MY_MACRO" has 8 characters and starts with a "M"):

    /* -DMY_MACRO=42 */
    char const* name = "MY_MACRO";
    char const* val = "42";
    if(!TPPLexer_Define(name, strlen(name), val, strlen(val), TPPLEXER_DEFINE_FLAG_NONE))
        ERROR_HANDLING;

But even when I do omit the inclusion of custom defs when compiling "tpp.c", I don't get any crashes. I've added a second (more complex) sample that uses custom TPP defs, but even with that I don't get any crashes, even when I comment out https://github.com/GrieferAtWork/tpp/blob/4feb0ed88968014c27b37c6c8dc743fbbd8b6a67/samples/advanced/mywrapper-for-tpp.c#L1 to get the same situation you're having as a result of not defining tpp defs when building its .c file.

For reference, here are 3 runs, all with GDB and no errors anywhere:

simple:

E:\c\dexmon\deemon\src\tpp\samples\simple>make a.exe
gcc -o a.exe main.c ../../src/tpp.c

E:\c\dexmon\deemon\src\tpp\samples\simple>gdb a.exe
GNU gdb (GDB) (Cygwin 13.2-1) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.exe...
(gdb) run
Starting program: /cygdrive/e/c/dexmon/deemon/src/tpp/samples/simple/a.exe
[New Thread 6664.0x2ea8]
[New Thread 6664.0x4a94]
[New Thread 6664.0x1360]
It's working!
token: 806: 'printf'
token: 40: '('
token: 34: '"x = %d"'
token: 44: ','
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 43: '+'
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 258: '=='
token: 48: '104'
token: 41: ')'
token: 59: ';'
[Thread 6664.0x4ca0 exited with code 0]
[Thread 6664.0x2ea8 exited with code 0]
[Thread 6664.0x4a94 exited with code 0]
[Inferior 1 (process 6664) exited normally]
(gdb) quit

advanced (without commenting out the line):

E:\c\dexmon\deemon\src\tpp\samples\advanced>make a.exe
gcc -o a.exe main.c mywrapper-for-tpp.c

E:\c\dexmon\deemon\src\tpp\samples\advanced>gdb a.exe
GNU gdb (GDB) (Cygwin 13.2-1) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.exe...
(gdb) run
Starting program: /cygdrive/e/c/dexmon/deemon/src/tpp/samples/advanced/a.exe
[New Thread 9488.0x3078]
[New Thread 9488.0x2644]
[New Thread 9488.0x48f0]
[New Thread 9488.0x123c]
It's working!
token: 816: 'printf'
token: 40: '('
token: 34: '"x = %d"'
token: 44: ','
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 43: '+'
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 258: '=='
token: 48: '104'
token: 41: ')'
token: 59: ';'
token: 817: 'IS'
token: 818: 'PREDEFINED'
token: 819: 'awesome'
token: 820: 'Its'
token: 45: '-'
token: 814: 'a'
token: 45: '-'
token: 821: 'builtin'
token: 33: '!'
[Thread 9488.0x519c exited with code 0]
[Thread 9488.0x3078 exited with code 0]
[Thread 9488.0x2644 exited with code 0]
[Thread 9488.0x48f0 exited with code 0]
[Inferior 1 (process 9488) exited normally]
(gdb) quit

advanced (with commenting out the line):

E:\c\dexmon\deemon\src\tpp\samples\advanced>make a.exe
gcc -o a.exe main.c mywrapper-for-tpp.c

E:\c\dexmon\deemon\src\tpp\samples\advanced>gdb a.exe
GNU gdb (GDB) (Cygwin 13.2-1) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.exe...
(gdb) run
Starting program: /cygdrive/e/c/dexmon/deemon/src/tpp/samples/advanced/a.exe
[New Thread 11596.0x4884]
[New Thread 11596.0x3f78]
[New Thread 11596.0x4e6c]
[New Thread 11596.0x2c60]
It's working!
token: 806: 'printf'
token: 40: '('
token: 34: '"x = %d"'
token: 44: ','
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 43: '+'
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 258: '=='
token: 48: '104'
token: 41: ')'
token: 59: ';'
[Thread 11596.0x3150 exited with code 0]
[Thread 11596.0x4e6c exited with code 0]
[Thread 11596.0x2c60 exited with code 0]
[Thread 11596.0x4884 exited with code 0]
[Inferior 1 (process 11596) exited normally]
(gdb) quit

Searching for error code C0000374, that's STATUS_HEAP_CORRUPTION, so there is a chance that TPP has a write-past-end-of-allocated-area bug somewhere, and mingw's malloc is just a little stricter than other allocators... (though I've been using tpp in deemon for years, and deemon has been running without problems on lots of systems, so either that's not the issue, or it's in some part of the code that deemon simply doesn't use...)

I'm gonna try to see if I can get mingw working on my machine (used to have troubles with that in the past which is why I'm always just using cygwin), since that's the only difference I'm still seeing here. (although that would be strange since as far as I understand, mingw just uses msvcrt, and running tpp in Visual Studio works for me as well...)

In the meantime, see if you can run the 2 samples I've added to the project. If so, I'd be interested in knowing what's the difference between your test project and the "advanced" sample.

GrieferAtWork commented 7 months ago

Yes: it looks like it is a mingw64 thing! I'm able to replicate (time to debug what's going on here):

E:\c\dexmon\deemon\src\tpp\samples\advanced>"D:\cyg\mingw64\bin\gcc.exe" -o a.exe main.c mywrapper-for-tpp.c

E:\c\dexmon\deemon\src\tpp\samples\advanced>gdb a.exe
GNU gdb (GDB) (Cygwin 13.2-1) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.exe...
(gdb) run
Starting program: /cygdrive/e/c/dexmon/deemon/src/tpp/samples/advanced/a.exe
[New Thread 6644.0x3568]
It's working!
token: 814: 'printf'
token: 40: '('
token: 34: '"x = %d"'
token: 44: ','
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 43: '+'
token: 48: '10'
token: 43: '+'
token: 48: '42'
token: 258: '=='
token: 48: '104'
token: 41: ')'
token: 59: ';'
token: 815: 'IS'
token: 816: 'PREDEFINED'
token: 817: 'awesome'
token: 818: 'Its'
token: 45: '-'
token: 812: 'a'
token: 45: '-'
token: 819: 'builtin'
token: 33: '!'
warning: Critical error detected c0000374

Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffd2450de53 in ntdll!RtlIsZeroMemory () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) quit
A debugging session is active.

    Inferior 1 [process 6644] will be killed.

Quit anyway? (y or n) quit EOF [assumed Y]

GrieferAtWork commented 7 months ago

Fixed in 831d2475a4afbca15e7ee7625896288518b39c56 (and man do I feel dumb now).

There was nothing wrong with tpp itself. It was the sample code that was broken. (urgh...) Problem was that TPPLexer_OpenFile() doesn't return a reference to the file (as can be seen by the fact that said function doesn't have the /*ref*/ comment like e.g. TPPFile_OpenStream() does), but just the file itself (with the reference being lazily cached and owned by the keyword map, which must be done so that #include guards and #pragma once can work correctly).

So by using TPPLexer_PushFileInherited() instead of TPPLexer_PushFile(), there's now 1 reference too few, which then leads to 1 decref->destroy when the lexer stack pops the file, and a second decref (which then writes to already-freed memory) when the keyword map gets destroyed, which the CRT notices the next time it does a heap check (which I'm guessing happens periodically when heap functions get called in mingw).

There's not really anything I can change to catch this sort of bug more easily. There's already an assert in TPPFile_Decref() to make sure that the reference counter isn't 0, so in this double-free example, the first free must have memset the file's reference counter to some non-zero garbage value (which again: got modified, leading to write-after-free)

For reference, here's how frontend.c does its reference management when it comes to pushing files onto the include stack (which does do it correctly): temp

Also: still no idea why this didn't cause any errors with VS or GCC

TLDR: This was a write-after-free bug, caused by me incorrectly documenting how TPPLexer_OpenFile() needs to be used.

asmwarrior commented 7 months ago

Oh, thanks for the quick response and fix.

I will try the latest git version soon.

I tried to use a memory-leak tool to find the bug, but without success.

asmwarrior commented 7 months ago

I can confirm that changing the function call in the sample code in the first comment (https://github.com/GrieferAtWork/tpp/issues/4#issue-2139130702)

from TPPLexer_PushFileInherited(file); to TPPLexer_PushFile(file); fixes the crash.

I'm mainly a C++ developer, so your library is a bit hard to understand for me.

What I plan to do is wrap this loop:

    while (TPPLexer_Yield() > 0) {
        int id        = TPPLexer_Current->l_token.t_id;
        char *tokstr  = TPPLexer_Current->l_token.t_begin;
        size_t toklen = (size_t)(TPPLexer_Current->l_token.t_end - tokstr);
        printf("token: %d: '%.*s'\n", id, (int)toklen, tokstr);
    }

so that each iteration creates a C++ Token object, and I get a Token stream.

Another question is: can I get the line and column information about the l_token in the main while loop? Thanks.

GrieferAtWork commented 7 months ago

Another question is: can I get the line and column information about the l_token in the main while loop? Thanks.

Yes. See #5

asmwarrior commented 7 months ago

I'm not sure how you caught the bug; I tried but failed at first. Later, I did find a way, so I'd like to share some extra information about how to catch this kind of crash bug:

See this comment: https://github.com/ssbssa/heob/issues/31#issuecomment-1951217212

It shows the heob author's method for catching this kind of crash.