Open ghost opened 1 year ago
Source of the lexer:
lexer grammar SlakeLexer;
COMMA: ',';
QUESTION: '?';
COLON: ':';
SEMICOLON: ';';
LBRACKET: '[';
RBRACKET: ']';
LBRACE: '{';
RBRACE: '}';
LPARENTHESE: '(';
RPARENTHESE: ')';
AT: '@';
DOT: '.';
VARARG: '...';
OP_ADD: '+';
OP_SUB: '-';
OP_MUL: '*';
OP_DIV: '/';
OP_MOD: '%';
OP_AND: '&';
OP_OR: '|';
OP_XOR: '^';
OP_NOT: '!';
OP_REV: '~';
OP_ASSIGN: '=';
OP_ASSIGN_ADD: '+=';
OP_ASSIGN_SUB: '-=';
OP_ASSIGN_MUL: '*=';
OP_ASSIGN_DIV: '/=';
OP_ASSIGN_MOD: '%=';
OP_ASSIGN_AND: '&=';
OP_ASSIGN_OR: '|=';
OP_ASSIGN_XOR: '^=';
OP_ASSIGN_REV: '~=';
OP_ASSIGN_LSH: '<<=';
OP_ASSIGN_RSH: '>>=';
OP_SWAP: '<=>';
OP_EQ: '==';
OP_NEQ: '!=';
OP_STRICTEQ: '===';
OP_STRICTNEQ: '!==';
OP_LSH: '<<';
OP_RSH: '>>';
OP_LT: '<';
OP_GT: '>';
OP_LTEQ: '<=';
OP_GTEQ: '>=';
OP_LAND: '&&';
OP_LOR: '||';
OP_INC: '++';
OP_DEC: '--';
OP_MATCH: '=>';
OP_WRAP: '->';
OP_SCOPE: '::';
OP_DOLLAR: '$';
KW_ASYNC: 'async';
KW_AWAIT: 'await';
KW_BASE: 'base';
KW_BREAK: 'break';
KW_CASE: 'case';
KW_CATCH: 'catch';
KW_CLASS: 'class';
KW_CONST: 'const';
KW_CONTINUE: 'continue';
KW_DELETE: 'delete';
KW_DEFAULT: 'default';
KW_ELIF: 'elif';
KW_ELSE: 'else';
KW_ENUM: 'enum';
KW_FALSE: 'false';
KW_FN: 'fn';
KW_FOR: 'for';
KW_FINAL: 'final';
KW_FINALLY: 'finally';
KW_IF: 'if';
KW_MODULE: 'module';
KW_NATIVE: 'native';
KW_NEW: 'new';
KW_NULL: 'null';
KW_OVERRIDE: 'override';
KW_OPERATOR: 'operator';
KW_PUB: 'pub';
KW_RETURN: 'return';
KW_STATIC: 'static';
KW_STRUCT: 'struct';
KW_SWITCH: 'switch';
KW_THIS: 'this';
KW_THROW: 'throw';
KW_TIMES: 'times';
KW_TRAIT: 'trait';
KW_TYPEOF: 'typeof';
KW_INTERFACE: 'interface';
KW_TRUE: 'true';
KW_TRY: 'try';
KW_USING: 'using';
KW_VAR: 'var';
KW_WHILE: 'while';
KW_YIELD: 'yield';
TN_I8: 'i8';
TN_I16: 'i16';
TN_I32: 'i32';
TN_I64: 'i64';
TN_ISIZE: 'isize';
TN_U8: 'u8';
TN_U16: 'u16';
TN_U32: 'u32';
TN_U64: 'u64';
TN_USIZE: 'usize';
TN_F32: 'f32';
TN_F64: 'f64';
TN_STRING: 'string';
TN_BOOL: 'bool';
TN_AUTO: 'auto';
TN_VOID: 'void';
TN_ANY: 'any';
L_INT: '0b' [01]+ | '0' [0-9]* | '0x' [0-9]+ | [1-9] [0-9]*;
L_UINT: L_INT [uU];
L_LONG: L_INT [lL];
L_ULONG: L_INT ( [uU][lL] | [lL][uU]);
L_F32: L_F64 [fF];
L_F64: [0-9]+ '.' ([0-9]+)?;
L_STRING: '"' CharSequence? '"';
L_RAWSTRING: '"""' (.)*? '"""';
ID: [a-zA-Z_][a-zA-Z0-9_]*;
fragment CharSequence: Char+;
fragment Char: StringEscape | ~["\\\r\n];
fragment StringEscape: SimpleEscape | OctEscape | HexEscape;
fragment SimpleEscape: '\\' [\\"rnt0];
fragment OctEscape: '\\' OctDigit OctDigit OctDigit;
fragment HexEscape: '\\' HexDigit HexDigit;
fragment OctDigit: [0-7];
fragment HexDigit: [0-9a-fA-F];
WHITESPACE: [ \t\r\n]+ -> skip;
COMMENT_BLK: '/*' .*? '*/' -> skip;
COMMENT_LINE: '//' ~ [\r\n]* -> skip;
and content of the input file:
class Base {
    pub i32 data = 0;
    operator new(i32 a) {
        println("Base Constructed");
    }
    operator delete() {
        println("Base Destructed");
    }
}
class Derived(@Base) {
    pub i32 data = 0;
    operator new(i32 a) {
        base.new(a * 2);
        println("Derived Constructed");
    }
    operator delete() {
        println("Derived Destructed");
    }
    pub void printMembers() {
        println("Base data: ", base.data);
        println("Derived data: ", data);
    }
}
pub i32 main() {
    @Base a = new @Base(123);
    return ++a.data;
}
(The parser does not affect the result, so its source is not provided.)
I have located where the problem originates (using the demo in runtime/Cpp/demo, https://github.com/antlr/antlr4/tree/dev/runtime/Cpp/demo).
According to the log (the complete log file is at https://github.com/antlr/antlr4/files/11779486/dumped_leaks.log), blocks allocated by code at the following locations were not released correctly and caused memory leaks:
runtime/Cpp/runtime/src/atn/LexerATNSimulator.cpp(192)
runtime/Cpp/runtime/src/atn/LexerATNSimulator.cpp(295)
runtime/Cpp/runtime/src/atn/LexerATNSimulator.cpp(536)
runtime/Cpp/runtime/src/atn/ParserATNSimulator.cpp(299)
runtime/Cpp/runtime/src/atn/ParserATNSimulator.cpp(465)
runtime/Cpp/runtime/src/atn/ParserATNSimulator.cpp(531)
runtime/Cpp/runtime/src/atn/ParserATNSimulator.cpp(618)
runtime/Cpp/runtime/src/atn/ParserATNSimulator.cpp(636)
runtime/Cpp/runtime/src/dfa/DFA.cpp(29)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(179)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(182)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(185)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(188)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(191)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(194)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(197)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(200)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(203)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(206)
runtime/Cpp/runtime/src/atn/ATNDeserializer.cpp(212)
runtime/Cpp/runtime/src/atn/ATNDeserializationOptions.cpp(17)
runtime/Cpp/runtime/src/atn/LexerMoreAction.cpp(16)
runtime/Cpp/runtime/src/atn/LexerSkipAction.cpp(16)
runtime/Cpp/runtime/src/atn/LexerPopModeAction.cpp(16)
I currently have no idea how to fix it.
I think that you are just seeing the lexer deserialize the tables it uses internally for the DFA. This is allocated once for a lexer instance and I guess that it just isn't explicitly released.
It's not a leak in the sense that it will keep growing, though I suppose in the purest sense, it should be released explicitly. Maybe there is a call in there somewhere that will do that, but probably not.
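The distinction made above (allocated once and never freed, as opposed to growing with use) can be sketched as follows. This is only an illustration: `Tables`, `sharedTables`, `useManyLexers`, and `allocationCount` are hypothetical names, not part of the ANTLR runtime, and the sketch assumes the tables are shared process-wide.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the deserialized tables a lexer uses internally.
struct Tables {
    std::vector<int> transitions;
};

// Counts how many times the tables were allocated.
static int allocationCount = 0;

// Allocates the tables on first use and never frees them.
// A leak checker reports the block at exit, but it is allocated only once.
const Tables& sharedTables() {
    static const Tables* const tables = [] {
        ++allocationCount;
        return new Tables{std::vector<int>(256, 0)};
    }();
    return *tables;
}

// Simulates many lexer runs that all read the same shared tables.
std::size_t useManyLexers(int count) {
    std::size_t total = 0;
    for (int i = 0; i < count; ++i) {
        total += sharedTables().transitions.size();
    }
    return total;
}
```

However many times `useManyLexers` is called, `allocationCount` stays at 1, so the reported block does not grow over the lifetime of the process.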
I found that the static data of the lexer and parser (where the DFA caches are also stored) is never released after it is allocated in the xxxInitialize functions (in the generated source files of the lexer and parser).
So I tried using unique_ptr instead of a raw pointer for it (by modifying the codegen template), and most of the leak reports disappeared.
Detected memory leaks!
Dumping objects ->
C:\Users\Pyxherb\Desktop\antlr4\runtime\Cpp\runtime\src\atn\ATNDeserializationOptions.cpp(17) : {434} normal block at 0x000002891574C2C0, 3 bytes long.
Data: < > 00 01 00
Object dump complete.
Now I think most of the reports were caused by unreleased static data.
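The raw-pointer versus unique_ptr difference described above can be sketched roughly like this. It is a minimal illustration, not the generated ANTLR code: `StaticData`, `initializeRaw`, and `initializeOwned` are hypothetical names standing in for the generated static data and the xxxInitialize functions, and the sketch ignores thread safety.

```cpp
#include <memory>

// Hypothetical stand-in for the static data a generated lexer or parser keeps.
struct StaticData {
    static int liveCount;  // number of instances currently alive
    StaticData() { ++liveCount; }
    ~StaticData() { --liveCount; }
};
int StaticData::liveCount = 0;

// Before: a raw owning pointer. Nothing deletes it, so the block is still
// allocated at exit and a leak checker reports it.
StaticData* rawStaticData = nullptr;

void initializeRaw() {
    if (!rawStaticData) {
        rawStaticData = new StaticData();
    }
}

// After: a unique_ptr. Its destructor runs during static destruction and
// frees the data before the leak report is produced.
std::unique_ptr<StaticData> ownedStaticData;

void initializeOwned() {
    if (!ownedStaticData) {
        ownedStaticData.reset(new StaticData());
    }
}
```

With the owning pointer, `StaticData::liveCount` drops back to 0 once `ownedStaticData` is reset or destroyed, whereas the raw pointer keeps its instance alive until the process ends.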
This needs to be modified! I'd like you to revise and submit it. Thanks.
ATNDeserializationOptions.cpp
const ATNDeserializationOptions& ATNDeserializationOptions::getDefaultOptions() {
  static const ATNDeserializationOptions* const defaultOptions = new ATNDeserializationOptions();
  return *defaultOptions;
}
Fixed memory leaks in ATNDeserializationOptions.
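The thread does not show how the leak was fixed. One common way to avoid this kind of report, sketched here under that caveat, is to hold the default object by value in a function-local static instead of allocating it with `new`. `Options` is a hypothetical stand-in for ATNDeserializationOptions, reduced to a single flag.

```cpp
// Hypothetical stand-in for ATNDeserializationOptions, reduced to one flag.
class Options {
public:
    bool verifyATN() const { return verifyATN_; }

private:
    bool verifyATN_ = true;
};

// The default object is constructed on the first call and destroyed
// automatically at program exit, so no heap block is left for the
// leak checker to report.
const Options& getDefaultOptions() {
    static const Options defaultOptions{};
    return defaultOptions;
}
```

The trade-off is that the object is destroyed during static destruction, so this is only safe if nothing reads the default options after that point. A never-deleted pointer is a common way to sidestep that ordering question, which may be why the original code used one.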
I'm developing a compiler for my language in Visual Studio 2022 with ANTLR 4.13.0 (previously Flex & Bison), and the CRT reports memory leaks, without allocation source information, after the compiler exits.
Partial output:
I tested each statement to make sure there are no memory leaks in my own code, and found that the leaks appear after the lexer is initialized:
It seems that the lexer does not release its resources properly on destruction. A similar problem is mentioned in issue #4099.
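For reference, a CRT leak report like the one mentioned above is typically enabled along these lines in an MSVC debug build. This is a sketch, not the reporter's actual setup: `enableLeakReport` is a hypothetical helper name, and the guards make it a no-op on other compilers or in release builds.

```cpp
#ifdef _MSC_VER
// Must be defined before including the CRT headers so that malloc-family
// allocations are recorded with file and line information.
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
#endif

// Hypothetical helper: asks the debug CRT to dump leaks automatically at exit.
// Returns true only when the debug heap is actually available.
bool enableLeakReport() {
#if defined(_MSC_VER) && defined(_DEBUG)
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
    return true;
#else
    return false;
#endif
}
```

Calling `enableLeakReport()` at the start of `main` produces the "Detected memory leaks!" dump at exit. Blocks allocated with plain `new` only carry file and line information if the allocating code uses the debug form of `new`, which is one possible reason a report lacks source locations.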