kunaltyagi / nsiqcppstyle

Cpp style checker in python
GNU General Public License v2.0

Incorrect starting/ending tokens for FUNCTION_BLOCK context when initialization list has LBRACE/RBRACE #59

Closed mosherubin closed 1 day ago

mosherubin commented 2 weeks ago

The NsiqCppStyle parser misidentifies the starting and ending LBRACE/RBRACE tokens of a FUNCTION_BLOCK when a member in the constructor's initialization list is itself initialized with an LBRACE/RBRACE pair.

Example of the Problem

Here is a minimal snippet of C++ code showing the problem:

1 // Resides in ~/junk/test_class.min.cpp
2 test_class::test_class(int argc_, char* argv_[]) :
3     argc(argc_),
4     _UTP{},
5     _videoEnabled(0)
6 {
7   _properties = NULL;
8 }

Notice that the constructor's initialization list includes _UTP{}, which uses list (or brace) initialization to initialize the member to its default values. As shown below, NsiqCppStyle mistakes the LBRACE/RBRACE of this list initialization for the braces of the function body.

Running this snippet file through trace-callbacks:

cd <nsiqcppstyle root folder>
/usr/bin/python3.11 trace-callbacks.py ~/junk/test_class.min.cpp > ~/junk/test_class.min.cpp.log

results in the following (you can find the entire output attached to this comment):

======================================================================================
Processing:  /homes/mosheru/junk/test_class.min.cpp
FileStart     (lexer, filename='test_class.min.cpp', dirname='/homes/mosheru/junk')
--------------------------------------------------
Line          (lexer, line='// Resides in ~/junk/test_class.min.cpp', lineNumber=1)
Comment       (lexer, token=LexToken(CPPCOMMENT,'// Resides in ~/junk/test_class.min.cpp',1,1,0, False, None))
--------------------------------------------------
Line          (lexer, line='test_class::test_class(int argc_, char* argv_[]) :', lineNumber=2)
Token         (lexer,
               contextStack (empty),
               token=LexToken(ID,'test_class',2,1,40, False, None))
Token         (lexer,
               contextStack (empty),
               token=LexToken(DOUBLECOLON,'::',2,11,50, False, None))
FunctionName  (lexer,
               fullName='test_class::test_class',
               decl='False',
               contextStack (empty),
                   context='FUNCTION_BLOCK, 'test_class::test_class', LexToken(LBRACE,'{',4,9,116, False, None), LexToken(RBRACE,'}',4,10,117, False, None)')

Although the parser correctly recognizes "test_class::test_class" as a FunctionName, its context shows that it takes the function's LBRACE and RBRACE to be on line 4, at columns 9 and 10. Those are the braces of _UTP{}. The function's actual LBRACE and RBRACE are on lines 6 and 8, respectively, both at column 1.

The parser continues to get things wrong:

Line          (lexer, line='    _UTP{},', lineNumber=4)
Token         (lexer,
               contextStack (empty),
               token=LexToken(ID,'_UTP',4,5,112, False, None))
FunctionScope (lexer, 
               contextStack (1))
                   FUNCTION_BLOCK, 'test_class::test_class', LexToken(LBRACE,'{',4,9,116, False, None), LexToken(RBRACE,'}',4,10,117, False, None)
Token         (lexer,
               contextStack (1),
                   FUNCTION_BLOCK, 'test_class::test_class', LexToken(LBRACE,'{',4,9,116, False, None), LexToken(RBRACE,'}',4,10,117, False, None)
               token=LexToken(LBRACE,'{',4,9,116, False, None))
Token         (lexer,
               contextStack (empty),
               token=LexToken(RBRACE,'}',4,10,117, False, None))
Token         (lexer,
               contextStack (empty),
               token=LexToken(COMMA,',',4,11,118, False, None))

In the end, the parser interprets "_videoEnabled" as a function name and assigns it the LBRACE/RBRACE tokens on lines 6 and 8, which actually belong to test_class::test_class:

--------------------------------------------------
Line          (lexer, line='    _videoEnabled(0)', lineNumber=5)
FunctionName  (lexer,
               fullName='_videoEnabled',
               decl='False',
               contextStack (empty),
                   context='FUNCTION_BLOCK, '_videoEnabled', LexToken(LBRACE,'{',6,1,141, False, None), LexToken(RBRACE,'}',8,1,165, False, None)')
Token         (lexer,
               contextStack (empty),
               token=LexToken(FUNCTION,'_videoEnabled',5,5,124, False, None))
Token         (lexer,
               contextStack (1),
                   PARENBLOCK, '', 137, 139
               token=LexToken(LPAREN,'(',5,18,137, False, None))
Token         (lexer,
               contextStack (1),
                   PARENBLOCK, '', 137, 139
               token=LexToken(NUMBER,'0',5,19,138, False, None))
Token         (lexer,
               contextStack (empty),
               token=LexToken(RPAREN,')',5,20,139, False, None))
--------------------------------------------------
Line          (lexer, line='{', lineNumber=6)
FunctionScope (lexer, 
               contextStack (1))
                   FUNCTION_BLOCK, '_videoEnabled', LexToken(LBRACE,'{',6,1,141, False, None), LexToken(RBRACE,'}',8,1,165, False, None)
Token         (lexer,
               contextStack (1),
                   FUNCTION_BLOCK, '_videoEnabled', LexToken(LBRACE,'{',6,1,141, False, None), LexToken(RBRACE,'}',8,1,165, False, None)
               token=LexToken(LBRACE,'{',6,1,141, False, None))
--------------------------------------------------

Expectations

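The parser should treat an LBRACE/RBRACE pair inside an initialization list as brace initialization, not as a function body. For the snippet above, the FUNCTION_BLOCK context of test_class::test_class should have its LBRACE on line 6 and its RBRACE on line 8, and _videoEnabled should not be reported as a function name.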
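
To make the expected behavior concrete, here is a minimal standalone sketch of one way to locate the body braces. It is not NsiqCppStyle's code and does not use its lexer: tokenize, skip_group and find_body_braces are hypothetical helpers written for this illustration, and the regex tokenizer only covers the tokens in the snippet above. The assumption is that, after the colon, each initializer is a name followed by a balanced (...) or {...} group, so the first LBRACE that is not part of such a group opens the function body.

```python
import re

# Hypothetical mini-tokenizer: just enough for the snippet in this issue.
TOKEN_RE = re.compile(r"//[^\n]*|::|[A-Za-z_]\w*|\d+|[{}()\[\],:;=*&<>]")

CLOSERS = {"(": ")", "{": "}"}


def tokenize(source):
    """Return (text, line, column) tuples, skipping whitespace and // comments."""
    tokens = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            if match.group().startswith("//"):
                continue
            tokens.append((match.group(), line_no, match.start() + 1))
    return tokens


def skip_group(tokens, i):
    """tokens[i] is '(' or '{'; return the index just past its matching closer."""
    opener = tokens[i][0]
    closer = CLOSERS[opener]
    depth = 0
    while i < len(tokens):
        text = tokens[i][0]
        if text == opener:
            depth += 1
        elif text == closer:
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced " + opener)


def find_body_braces(tokens):
    """Return ((line, col), (line, col)) of the body's '{' and '}'.

    Assumes tokens hold exactly one constructor definition.
    """
    i = next(k for k, tok in enumerate(tokens) if tok[0] == "(")
    i = skip_group(tokens, i)              # step past the parameter list
    if tokens[i][0] == ":":                # an initialization list follows
        i += 1
        while True:
            while tokens[i][0] not in CLOSERS:
                i += 1                     # skip the member (or base) name
            i = skip_group(tokens, i)      # skip its (...) or {...} initializer
            if tokens[i][0] != ",":
                break                      # no more initializers
            i += 1
    if tokens[i][0] != "{":
        raise ValueError("expected function body")
    end = skip_group(tokens, i)
    return tokens[i][1:], tokens[end - 1][1:]


SNIPPET = """// Resides in ~/junk/test_class.min.cpp
test_class::test_class(int argc_, char* argv_[]) :
    argc(argc_),
    _UTP{},
    _videoEnabled(0)
{
  _properties = NULL;
}
"""

print(find_body_braces(tokenize(SNIPPET)))  # ((6, 1), (8, 1))
```

The braces of _UTP{} on line 4 are consumed as an initializer group, so the reported body braces are the ones on lines 6 and 8. A real fix would have to apply the same idea inside the existing lexer and also cope with cases this sketch ignores, such as function-try-blocks and preprocessor directives.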

mosherubin commented 2 weeks ago

test_class.min.cpp.log