kunaltyagi / nsiqcppstyle

Cpp style checker in python
GNU General Public License v2.0

Provide a tool to show which callback functions are invoked for given source code #34

Closed mosherubin closed 2 years ago

mosherubin commented 2 years ago

The request

NsiqCppStyle should provide a convenient and easy-to-use tool to show precisely which rule callback functions will be invoked for a given sequence of source code.

Background

Writing an NsiqCppStyle rule requires understanding which callback functions the engine invokes as it processes a given sequence of source code. Currently, working out which callback functions to use can be tedious. In addition, a rule writer may not be aware of which callback functions are actually invoked, and so may miss a more elegant way of implementing a rule.

The request is to write a tool that, given a snippet of source code, shows the rule writer the precise set of callback functions invoked, and their order. Each callback function displayed should provide important information such as the token type and value, any existing context stack, and other valuable data passed via the callback function.

The tool is written in the style of an NsiqCppStyle rule file, hooking into the callback function system.

This tool is similar to Clang's -ast-dump command-line option.
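To make the idea concrete, here is a minimal sketch of a tracing hook of this kind. It does not use NsiqCppStyle's actual rule manager: `CallbackRegistry`, `Token`, `make_tracer`, and `trace_all` are hypothetical names invented for illustration, and the dispatch calls only simulate what an engine might do for one line of source.

```python
from collections import namedtuple

# Hypothetical stand-in for a lexer token; the field names follow the
# example output in this issue, not a confirmed NsiqCppStyle class.
Token = namedtuple("Token", "type value lineno column lexpos")


class CallbackRegistry:
    """Hypothetical stand-in for the engine's rule manager (not the real API)."""

    def __init__(self):
        self._callbacks = {}

    def add(self, kind, callback):
        # Register a callback under a callback kind such as "LineRule".
        self._callbacks.setdefault(kind, []).append(callback)

    def dispatch(self, kind, **kwargs):
        # Invoke every callback registered for this kind, in registration order.
        for callback in self._callbacks.get(kind, []):
            callback(**kwargs)


def make_tracer(kind, trace):
    """Return a callback that appends one line per invocation to `trace`."""

    def tracer(**kwargs):
        args = ", ".join(f"{name}={value!r}" for name, value in kwargs.items())
        trace.append(f"{kind}({args})")

    return tracer


def trace_all(registry, kinds):
    """Hook a tracer into every listed callback kind and return the shared trace."""
    trace = []
    for kind in kinds:
        registry.add(kind, make_tracer(kind, trace))
    return trace


registry = CallbackRegistry()
trace = trace_all(registry, ["LineRule", "Rule"])

# Simulate what an engine might do for the line `return 0;`.
registry.dispatch("LineRule", line="return 0;", lineNumber=7)
registry.dispatch("Rule", token=Token("RETURN", "return", 7, 5, 98))

for entry in trace:
    print(entry)
```

Because every tracer appends to the same list, the trace preserves the order in which the callbacks fired, which is the main thing a rule writer needs to see.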

Example

Suppose we want to see what callback functions NsiqCppStyle will invoke for the following C++ source code:

// Your First C++ Program

#include <iostream>

int main() {
    std::cout << "Hello World!";
    return 0;
}

Here is what the proposed tool's output might look like:

D:\nsiqcppstyle>python nsiqcppstyle.py -f rules\filefilter.txt "D:\Junk\NsiqCppStyle\hello-world.cpp"
nsiqcppstyle: N'SIQ Cpp Style ver 0.3.0.1

======================================================================================
=  Analyzing hello-world.cpp
======================================================================================
  -  RULE_50_99_A_test_nsiqcppstyle_callbacks is applied.
======================================================================================
LexToken contains seven fields: (type, value, lineno, column, lexpos, inactive, pp)

SessionStartRule()
Filter Scope "default" is applied.
Current Filter Setting (Following is applied sequentially)
  1. \.cvs\ is excluded
  2. \.svn\ is excluded

Current File extension and Language Settings
  C/C++=cpp,mm,cc,c,m,h,hpp,cxx,hxx,hh

======================================================================================
Processing:  D:\Junk\NsiqCppStyle\hello-world.cpp
FileStartRule(lexer, filename='hello-world.cpp', dirname='D:\Junk\NsiqCppStyle')
--------------------------------------------------
LineRule(lexer, line='// Your First C++ Program', lineNumber=1)
CommentRule(lexer, token=LexToken(CPPCOMMENT,'// Your First C++ Program',1,1,0, False, None))
--------------------------------------------------
LineRule(lexer, line='#include <iostream>', lineNumber=3)
PreprocessRule(lexer,
               contextStack,
               token=LexToken(PREPROCESSOR,'#include',3,1,27, False, True))
PreprocessRule(lexer,
               contextStack,
               token=LexToken(LT,'<',3,10,36, False, True))
PreprocessRule(lexer,
               contextStack,
               token=LexToken(ID,'iostream',3,11,37, False, True))
PreprocessRule(lexer,
               contextStack,
               token=LexToken(GT,'>',3,19,45, False, True))
--------------------------------------------------
LineRule(lexer, line='int main() {', lineNumber=5)
Rule(lexer,
     contextStack,
     token=LexToken(INT,'int',5,1,48, False, None))
FunctionNameRule(lexer,
                 fullName='main',
                 decl='False',
                 contextStack,
                 context='FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)')
Rule(lexer,
     contextStack,
     token=LexToken(FUNCTION,'main',5,5,52, False, None))
Rule(lexer,
     contextStack,
     token=LexToken(LPAREN,'(',5,9,56, False, None))
     Context stack:
         PARENBLOCK, , 56, 57
Rule(lexer,
     contextStack,
     token=LexToken(RPAREN,')',5,10,57, False, None))
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(LBRACE,'{',5,12,59, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
--------------------------------------------------
LineRule(lexer, line='    std::cout << "Hello World!";', lineNumber=6)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(ID,'std',6,5,65, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(DOUBLECOLON,'::',6,8,68, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(ID,'cout',6,10,70, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(LSHIFT,'<<',6,15,75, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(STRING,'"Hello World!"',6,18,78, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(SEMI,';',6,32,92, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
--------------------------------------------------
LineRule(lexer, line='    return 0;', lineNumber=7)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(RETURN,'return',7,5,98, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(NUMBER,'0',7,12,105, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
FunctionScopeRule(lexer, contextStack)
                  Context stack:
                      FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
Rule(lexer,
     contextStack,
     token=LexToken(SEMI,';',7,13,106, False, None))
     Context stack:
         FUNCTION_BLOCK, main, LexToken(LBRACE,'{',5,12,59, False, None), LexToken(RBRACE,'}',8,1,108, False, None)
--------------------------------------------------
LineRule(lexer, line='}', lineNumber=8)
Rule(lexer,
     contextStack,
     token=LexToken(RBRACE,'}',8,1,108, False, None))
FileEndRule(lexer, filename='hello-world.cpp', dirname='D:\Junk\NsiqCppStyle')
ProjectRule(targetName='D:\Junk\NsiqCppStyle\hello-world.cpp')

=================================== Summary Report ===================================
 ** Total Available Rules     : 54
 ** Total Applied Rules       : 1
 ** Total Violated Rules      : 0
 ** Total Errors Occurs       : 0
 ** Total Analyzed Files      : 1
 ** Total Violated Files Count: 0
 ** Build Quality             : 100.00%

================================ Violated Rule Details ===============================

================================ Violated File Details ===============================
SessionEndRule()

A few comments:

Points to Consider

mosherubin commented 2 years ago

The implementation of this issue is PR #41.

mosherubin commented 2 years ago

Kunal Tyagi's email response (4 July 2022):

To be honest, I've been looking at how to handle the tool as a kind of parser with a bunch of different callbacks that call the logging module.

This would allow users to basically decorate their callbacks and have a verbose output allowing them to debug, control the verbosity, output location and style as well as run their custom rulesets.

Sadly, I've not been able to focus fully on this. The design I've in mind has a few too many moving parts, and I've been a bit busy in general. The customization points are:

  • logging format (what details to log)
  • logging location (stdout, file)
  • logging style (color, no color, etc.)
  • what to log (how detailed)
  • when to log (pre rule, post rule)
  • how to let users add info to logs (for future reference or debugging where pdb is more complicated than a simple print)

The other questions are mere nomenclature once the big ones are satisfied.
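As a rough illustration of the decorator idea described in this email, the sketch below wraps a callback with Python's standard `logging` module. It is only one assumed shape for such a design, not the design Kunal has in mind: `logged_callback`, its `when` parameter, the `callback_trace` logger name, and the `line_rule` function are all hypothetical. Format and location are left to ordinary `logging` handlers and formatters, verbosity to log levels, and pre/post logging to the `when` argument.

```python
import functools
import io
import logging


def logged_callback(logger, level=logging.DEBUG, when=("pre",)):
    """Wrap a rule callback so each call is reported through `logger`."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if "pre" in when:
                logger.log(level, "enter %s args=%r kwargs=%r",
                           func.__name__, args, kwargs)
            result = func(*args, **kwargs)
            if "post" in when:
                logger.log(level, "leave %s -> %r", func.__name__, result)
            return result

        return wrapper

    return decorate


# Logging location and format are ordinary handler/formatter settings.
stream = io.StringIO()  # could equally be sys.stdout or a logging.FileHandler
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("callback_trace")  # hypothetical logger name
logger.setLevel(logging.DEBUG)  # verbosity is controlled by the level
logger.addHandler(handler)
logger.propagate = False


@logged_callback(logger, when=("pre", "post"))
def line_rule(line, line_number):
    # A rule can add its own details to the same log.
    logger.info("line %d has %d characters", line_number, len(line))
    return len(line)


line_rule("return 0;", 7)
print(stream.getvalue(), end="")
```

Raising the logger's level to `logging.INFO` would silence the enter/leave lines while keeping the rule's own message, which is one way the "how detailed" and "when to log" points could be kept independent.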

My response (4 July 2022):

Thank you for the PR code review comments re PR #35 - I hope to get to them over the next day or two.

No doubt about it, you are a real toolsmith and software designer! I would be happy to help out with some of the work related to overhauling the callback system, if I can fully understand your advances and elegant design. As you yourself intimate, it will be some time before the internal design can be implemented. In the meantime, please put your ideas in an issue so I can read, comment, and understand what you have in mind. I would be more than happy to help out here.

Until that time, there is a great need for a "-ast-dump"-like option which I would like to push. The questions about locations and names are important ones in the meantime. When your callback design is implemented, we can always change the tool. Can you give me guidelines on how to push the tool code in the meantime?