terryyin / lizard

A simple code complexity analyser without caring about the C/C++ header files or Java imports, supports most of the popular languages.
Other
1.81k stars 246 forks source link

fix: Improve C/C++ preprocessing #369

Open jdehaan opened 1 year ago

jdehaan commented 1 year ago

#if 0 leads to that part to be skipped. This is often found in code to disable sections explicitly.

terryyin commented 1 year ago

Hi @jdehaan this is real cool, thanks. I see the new implementation not only pass the old tests but also works for the 4 new scenarios. Great!

The call method is a bit too long and the CCN is quite high, I would say. Do you mind refactor it a bit?

jdehaan commented 1 year ago

Yes @terryyin I am still a bit unsure if that is really a good idea to skip the preprocessor information, it is code anyway maybe dead code on the one or other side of the #if's and #elif's but in the end this could be the beginning of a better or more thorough handling of such "switchable" code parts. To really know which parts are "on" or "off" could be a long term goal. For the moment you are fully right I will try to simplify the code as much as possible to provide a clear and extensible code for further future improvements.

jdehaan commented 1 year ago

@terryyin I simplified by splitting the code into its essential parts so future extensions can be added by not having to tweak a single spaghetti code central method. There is an intrinsic complexity to the problem itself, though. I think I am close to it. Can you please review again and tell me if that is ok or drive me in some direction you want the code the get to.

jdehaan commented 1 year ago

BTW The AppVeyor CI complains about python 2.7, but honestly is that still a valuable topic to be checked? Python3 is there since a very long time and I see no good reason to forcibly have to support 2.7.

LazaroPirnero commented 1 year ago

It also sounds like there's now a basic preprocessing step in place that can skip or take a branch based on certain conditions. For example, the #if 0 statement can be used to disable certain sections of the code explicitly. That can be really useful when you need to test different parts of your code or disable certain features temporarily. Have you had a chance to try out these changes yet?

jdehaan commented 1 year ago

No not yet, I was not aware of that. I saw that there was an extension for a purpose that sounded like a use case I need and noticed it biased the analysis and is a little bit better with this PR. But in the very end I am unsure how to treat this "correctly".

Slowly I recognize there is code we do not care at all about, dead code (#if 0), that sometimes we really also want a holistic report (notwithstanding if that is actually landing into a compiled binary or not)... Also in the case we collect all the metrics there can be function names duplicated which sometimes matter in the later processing steps (after collecting lizard metrics). So having a good way - beside always varying line numbers - to associate the code with some identifier would be a "must-have".

To address all the use cases, after thinking more about it, it would be good to annotate the functions with the preprocessor conditions leading to it to be "included". Special handling for the #elsif parts is needed, negating the #if and all preceeding #elsifs.. This could lead to a list of conditions that could lead to an sha1 hash that can be used as identifier. Lizard could then spit out an annotations file with the hash and full condition and use that sha1inside the report to be able to "recognize" the functions in a better / more stable manner.

Then it could be up to a post-processing done outside of lizard's scope to filter out unwanted things...

This is a tedious task but might be rewarding. But this is nothing I would like to takle on my own before having a discussion with the author's opinion or what are the things foreseen for lizard'S roadmap regarding these aspects...

@terryyin what do you think?

In the meanwhile I still think this PR provided a slightly improved situation but far beyond of what I would consider being an ideal state... :-/

PS: As the tool supports multiple programming language, we might have to consider how the activation/deactivation of lines for a translation unit in C/C++ via preprocessor switches applies maybe to other programming languages as well...