Cyclomatic complexity is calculated wrongly for conjunction operators like &&

tobias-klein commented 8 years ago

Given a function like the following

bool test(bool a, bool b)
{
    return a && b;
}

The cyclomatic complexity should be 1 for a function like that, but lizard calculates 2. Basically logical conjunctions should not be part of the complexity calculation.

terryyin commented 8 years ago

return a && b;

is effectively the same as:

if (a)
    return b;
else
    return false;

Therefore it's CCN should be 2. Do you have any good reason of reference for not counting it?

tobias-klein commented 8 years ago

It's also not how other programs (like SourceMonitor) calculate cyclomatic complexity. The calculated complexity should be representing the complexity of the "control flow graph" taking into account explicit if/then/else blocks, etc. I don't think logical conjunctions are part of that. It would be good if such logical conjunctions could either be included or excluded in the calculation using an option.

terryyin commented 8 years ago

What about

void test(bool a, bool b)
{
      if (a && b) do_something();
}

By McCabe's definition, it should be counted as 3.

https://en.wikipedia.org/wiki/Cyclomatic_complexity#cite_note-mccabe76-3

mehrdad89 commented 8 years ago

@tobias-klein well, you can say that SourceMonitor is wrong in that case. CC should be counted if a loop was interpreted in a given code. @terryyin is right to do so.

tobias-klein commented 8 years ago

Yeah, you're probably right about this. I'm just not used to this kind of calculation based on what I saw before with SourceMonitor. What are other complexity calculation tools doing? Does anybody know? Do they calculate the same result as Lizard in this case?

terryyin commented 8 years ago

As far as I know, there’s very little dispute about &&, || being counted in CCN. More disputed issue is how to count switch/case. I’d like to count the number of “case”s, some like to count the “switch”es, some others like to count “case”s but ignore those without a break.

On 9 Mar 2016, at 6:02 PM, tobias-klein notifications@github.com wrote:

Yeah, you're probably right about this. I'm just not used to this kind of calculation based on what I saw before with SourceMonitor. What are other complexity calculation tools doing? Does anybody know? Do they calculate the same result as Lizard in this case?

— Reply to this email directly or view it on GitHub https://github.com/terryyin/lizard/issues/105#issuecomment-194218471.

mehrdad89 commented 8 years ago

I think I agree with not counting case/switch statements in CC due to the readability of the code and how readability is a much more important factor in maintainability of a code.

terryyin commented 8 years ago

Can you give me an example of switch case?

A truly trivial switch case sometimes can be transformed into a data table, in which there’s no condition at all. If it couldn’t, they could be just as complex as many ifs. E.g. in Python there’s not switch/case, only if, elif, because they are essentially the same.

However I don’t want my “opinion” bias the tool, so I provided an option for the “modified” CCN.

On 9 Mar 2016, at 6:16 PM, Mehrdad Meh notifications@github.com wrote:

I think I agree with not counting case/switch statements in CC due to the readability of the code and how readability is a much more important factor in maintainability of a code.

— Reply to this email directly or view it on GitHub https://github.com/terryyin/lizard/issues/105#issuecomment-194223819.

mehrdad89 commented 8 years ago

Actually, I agree with you that is some cases switch/case statements can be used with the same purpose as if statements. Therefore, it is difficult to know which ones should be considered as one. But, I think experienced developers use switch/case statements when they don't have any other option (or they think it's better to do so) due to the complexity of the requirement of the code. Thus, I agree with not counting the case/switch statement. However, I share the same concern as you do on this issue and I am with the given option in this matter.

bernd-b commented 8 years ago

I just checked the result using the commercial software "Understand" (www.scitools.com). This tool supports three options to calculate the cyclomatic complexity (Cyclomatic, Modified and Strict) and only when using the Strict mode the result is 2, in the two other modes it is 1.

Here's the explanation of the different modes taken from the online help:

Cyclomatic: The average number of independent paths through the functions in this file. The higher this metric the more likely a program unit is to be difficult to test and maintain without error.
Modified: Same as Cyclomatic complexity except that each case statement is not counted; the entire switch statement counts as 1.
Strict: Same as Cyclomatic complexity except that && and || also count as 1.

steffenkoller commented 8 years ago

My understanding was, that compexity is about the code and not the functionality. Therefore it is for sure possible to write the same function in different code with different complexity.

As terryyin demonstrated, there are such options. Nevertheless it is questinable to argue that it is functionally identical to write if (a) return a; else return b;

instead "return (a && b)""
with the conclusion that complexity must be counted the same way as readability of the two cases is quite different.

Finally: What is the usage of a given complexity number? Some statement about "Logical modelling"? Then terryyin was maybe fine. Or readability and maintainablity of code? Then Tobias finding is correct. Somewhere I read, that the idea behind the McCabe calculation is that humans are not able to understand something upon a given complexity. That would lead to the second case. Interesting is, that the definition as such seem to care about control flow graph only. Then we should independent of the coding way come to the same number, but with limited usage regarding an indicater for readable/reusable code.

And what it such a given value worth if there are obviously not even a common understanding about its calculation - several tools come to significant different results?

terryyin commented 8 years ago

@steffenkoller cyclomatic complexity is a measurement about how hard the code can be tested (McCable 1976). From testing's perspective

if (a)
    return b; // the last example was wrong, sorry. Now I fixed it.

and

return (a && b)

are exactly the same.

You are right about that many people claimed that when CCN is too big human cannot easily understand it, including McCabe himself (Arthur, McCabe 1996). The reasoning behind that is it's hard to test (verify) it thoroughly.

Some often used references on this topic:

McCabe, T.J., 1976. A complexity measure. Software Engineering, IEEE Transactions on, (4), pp.308-320.

Arthur H. Watson and Thomas J. McCabe (1996). "Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric" (PDF). NIST Special Publication 500-235.

tobias-klein commented 8 years ago

Regarding the comment of @bernd-b and my earlier suggestion: Wouldn't it make sense to make the more "strict" calculation an option similar to the "modified cyclomatic complexity"?

terryyin commented 8 years ago

@tobias-klein yes, I will create another extension for 'none strict' CCN.

tobias-klein commented 8 years ago

@terryyin That's cool, thanks! :+1:

terryyin commented 8 years ago

I've implemented the "non strict" CCN in an extension (for the fainted hearts:-) with the commit https://github.com/terryyin/lizard/commit/2823730ab1cab2e1efc9c9292e3292a8b5b10494

I will include it in a release within two days.

To run the nonstrict mode

lizard -Enonstrict

I will still need to make all the extensions more visible to the users.

I'm going to close this issue. If you guys want more discussion we can reopen it.

terryyin / lizard

Cyclomatic complexity is calculated wrongly for conjunction operators like && #105