fzipp / gocyclo

Calculate cyclomatic complexities of functions in Go source code.
BSD 3-Clause "New" or "Revised" License
1.35k stars 80 forks source link

Fix cyclomatic complexity #12

Closed nodirt closed 6 years ago

nodirt commented 8 years ago

https://en.wikipedia.org/wiki/Cyclomatic_complexity has the following definition

The cyclomatic complexity of a section of source code is the number of linearly independent paths within it

The current version of the code does not count linearly independent code paths. Instead it counts branch points. This is different. For example, cyclomatic complexity of

if a {
  if b {
  } else {
  }
} else {
}

and

if a {
} else {
}
if b {
} else {
}

must be different, but current implementation would return same value.

This PR rewrites the computation algorithm.

nodirt commented 8 years ago

Fixes #5

gm42 commented 7 years ago

Merged in my cumulative improvements fork: https://github.com/gm42/gocyclo

tariq1890 commented 7 years ago

@fzipp

ryan0x44 commented 7 years ago

Hey @nodirt, I believe this is incorrect - on the surface thinking about "the number of linearly independent paths" is somewhat ambiguous, but if you try drawing a Control flow graph as mentioned on the page you linked (https://en.wikipedia.org/wiki/Cyclomatic_complexity), you'll see that the two examples you gave produce graphs which have equivalent number of edges and nodes.

For a more concrete demonstration, I've put together a Gist containing a Java Checkstyle Cyclomatic Complexity Example. Provided you have Java installed on your machine, if you download all of those files to one directory and run the run.sh script I provided, it will execute the age-old Checkstyle program against Java examples equivalent to the examples you gave in your PR description. The expected output demonstrates that the way the Checkstyle program calculates Cyclomatic Complexity, both code examples = 3, which is exactly the same values as what I get with your Go examples using the original gocyclo.

nodirt commented 7 years ago

Hey @ryan0x44,

The examples you gave are not equivalent to mine, specifically some of the "if" statements do not have "else" blocks.

Here are the analysis for my examples.

Example 1

Code:

if a {
  if b {} else {}
} else {}

Edges:

  1. start -> a
  2. a -> b
  3. a -> !b
  4. b -> finish
  5. !b -> finish
  6. start -> !a
  7. !a -> finish

Linearly independent paths:

  1. start -> a -> b -> finish
  2. start -> a -> !b -> finish
  3. start -> !a -> finish

Example 2

Code:

if a {} else {}
if b {} else {}

Edges:

  1. start -> a
  2. a -> b
  3. a -> !b
  4. start -> !a
  5. !a -> b
  6. !a -> !b
  7. b -> finish
  8. !b -> finish

Linearly independent paths:

  1. start -> a -> b -> finish
  2. start -> a -> !b -> finish
  3. start -> !a -> b -> finish
  4. start -> !a -> !b -> finish
saniales commented 7 years ago

I think since is searched max complexity it is better to leave old implementation

nodirt commented 7 years ago

Hi @sanieales. I didn’t understand your reasoning.

ryan0x44 commented 7 years ago

Hey @nodirt, they are equivalent, if you add else statements you will get the exact same result - I just omitted them for brevity. The given "Linearly independent paths" examples are not control flow graphs which is what I was suggesting be used to demonstrate, and Cyclomatic complexity isn't calculated by adding the number of linearly independent graphs that way (see the wikipedia page for an explanation and formulas).

gm42 commented 6 years ago

Any conclusion on this? My instinct is that if we look at a similar tool in the Java world hopefully we could find one undiscussed canonical/reference implementation to use here too..

Edit: I was wrong, situation is not that clear in Java world either. After some digging the best I could find is the Metrics Reloaded plugin for IntelliJ, and the relevant code snippet can be studied from here: https://github.com/BasLeijdekkers/MetricsReloaded/blob/master/stockmetrics/src/com/sixrr/stockmetrics/methodCalculators/ComplexityCalculator.java

It seems to be using a method walker approach.

nodirt commented 6 years ago

I've incorrectly assumed cyclomatic complexity is the number of independent execution paths. Wikipedia says:

[cyclomatic complexty] is a lower bound for the number of paths through the control flow graph (CFG).

which implies cyclomatic complexity is only a proxy to the number of code paths.

Java's checkstyle has a separate check NPathsComplexity for this reason

I believe the "number of code paths" metric better represents complexity, e.g. in example2, code readers have to think about the difference between cases (!a && b) and (!a && !b). In example2, they don't have to, there is only one code block for !a, thus the code is simpler. However, this repo is about cyclomatic complexity, so there is nothing to fix.

paulbuis commented 6 years ago

Look at McCabe's original paper on cyclomatic complexity. He showed that counting branch points is enough. Count "if", but not "else" and non-default cases in switch & select. Also short-circuited and/or.

What I'm not clear about is anonymous and inner functions. Do inner functions have separate complexities?