markfink / metrics

Produces metrics for Python, C, C++, Go and JavaScript programs (plugins for pylint, pytest-cov, and git available)
MIT License

McCabe summary is a TOTAL per language, should it not be average per file? #9

Open mcallaghan-bsm opened 5 years ago

mcallaghan-bsm commented 5 years ago

When we run metrics on a project with many files, the McCabe (cyclomatic complexity) figure in the summary gives a misleading picture to anyone consuming the metric.

Currently the tool adds up the complexity of every file and shows that SUM. On its own this value is nearly meaningless; it needs at least an average per file, and ideally a histogram, min/max, etc. broken down by function/method.

Reading: https://www.guru99.com/cyclomatic-complexity.html

Consider the following contrived (made-up) result:

/usr/local/bin/pipenv run metrics --format=csv *.py

Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
   14                         Python        3116     816    100 
----- ------------------------------ ----------- ------- ------ 
   14                          Total        3116     816    100 

(SNIP raw per-file output for now)

The TOTAL McCabe is 100, way over the commonly recommended limit of <10. However, if we dig further, we can see that it is actually the sum of the complexities of all 14 files, which works out to roughly 100 / 14 ≈ 7.1 per file.

I don't know whether this was done on purpose, or whether the summary simply inherited the total by default.

I would like to propose that the summary report McCabe as an average complexity per file (total_mccabe / total_files) per language.

I'm not sure whether backwards compatibility is a concern (it might be); if it is, we'd have to introduce an entirely new column instead of changing the existing one.
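To make the proposal concrete, here is a minimal sketch of the calculation. It is not taken from the metrics code base: the function name `average_mccabe_per_language` and the `(language, mccabe)` input pairs are assumptions made up for illustration, and the per-file values are invented so that 14 files sum to 100 as in the summary above.

```python
from collections import defaultdict


def average_mccabe_per_language(file_metrics):
    """Return {language: total_mccabe / total_files}.

    file_metrics is a hypothetical list of (language, mccabe) pairs,
    one pair per analysed file. This is not the tool's real data model.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for language, mccabe in file_metrics:
        totals[language] += mccabe
        counts[language] += 1
    return {lang: totals[lang] / counts[lang] for lang in totals}


# 14 Python files whose complexities sum to 100 (the split is invented).
files = [("Python", 8)] * 12 + [("Python", 2)] * 2
averages = average_mccabe_per_language(files)
print(round(averages["Python"], 2))
```

With these inputs the reported value would be the per-file average rather than the total of 100.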
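If backwards compatibility does matter, one option is to leave the existing McCabe total untouched and append the average as an extra column. The sketch below only imitates the layout of the summary shown above; the `McCabe/File` column name and the `format_summary_row` helper are hypothetical and are not part of the tool.

```python
# Column widths copied from the summary table above, plus one new column.
ROW_FORMAT = "{:>5} {:>30} {:>11} {:>7} {:>6} {:>11}"


def format_summary_row(files, language, sloc, comment, mccabe_total):
    """Format one summary row: existing columns first, the average last."""
    # Guard against division by zero when no files were analysed.
    mccabe_avg = mccabe_total / files if files else 0.0
    return ROW_FORMAT.format(
        files, language, sloc, comment, mccabe_total, "{:.2f}".format(mccabe_avg)
    )


header = ROW_FORMAT.format(
    "Files", "Language", "SLOC", "Comment", "McCabe", "McCabe/File"
)
row = format_summary_row(14, "Python", 3116, 816, 100)
print(header)
print(row)
```

Anything that parses the first five columns today would keep working, since the new value is only ever added at the end of the row.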

(More interesting detail could be pulled out with histograms, min/max, etc., but arguably that's beyond the scope of this issue.)
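For reference, the kind of breakdown mentioned above could look roughly like this. It is a sketch using only the Python standard library; the `mccabe_distribution` helper, the bucket size of 5, and the per-file values are all assumptions chosen for illustration (the values again sum to 100 across 14 files). The same function would work on per-function complexities.

```python
from collections import Counter
from statistics import mean, median


def mccabe_distribution(complexities, bucket_size=5):
    """Summarise a non-empty list of complexity values.

    The histogram maps each bucket's lower bound to the number of values
    in [lower, lower + bucket_size).
    """
    buckets = Counter((c // bucket_size) * bucket_size for c in complexities)
    return {
        "min": min(complexities),
        "max": max(complexities),
        "mean": mean(complexities),
        "median": median(complexities),
        "histogram": dict(sorted(buckets.items())),
    }


# Invented per-file complexities: 14 files, total 100, one clear outlier.
per_file = [1, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8, 10, 12, 33]
stats = mccabe_distribution(per_file)
print(stats)
```

In this invented data set a single file at 33 stands out, which neither the total nor the average would reveal.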

mcallaghan-bsm commented 5 years ago

@markfink would you be open to a PR that fixes this?

mcallaghan-bsm commented 5 years ago

This is roughly what I was thinking (very rough, and only a proof of concept):

https://github.com/mcallaghan-bsm/metrics/blob/enhancement_mccabe_avg/metrics/mccabe_avg.py