boyter / scc

Sloc, Cloc and Code: scc is a very fast accurate code counter with complexity calculations and COCOMO estimates written in pure Go
MIT License

Wrong language Complexity/Lines calculation when using -wide #412

Open RobertZickler opened 8 months ago

Describe the bug

When the output is generated with --wide, a Complexity/Lines column is added. The per-file values are fine, whether for one file or several. The language-level value, however, appears to be the sum of the per-file Complexity/Lines values, which is not correct.

Example:

─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Language                              Files     Lines   Blanks  Comments     Code Complexity Complexity/Lines
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
C Header                                  3       526       67        71      338         27            44.70
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
~\the\path\to\project\files\header_file_1.h       397       42        43      312         13             4.17
~\the\path\to\project\files\header_file_2.h       107       21        23       63         11            17.46
~\the\path\to\project\files\header_file_3.h        22        4         5       13          3            23.08

The language result seems to be calculated like this:

$$ 4.17\,{Complexity \over Lines} + 17.46 \,{Complexity \over Lines} + 23.08\,{Complexity \over Lines} \approx 44.70\,{Complexity \over Lines} $$

The problem even extends to the total calculation:

─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Language                              Files     Lines   Blanks  Comments     Code Complexity Complexity/Lines
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
C Header                                  3       526       67        71      338         27            44.70
C                                         3     15081     1824      2779    10478       2567           383.42
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Total                                     6     15607     1891      2850    10816       2594           428.12

To Reproduce

scc.exe --no-cocomo --no-size --sort complexity --wide header_file_1.h header_file_2.h header_file_3.h

The error is also produced when run without --by-file

Expected behavior

The per-file Complexity/Lines values should not be added together to produce the language-level Complexity/Lines. Instead, the language-level value should be calculated from the language's summed Complexity and summed Code lines. In the above example, the correct Complexity/Lines for C Header would be:

$$ {27 \,Complexity \over 338 \,(Code)Lines} \times 100 = 7.99 \,{Complexity \over Lines} $$

And for the total result:

$$ {2594 \,Complexity \over 10816 \,(Code)Lines} \times 100 = 23.98 \,{Complexity \over Lines} $$
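
The difference between the two calculations can be sketched in Go. This is only an illustration under stated assumptions, not scc's actual code: the `fileStat` type and the `ratio` and `sumOfRatios` functions are hypothetical names, and the factor of 100 is inferred from the per-file values above (13 / 312 is shown as 4.17). The language and total rows (Complexity 27 with Code 338, and Complexity 2594 with Code 10816) are used as reported rather than re-summed from the per-file rows, whose Code values add up to 388.

```go
package main

import "fmt"

// fileStat is a hypothetical stand-in for one per-file row of the report.
type fileStat struct {
	Name       string
	Code       int
	Complexity int
}

// ratio returns complexity per 100 lines of code. The factor of 100 is an
// assumption inferred from the per-file values in the report.
// It returns 0 when there is no code, to avoid dividing by zero.
func ratio(complexity, code int) float64 {
	if code == 0 {
		return 0
	}
	return float64(complexity) / float64(code) * 100
}

// sumOfRatios mimics the reported behaviour: per-file ratios are added up.
func sumOfRatios(files []fileStat) float64 {
	total := 0.0
	for _, f := range files {
		total += ratio(f.Complexity, f.Code)
	}
	return total
}

func main() {
	files := []fileStat{
		{Name: "header_file_1.h", Code: 312, Complexity: 13},
		{Name: "header_file_2.h", Code: 63, Complexity: 11},
		{Name: "header_file_3.h", Code: 13, Complexity: 3},
	}

	// Reported behaviour: matches the 44.70 shown in the C Header row.
	fmt.Printf("sum of ratios:        %.2f\n", sumOfRatios(files)) // 44.70

	// Expected behaviour: one division on the already-summed columns.
	fmt.Printf("C Header (27/338):    %.2f\n", ratio(27, 338))     // 7.99
	fmt.Printf("Total (2594/10816):   %.2f\n", ratio(2594, 10816)) // 23.98
}
```

Dividing the summed columns keeps the language and total rows on the same scale as the per-file rows, whereas a sum of ratios grows with the number of files.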

Alternatively, if the described output is by design, the column heading Complexity/Lines is misleading for the language and total rows. I could not find any documentation explaining why the sum of the per-file Complexity/Lines values should equal the language or total Complexity/Lines.

One more thing: It would be nice to have the option to sort the files by Complexity/Lines. Current options are:

-s, --sort string     column to sort by [files, name, lines, blanks, code, comments, complexity] (default "files")
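
As a rough sketch of the requested ordering, the files could be sorted by their Complexity/Lines value in descending order. The `fileRow` type and the `density` and `sortByDensity` functions are hypothetical names chosen for this illustration. They are not part of scc, and no such sort key exists in the option list above.

```go
package main

import (
	"fmt"
	"sort"
)

// fileRow is a hypothetical stand-in for one per-file row of the report.
type fileRow struct {
	Name       string
	Code       int
	Complexity int
}

// density returns complexity per 100 lines of code (assumed scale),
// or 0 when the file has no code lines.
func density(r fileRow) float64 {
	if r.Code == 0 {
		return 0
	}
	return float64(r.Complexity) / float64(r.Code) * 100
}

// sortByDensity orders rows from highest to lowest Complexity/Lines.
// SliceStable keeps the original order for rows with equal density.
func sortByDensity(rows []fileRow) {
	sort.SliceStable(rows, func(i, j int) bool {
		return density(rows[i]) > density(rows[j])
	})
}

func main() {
	rows := []fileRow{
		{Name: "header_file_1.h", Code: 312, Complexity: 13},
		{Name: "header_file_2.h", Code: 63, Complexity: 11},
		{Name: "header_file_3.h", Code: 13, Complexity: 3},
	}
	sortByDensity(rows)
	for _, r := range rows {
		fmt.Printf("%-16s %6.2f\n", r.Name, density(r))
	}
}
```

With the three header files from the example, this would list header_file_3.h first and header_file_1.h last.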

Desktop: