smontanari / code-forensics

A toolset for code analysis and report visualisation

a more useful sum-of-coupling-analysis #39

Open kavhad opened 5 years ago

kavhad commented 5 years ago

The sum of coupling is not very informative when there is a high degree of coupling between a module and its corresponding unit test. Could this be highlighted in some way, so that the sum of coupling also gives extra information based on a given set of file patterns?

For instance if we provide the following patterns: ['src/','test/']

I would expect more informative output, such as the following:

module            | src/* | test/* | total coupling
/src/moduleA.js   | 1     | 41     | 42
...

smontanari commented 5 years ago

I agree that sum of coupling numbers from analysing an entire repository can be biased by the natural coupling between production code and test code. Your suggestion of distributing the data across different sections of the codebase would definitely be more informative. Unfortunately, the sum of coupling analysis, as currently implemented, relies almost entirely on the output of a specific code-maat command, which does not support any partitioning or splitting of the data.

One way to approach this problem would be to implement something similar to the layer grouping feature used in the system evolution analysis: the same analysis (sum of coupling in this case) would be run one or more times, each time including only certain sections of your codebase (which could be defined, as you suggest, through groups of regular expressions). This is definitely possible, and something that's been on my mind for a while. I will try to find the time to look deeper into it and figure out how much work it would take.
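To make the grouping idea concrete, here is a minimal sketch of a per-group breakdown computed directly from commit data. It is not how code-forensics or code-maat work internally: the function `sumOfCouplingByGroup`, the input shape (one array of changed file paths per commit), and the group labels are all hypothetical. It also assumes that the sum of coupling for a file is the number of other files changed alongside it, summed over all commits.

```javascript
// Hypothetical helper, not part of code-forensics or code-maat.
// commits: array of commits, each an array of file paths changed together.
// groups:  object mapping a label to a RegExp (use non-global regexes,
//          since RegExp.prototype.test is stateful with the `g` flag).
function sumOfCouplingByGroup(commits, groups) {
  const labels = Object.keys(groups);
  const result = new Map();

  for (const files of commits) {
    const unique = [...new Set(files)]; // ignore duplicate paths in a commit
    for (const file of unique) {
      if (!result.has(file)) {
        const byGroup = {};
        for (const label of labels) byGroup[label] = 0;
        result.set(file, { byGroup, total: 0 });
      }
      const row = result.get(file);
      for (const other of unique) {
        if (other === file) continue;
        row.total += 1; // assumed definition: one unit per co-changed file
        for (const label of labels) {
          if (groups[label].test(other)) row.byGroup[label] += 1;
        }
      }
    }
  }
  return result;
}

// Made-up sample history, for illustration only.
const commits = [
  ['src/moduleA.js', 'test/moduleA.test.js'],
  ['src/moduleA.js', 'src/moduleB.js', 'test/moduleA.test.js'],
  ['src/moduleB.js'],
];
const groups = { 'src/*': /^src\//, 'test/*': /^test\// };

const result = sumOfCouplingByGroup(commits, groups);
// For src/moduleA.js: src/* = 1, test/* = 2, total = 3
console.log(result.get('src/moduleA.js'));
```

With this shape, a high total that comes mostly from the `test/*` column can be told apart from coupling between production files.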

Meanwhile, what you can do is exclude the paths that identify test code in the repository configuration of your gulpfile.js. That would allow you to run a sum of coupling analysis containing only coupling data between production code files. I know it's not an ideal process, because it requires manual configuration changes every time, but it would give you information more relevant to what you need.
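A sketch of what that workaround might look like in gulpfile.js. The option names `rootPath` and `excludePaths` are written from memory of the code-forensics configuration and should be checked against the project documentation. The paths are placeholders.

```javascript
// gulpfile.js
// Assumption: the `repository` section accepts `rootPath` and `excludePaths`;
// verify both names against the code-forensics configuration docs.
require('code-forensics').configure({
  repository: {
    rootPath: '/path/to/your/repo', // placeholder
    excludePaths: ['test']          // placeholder: wherever your test code lives
  }
});
```

To get the full picture again, the exclusion has to be removed and the analysis re-run, which is the manual step mentioned above.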