etagwerker opened this issue 4 years ago
Interesting. I have a question, though: any idea what this weight number could be and how it would change? As I see it, maybe it can be as simple as:
I think we're dealing with a lack-of-information problem: if we want a single number to capture all the complexities of a file in a snapshot, we naturally can't take churn into account. In that case our skunk score is less precise, or maybe less significant, because of the missing information.
It's more like we have one possible formula if we take the code's history into account and another formula if we don't...
Another way to think of this is by analogy with integrating a function. In math, when you integrate a function over a variable, it's as if you were summing the function's values at many points, ending up with the area under its curve. We could consider that area our skunk score.
I could try to elaborate more, but this would imply a change in how we see the skunk score: it would be the sum of the cost * penalty function over the churn.
What we'd need to define is what churn means for a given snapshot, in order to know whether this makes any sense.
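The "sum over the churn" idea could be sketched like this. This is a hypothetical illustration, not skunk's actual implementation: `integrated_skunk_score` and the per-commit `cost`/`penalty` hashes are assumed inputs, with each commit that touched a file treated as one sample point of the "integral".

```ruby
# Hypothetical sketch of the "integration over churn" analogy:
# treat each commit touching a file as one sample point and sum
# cost * penalty across the file's history. Churn is implicitly
# the number of samples, i.e. the length of the history.
def integrated_skunk_score(history)
  # history: array of { cost:, penalty: } hashes, one per commit
  history.sum { |commit| commit[:cost] * commit[:penalty] }
end

history = [
  { cost: 10.0, penalty: 1.5 },
  { cost: 12.0, penalty: 1.2 },
  { cost: 8.0,  penalty: 1.0 }
]
puts integrated_skunk_score(history) # approximately 37.4
```

A file with no history (empty array) would score 0 here, which shows why the snapshot case still needs a separate definition.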
Context
Initially the SkunkScore was calculated as churn * cost * penalty. This made sense based on the churn vs. complexity idea: https://www.agileconnection.com/article/getting-empirical-about-refactoring
However, I quickly realized that this formula would not work when running
skunk -b master
-- more here: https://www.fastruby.io/blog/code-quality/escaping-the-tar-pit-at-rubyconf.html

So I decided to change the formula to be cost * penalty.
Alternatives
I think a potential solution is to apply a modified weight to churn, so that the formula could look like this:
That way, the formula could work both for a snapshot and for a comparison between two branches.
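Since the exact weighted formula isn't spelled out here, one hypothetical shape it could take is sketched below. The `1 +` term and the `CHURN_WEIGHT` constant are assumptions, not skunk's actual code; the `1 +` keeps the score non-zero for a fresh snapshot where churn is 0, which is what broke the original `churn * cost * penalty` formula.

```ruby
# Hypothetical weighted formula: churn is dampened by a weight so the
# score still works when there is no history (churn == 0).
CHURN_WEIGHT = 0.5 # assumed constant, would need tuning

def weighted_skunk_score(cost:, penalty:, churn: 0)
  cost * penalty * (1 + churn * CHURN_WEIGHT)
end

weighted_skunk_score(cost: 10.0, penalty: 2.0)           # snapshot: 20.0
weighted_skunk_score(cost: 10.0, penalty: 2.0, churn: 4) # with history: 60.0
```

With churn at 0 the formula degrades to plain cost * penalty, so the snapshot case and the branch-comparison case share one definition.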
Test
Testing this should show that removing complexity in a module, git committing, and then running
skunk -b master
produces a lower skunk score.
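That expectation can be sketched as a small self-contained check. The `score` function below is a hypothetical stand-in for the weighted formula (the 0.5 weight is an assumption, not skunk's real internals), and the scenario assumes the refactor adds one commit of churn while lowering cost and clearing the penalty.

```ruby
# Hypothetical check of the expected behavior: after a refactor the
# file gains one commit of churn but loses complexity, and the
# weighted score should still come out lower than before.
def score(cost, penalty, churn, weight: 0.5)
  cost * penalty * (1 + churn * weight)
end

before = score(30.0, 2.0, 4) # complex file, 4 commits of churn
after  = score(10.0, 1.0, 5) # simplified file, one extra commit
raise "refactor should lower the score" unless after < before
```

If the score did not drop despite the lower complexity, that would signal the churn weight is dominating the formula.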