erikbern / git-of-theseus

Analyze how a Git repo grows over time
https://erikbern.com/2016/12/05/the-half-life-of-code.html
Apache License 2.0
2.56k stars, 85 forks

Survival going up #16

remram44 closed 7 years ago

remram44 commented 7 years ago

I ran survival_plot.py over ReproZip's 1.0.x branch and got this:

[image: survival_plot]

I imagine the graph going back up might be caused by duplication? (there are several packages in the repository which share some code) It's still surprising 😅

erikbern commented 7 years ago

It's because both the numerator and the denominator change with commit age, and both eventually go to zero. In your case only a few commits have reached 2.5 years of age, whereas a lot more have reached 1 year. So if the denominator goes down faster than the numerator, the ratio goes up.

Hope this explanation makes sense

remram44 commented 7 years ago

No, I can't wrap my head around this 😳

Is the X axis project time or commit age?

erikbern commented 7 years ago

X axis is commit age

erikbern commented 7 years ago

here's a scenario. project has two commits, one from 2000, one from 2015. today is 2016. the first commit is still present in the code base, the second one was reverted in 2016.

first year: 100% because both commits were in the code base
second year: 50% because the second commit was removed
third year: 100% because now only the first commit counts (we don't know anything about the second commit, could be reverted in the future)
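The two-commit scenario above can be reproduced with a small sketch. This is not the code in survival_plot.py; `Commit` and `survival_at_age` are hypothetical names made up for illustration, time is measured in whole years, and each commit is treated as a single unit, which is a simplification that follows the scenario. A commit counts toward a given age only if it has already reached that age today.

```python
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    created: int            # year the commit was made
    removed: Optional[int]  # year its code was removed, or None if still present


def survival_at_age(commits, age, now):
    """Fraction of commits still present at `age`, among those that reached it."""
    # Denominator: commits old enough to have reached this age by `now`.
    observed = [c for c in commits if c.created + age <= now]
    if not observed:
        return None  # no commit is that old yet, so the ratio is undefined
    # Numerator: of those, the ones not yet removed at that age.
    surviving = [
        c for c in observed
        if c.removed is None or c.removed > c.created + age
    ]
    return len(surviving) / len(observed)


# One commit from 2000 that is still present, one from 2015 reverted in 2016.
commits = [Commit(2000, None), Commit(2015, 2016)]
curve = [survival_at_age(commits, age, now=2016) for age in range(3)]
print(curve)  # [1.0, 0.5, 1.0]
```

Ages 0, 1 and 2 correspond to the first, second and third year in the scenario. The curve goes back up at age 2 because the 2015 commit drops out of the denominator: it has not reached that age yet.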

remram44 commented 7 years ago

I see! (commits that age still present)/(commits that age), it makes sense. Thanks for explaining!
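Written out, that ratio might look like the following. The symbols are assumptions introduced here, not notation from the project: $t_c$ is the time commit $c$ was made, $T$ is today, and $a$ is the commit age on the X axis.

```latex
S(a) = \frac{\left|\{\, c : t_c + a \le T,\ c \text{ still present at age } a \,\}\right|}
            {\left|\{\, c : t_c + a \le T \,\}\right|}
```

Both sets shrink as $a$ grows, so $S(a)$ is not guaranteed to be monotonically decreasing.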

erikbern commented 7 years ago

you got it :)