erikbern / git-of-theseus

Analyze how a Git repo grows over time
https://erikbern.com/2016/12/05/the-half-life-of-code.html
Apache License 2.0
2.56k stars 85 forks

Looks like it doesn't scale well #71

Closed by besfahbod 4 years ago

besfahbod commented 4 years ago

TL;DR:

 43% (11870586 of 27210192) |###############                    | Elapsed Time: 7 days, 19:07:58 ETA:  40 days, 7:15:53

Also, no partial results are available after stopping the process, so it will start over on the next run. (So I don't think I'm going to try it again for now...)

Please let me know if I can provide more info that can help with improving scalability.

maersdal commented 4 years ago

I'm looking at the code, and as far as I can tell it currently analyzes every commit. Maybe it would be faster to sample a subset of the commits instead of checking each one?

I mean, for a plot spanning 5 years, I guess it would look OK with one sample every week or so...
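
A minimal sketch of that sampling idea, in Python since the project is Python. This is not the tool's actual implementation: `sample_commits`, the `min_gap` parameter, and the toy history are all hypothetical names made up for illustration. It assumes the commits are already sorted oldest first and keeps only those at least one week after the previously kept commit.

```python
from datetime import datetime, timedelta, timezone


def sample_commits(commits, min_gap=timedelta(weeks=1)):
    """Return the SHAs of commits spaced at least `min_gap` apart.

    `commits` is an iterable of (sha, datetime) pairs sorted oldest first.
    Hypothetical helper for illustration, not part of git-of-theseus.
    """
    kept = []
    last_kept_time = None
    for sha, when in commits:
        # Always keep the first commit, then skip anything too close
        # to the most recently kept one.
        if last_kept_time is None or when - last_kept_time >= min_gap:
            kept.append(sha)
            last_kept_time = when
    return kept


# Made-up history: one commit per day for 30 days.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
history = [(f"c{day:02d}", start + timedelta(days=day)) for day in range(30)]

print(sample_commits(history))  # ['c00', 'c07', 'c14', 'c21', 'c28']
```

With one commit per day, a one-week gap cuts the work to roughly a seventh, and a 5-year plot would still get about 260 data points.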

Only thinking out loud here :)

erikbern commented 4 years ago

This is supported using the --interval flag: https://github.com/erikbern/git-of-theseus/blob/master/git_of_theseus/analyze.py#L194

maersdal commented 4 years ago

Ooo. Another example of a time I did not RTFM. Thanks :)

erikbern commented 4 years ago

np, it's pretty poorly documented!