Open boyter opened 5 years ago
Thank you for this issue! I do want to update the comparison document with the latest version of all the tools. However, I also want to change how they are compared, as a simple "How long did it take for this program to finish?" is not an accurate measurement of these tools, since most of the time difference between the programs comes down to one counting more files than the other.
Right now I'm thinking of replacing that metric with two different measurements.
I wrote an artificial test a while ago for scc that was designed to test how each tool performs over different directory types, with the same file in each one, which every tool counted in exactly the same way. You can find details at https://boyter.org/posts/sloc-cloc-code-performance/ under the heading “A Fair Benchmark”.
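To make the idea concrete, here is a minimal sketch of how such a test corpus could be generated. It is not the script used in the linked post; the function names, the sample file, and the "wide" versus "deep" layouts are all assumptions chosen for illustration. The only property it tries to preserve is that every directory type holds identical copies of one file, so each tool should report the same counts.

```python
import os

# Hypothetical sample file: every copy is identical, so each tool
# should count it the same way regardless of directory layout.
SAMPLE = "// sample\nint main(void) {\n    return 0;\n}\n"


def write_copies(path, count, content):
    """Write `count` identical copies of `content` into `path`."""
    for i in range(count):
        with open(os.path.join(path, f"file{i}.c"), "w") as fh:
            fh.write(content)


def make_wide(root, dirs, files_per_dir, content=SAMPLE):
    """Create `dirs` sibling directories, each holding `files_per_dir` copies."""
    for d in range(dirs):
        path = os.path.join(root, f"dir{d}")
        os.makedirs(path, exist_ok=True)
        write_copies(path, files_per_dir, content)


def make_deep(root, depth, files_per_dir, content=SAMPLE):
    """Create one chain of `depth` nested directories with copies at every level."""
    path = root
    for d in range(depth):
        path = os.path.join(path, f"level{d}")
        os.makedirs(path, exist_ok=True)
        write_copies(path, files_per_dir, content)


def count_files(root):
    """Return the number of files under `root`, for sanity-checking a layout."""
    return sum(len(files) for _, _, files in os.walk(root))
```

Each tool would then be timed against each generated directory separately, and a run would only be compared if the tools agree on the file and line counts.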
I’d love to work with you to get something that seems fair across all tools, so the results are read the same way by everyone and are not as open to interpretation. If both tokei and scc did this, I believe all other tools could follow and we would have a real baseline to work from. Based on the number of tests I have done, I could, for example, craft benchmarks that show any of the tools to be the fastest with little effort.
Tens of thousands is indeed what I went with. I never considered a single large file because the average file length across a large project like the Linux kernel is ~16000 bytes anyway, so it seemed redundant to me, although I could see it being useful for large JSON and XML files.
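For anyone who wants to check the average file size of their own checkout, a rough sketch along these lines should work. The function name is hypothetical, and it counts every regular file under the root, skipping symlinks. Filtering out directories such as `.git` is left out and would change the result.

```python
import os


def average_file_size(root):
    """Return the mean size in bytes of regular files under `root` (0.0 if none)."""
    total = 0
    count = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so a linked file is not counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
                count += 1
    return total / count if count else 0.0
```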
The current comparison page https://github.com/XAMPPRocky/tokei/blob/master/COMPARISON.md is a little out of date. Would be good to update it.
Loc is now at version 0.5.0
Scc is now at version 2.3.0
Cloc is now at version 1.8.0
Perhaps polyglot can be added by downloading a binary rather than running from source?