Open olivren opened 5 years ago
Not trying to hijack the conversation away from loc, but @olivren, did you try https://github.com/boyter/scc as a comparison? I believe it handles all these cases as you would expect.
I ask because I keep an eye on all of the counters and try to add any issues into its test suite to make it as accurate as possible.
@boyter I just tried with scc 2.2.0, and it does not handle docstrings at all. I opened an issue about that https://github.com/boyter/scc/issues/62
Errata: I previously said that Tokei ignores docstring comments (by which I meant that it counts them as code). This is in fact the default behavior, but Tokei has a configuration option that triggers the correct behavior of counting all docstrings as comments (treat_doc_strings_as_comments = true in tokei.toml).
I tried this tool for the very first time, to count the number of lines of code of a Python project. The numbers it reports are shockingly inaccurate. It reports a correct number of total lines and blank lines, but it over-counts the number of comments.
I investigated a bit, and I found a simple example that reports 6 lines of comments and 0 lines of code:
So, loc correctly tries to match the docstring delimited by three single quotes, and ends up matching the whole file.
Additional notes about Python comments
In Python, '''hello''' and """hello""" are string literals, but they are considered docstring comments only if they appear at the top level of the file, or in a class or function definition. A good heuristic to tell them apart is to count only the triple-quoted string literals that start at the beginning of a line (ignoring leading whitespace).
Here is another example where loc counts 2 lines of comment and 1 line of code:
And another one that counts 6 lines of code:
For what it's worth, tokei is not better as it ignores docstring comments entirely (which is a very poor choice in my opinion).
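To make the heuristic suggested above concrete (count a triple-quoted literal as a comment only when it starts a line, ignoring leading whitespace), here is a minimal Python sketch. It is not how loc, tokei, or scc are implemented; the function name classify_lines and the exact counting rules are assumptions made for illustration. It counts blank lines inside a docstring as comment lines, and it does not handle prefixed literals such as r"""...""" or triple quotes that appear inside a trailing # comment.

```python
TRIPLE_QUOTES = ('"""', "'''")


def classify_lines(source):
    """Classify each line as code, comment, or blank (hypothetical helper)."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    open_delim = None  # delimiter of the multi-line literal we are inside
    open_kind = None   # "comment" for a docstring, "code" for any other literal

    for line in source.splitlines():
        stripped = line.strip()

        # Inside a multi-line literal: the line inherits the literal's kind.
        if open_delim is not None:
            counts[open_kind] += 1
            if open_delim in stripped:
                open_delim = None
            continue

        if not stripped:
            counts["blank"] += 1
            continue
        if stripped.startswith("#"):
            counts["comment"] += 1
            continue

        # The heuristic: a triple-quoted literal is a docstring only if it
        # starts the line; anywhere else it is an ordinary string (code).
        kind = "comment" if stripped.startswith(TRIPLE_QUOTES) else "code"
        counts[kind] += 1

        # An odd number of the first delimiter on this line means the
        # literal stays open and continues on the following lines.
        found = [(stripped.find(d), d) for d in TRIPLE_QUOTES if d in stripped]
        if found:
            _, delim = min(found)
            if stripped.count(delim) % 2 == 1:
                open_delim, open_kind = delim, kind

    return counts


sample = 'x = 1\ny = """hello"""\n"""hello"""\n# note\n\n'
print(classify_lines(sample))  # {'code': 2, 'comment': 2, 'blank': 1}
```

Tracking the kind of the open literal is what keeps a multi-line string assigned to a variable from being counted as a comment, and what keeps its closing delimiter from being mistaken for the start of a new docstring.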