anmol098 / waka-readme-stats

This GitHub action helps to add cool dev metrics to your github profile Readme
MIT License

FEAT: Use .gitattributes when counting LoC #53

Open ajmeese7 opened 4 years ago

ajmeese7 commented 4 years ago

Is your feature request related to a problem? Please describe.

The lines of code feature vastly overestimates the number of lines I have written, and it does the same for many other accounts I have seen using this repository.

The issue is that libraries like jQuery that are committed to a repo are considered vendored by GitHub and left out of the repository's language percentages, yet waka-readme-stats still counts them. According to Sourcerer, I have written 175k lines of code, but waka-readme-stats puts that number closer to 8 million.

The code graph has the same problem: it shows Python ahead of every other language on my profile, even though my only Python repository is inflated by libraries that I have marked as vendored in .gitattributes.

[Image: my bar graph]

Describe the solution you'd like

I would like code libraries to be ignored when counting lines of code and when creating the code graph.

The .gitattributes file is a good starting place for this, but GitHub also flags many vendored libraries automatically and ignores them. I don't know the best way to implement this feature, but it's something to consider.
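As a rough illustration of the .gitattributes idea, here is a minimal Python sketch. It is not how waka-readme-stats works today. `linguist-vendored` and `linguist-generated` are the override attributes GitHub Linguist reads, but the function names `excluded_patterns` and `is_excluded` are hypothetical, and `fnmatchcase` only approximates Git's real pattern matching (for example, `*` here also matches `/`). Negated entries such as `linguist-vendored=false` are simply ignored.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Attribute spellings treated as "do not count this file" (assumption:
# only the positive forms are handled; negations are ignored).
EXCLUDING_ATTRS = {
    "linguist-vendored",
    "linguist-vendored=true",
    "linguist-generated",
    "linguist-generated=true",
}


def excluded_patterns(gitattributes_text):
    """Return the path patterns marked as vendored or generated."""
    patterns = []
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if any(attr in EXCLUDING_ATTRS for attr in attrs):
            patterns.append(pattern)
    return patterns


def is_excluded(path, patterns):
    """Approximate check: should this repo-relative path be skipped?"""
    name = PurePosixPath(path).name
    for pattern in patterns:
        if "/" in pattern:
            # Patterns containing a slash are matched against the full path.
            if fnmatchcase(path, pattern.lstrip("/")):
                return True
        elif fnmatchcase(name, pattern):
            # Slash-free patterns are matched against the file name only.
            return True
    return False
```

A LoC counter could call `is_excluded` on each changed file in a commit and skip the ones that match. This would not cover the libraries GitHub flags automatically without a .gitattributes entry.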

hycinth22 commented 1 year ago

I have found similar problems.

I debugged the repository and found that it roughly counts every changed line in a commit as LoC for the repository's primary language. This is sometimes quite inaccurate.

For example, I have a C++ repository that contains only 10k lines of C++ code, along with 693k lines of test case input files and 24k lines of Visual Studio configuration files.

The latter two are auto-generated and not C++, so they should not be counted as C++ lines of code. Otherwise my contribution and workload are overestimated by roughly 72x.
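To make the difference concrete, here is a small Python sketch that attributes changed lines per file by extension instead of crediting everything to the primary language. It is only an illustration under assumptions: `EXT_TO_LANG` and `loc_by_language` are hypothetical names, the extension table is deliberately tiny, and the file paths are made up to mirror the numbers above.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical, deliberately small extension-to-language table.
EXT_TO_LANG = {
    ".cpp": "C++",
    ".hpp": "C++",
    ".h": "C++",
    ".py": "Python",
    ".js": "JavaScript",
}


def loc_by_language(changes):
    """Sum changed lines per language; files with unknown extensions are skipped.

    `changes` is an iterable of (repo-relative path, changed line count) pairs.
    """
    totals = Counter()
    for path, lines in changes:
        lang = EXT_TO_LANG.get(PurePosixPath(path).suffix.lower())
        if lang is not None:
            totals[lang] += lines
    return totals


# Made-up paths that mirror the repository described above.
changes = [
    ("src/solver.cpp", 10_000),
    ("tests/input/cases.txt", 693_000),
    ("project.vcxproj", 24_000),
]
per_language = loc_by_language(changes)           # only the C++ file is counted
naive_total = sum(lines for _, lines in changes)  # everything credited to C++
inflation = naive_total / per_language["C++"]     # about 72.7
```

With these numbers, the naive total is 727k lines against 10k lines of actual C++, which matches the roughly 72x overestimate.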