Top Languages card results are incorrect

JmKanmo commented 2 years ago

I added the Top Languages card to my README.md file like below

but I don't write Python code ... The language I use most is Java.

So I tested it on the other site below.
http://ionicabizau.github.io/github-profile-languages/?user=JmKanmo

and the result looks like this.

Why is this?

MSBivens commented 2 years ago

Having a similar issue where it seems more like mine just hasn't updated in weeks

Zordon1337 commented 2 years ago

same here

nicolasfara commented 2 years ago

Same

amandie-ct commented 2 years ago

Same :( my contribution to private repositories is not being accounted for, even though I added the param to show them.

rickstaa commented 2 years ago

@JmKanmo Thanks for your issue. We are aware of the inaccuracy of the language card. Limitations of the current GraphQL implementation cause it (see #1803 and https://github.com/anuraghazra/github-readme-stats/pull/1122#issuecomment-1152066225 for more information).

Currently, the GraphQL API does not allow us to fetch language results for individual users. It only returns language results for repositories. As a result, the language card is not showing the correct statistics. I created a feature request with GitHub that improves this behaviour. You can show your support at https://github.com/github-community/community/discussions/18230. If enough people show their support, we might be able to improve the language results of github-readme-stats in the future. Additionally, we currently only fetch the first 100 repositories causing the language card to be incorrect (see https://github.com/anuraghazra/github-readme-stats/issues/1852).
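To illustrate the limitation described above, here is a minimal sketch of per-repository aggregation. The `repos` data, its dictionary shape, and the `aggregate_languages` helper are all hypothetical and are not the actual GraphQL response or the github-readme-stats code; the sketch only shows that every byte in a repository counts toward the owner's totals, and that repositories beyond the fetch limit are ignored.

```python
from collections import defaultdict


def aggregate_languages(repos, max_repos=100):
    """Sum per-language byte counts over the first `max_repos` repositories.

    Hypothetical helper: `repos` is a list of dicts with a "languages"
    mapping of language name -> size in bytes.
    """
    totals = defaultdict(int)
    # Only the first `max_repos` repositories are considered, mirroring
    # the "first 100 repositories" limit mentioned in this thread.
    for repo in repos[:max_repos]:
        for language, size in repo["languages"].items():
            totals[language] += size
    return dict(totals)


# Made-up example profile: mostly Java, plus one large Python repository.
repos = [
    {"name": "java-service", "languages": {"Java": 40_000}},
    {"name": "ml-notebooks", "languages": {"Python": 90_000, "Java": 1_000}},
    {"name": "java-utils", "languages": {"Java": 20_000}},
]

totals = aggregate_languages(repos)
print(totals)
```

In this made-up profile Python ends up ahead of Java even though two of the three repositories are Java, because the totals are computed per repository and not per user contribution.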

I also checked https://ionicabizau.github.io/github-profile-languages to see if it produced better results. That tool uses the GitHub REST API to fetch all repositories of a user to get the language results. Since this also includes forks, the results given by that tool will be worse for most user accounts.

You can also show your support for #1732, which slightly improves the language card behaviour by allowing users to scale their language results.

rickstaa commented 2 years ago

To summarize, the following things can be done to improve the language card:

Fetch all repositories instead of only the first 100 (see #1852).
Give users the ability to modify the language card calculation (see #1600).

fsantamaria1 commented 1 year ago

Hey @JmKanmo

Are you using the default language weight?

The top languages card shows the percentage based on the size of the repositories by default.

Also: By default, the language card shows language results only from public repositories. To include languages used in private repositories, you should deploy your own instance using your own GitHub API token.

Algorithm

It uses the following algorithm to calculate the language percentages on the language card:

ranking_index = (byte_count ^ size_weight) * (repo_count ^ count_weight)

By default, only the byte count is used for determining the language percentages shown on the language card (i.e. size_weight=1 and count_weight=0). You can, however, use the &size_weight= and &count_weight= options to weight the language usage calculation. The values must be positive real numbers. More details about the algorithm can be found here.

&size_weight=1&count_weight=0 - (default) Orders by byte count.
&size_weight=0.5&count_weight=0.5 - (recommended) Uses both byte and repo count for ranking
&size_weight=0&count_weight=1 - Orders by repo count
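As a rough sketch of the formula above (not the project's actual source code), the ranking index can be computed like this. The `ranking_index` and `language_percentages` function names, the `stats` data, and the normalization to percentages are assumptions made for illustration; only the formula and the two weight parameters come from the quoted documentation. Note that `^` in the formula means exponentiation, which is `**` in Python.

```python
def ranking_index(byte_count, repo_count, size_weight=1.0, count_weight=0.0):
    """ranking_index = (byte_count ^ size_weight) * (repo_count ^ count_weight)."""
    return (byte_count ** size_weight) * (repo_count ** count_weight)


def language_percentages(stats, size_weight=1.0, count_weight=0.0):
    """Turn ranking indices into percentages (assumed normalization).

    `stats` maps language name -> (byte_count, repo_count).
    """
    indices = {
        lang: ranking_index(byte_count, repo_count, size_weight, count_weight)
        for lang, (byte_count, repo_count) in stats.items()
    }
    total = sum(indices.values())
    if total == 0:
        # Avoid dividing by zero when there is no data at all.
        return {lang: 0.0 for lang in indices}
    return {lang: 100.0 * value / total for lang, value in indices.items()}


# Made-up profile: one big Python repository, nine smaller Java repositories.
stats = {"Python": (90_000, 1), "Java": (60_000, 9)}

by_size = language_percentages(stats)  # default weights
by_count = language_percentages(stats, size_weight=0, count_weight=1)
mixed = language_percentages(stats, size_weight=0.5, count_weight=0.5)

for label, result in [("size", by_size), ("count", by_count), ("mixed", mixed)]:
    print(label, {lang: round(pct, 1) for lang, pct in result.items()})
```

With these made-up numbers, the default weights put Python first because of the single large repository, while the count-only and mixed weights put Java first.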

In my case, I am using &size_weight=0&count_weight=1 because I usually code in Python, but I have one repository with a huge Google Colab notebook that takes up 60% of my card if I use the default settings.

JmKanmo commented 1 year ago

Thank you all for your kind replies. I checked the answer and then updated README.MD like below

[![Top Langs](https://github-readme-stats.vercel.app/api/top-langs/?username=JmKanmo&size_weight=0.5&count_weight=0.5&layout=compact&theme=dark)](https://github.com/JmKanmo/JmKanmo) <br> <br>
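For anyone who wants to try different weights, the card URL can also be generated with a small script. This is only a sketch: the `top_langs_url` helper is hypothetical, and the endpoint and parameter names are simply the ones that appear in the README line above.

```python
from urllib.parse import urlencode

# Endpoint taken from the README snippet in this thread.
BASE_URL = "https://github-readme-stats.vercel.app/api/top-langs/"


def top_langs_url(username, size_weight=1, count_weight=0, **extra):
    """Build a Top Languages card URL (hypothetical helper)."""
    params = {
        "username": username,
        "size_weight": size_weight,
        "count_weight": count_weight,
        **extra,  # e.g. layout="compact", theme="dark"
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = top_langs_url(
    "JmKanmo", size_weight=0.5, count_weight=0.5, layout="compact", theme="dark"
)
# Wrap the image in a link to the profile repository, as in the README.
markdown = f"[![Top Langs]({url})](https://github.com/JmKanmo/JmKanmo)"
print(markdown)
```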

I have Python repositories like the ones below.
https://github.com/JmKanmo/UserManagerWebsite https://github.com/JmKanmo/PetService_Web https://github.com/JmKanmo/PythonBasicStudy

If the percentages are calculated from byte count, the Python repositories probably contribute more code than I expected. With the new weights, I don't think the result is that bad. Thank you for your detailed answer and guide.

rickstaa commented 1 year ago

@JmKanmo, there are still some issues with the language algorithm that require attention:

Currently, the algorithm only considers the first 100 repositories of a user, as documented in this GitHub issue (https://github.com/anuraghazra/github-readme-stats/issues/1852).
Forks and organization repositories are not included in the language calculation, as reported in these two issues (https://github.com/anuraghazra/github-readme-stats/issues/1 and https://github.com/anuraghazra/github-readme-stats/issues/3109).
The algorithm relies on the languages found in a repository rather than the languages a user has actively used, which is discussed in this issue (https://github.com/anuraghazra/github-readme-stats/issues/1801#issuecomment-1176153879).

Addressing the first two points could be accomplished by releasing a GitHub Action, as suggested in this issue (https://github.com/anuraghazra/github-readme-stats/issues/2179). However, the last point requires intervention from GitHub itself, and you can express your support for this improvement in the GitHub Community Discussions (https://github.com/orgs/community/discussions/18230). Let's keep this issue open to monitor progress on resolving these problems. 🚀

anuraghazra / github-readme-stats

Top Languages card results are incorrect #1801
