Open sbromberger opened 7 years ago
Yeah, language is possibly a separate issue.
As for tags, this was difficult at the time I started the site, but it should be possible now that GitHub has added "Topics" (aka tags aka labels) for repositories: https://github.com/blog/2309-introducing-topics
Repo "language" and "topics" data is now being collected. (And displayed in the Explore tab!) This information could potentially be incorporated into the search functionality.
Some of this work has begun under @hauten's lead.
See also: https://github.com/LLNL/llnl.github.io/tree/add-topics
@angfl97 Could you do some data analysis to help us understand the categories already in use?
It may be worth noting that logic for answering some of these questions exists to generate our "word cloud" visualizations at the bottom of the explore page and individual repo pages.
The cloud-generator takes a list of `{name: aWord, value: wordCount}` objects, which is what these functions output. They may be worth a look.
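For anyone who wants to experiment outside the site code, here is a minimal Python sketch of producing that `{name, value}` list from topic data. It is not the existing word-cloud logic: the `topic_counts` function, the `min_repos` parameter, and the repo dictionary shape (`name`, `topics`) are assumptions made up for illustration, as is the sample data.

```python
from collections import Counter


def topic_counts(repos, min_repos=1):
    """Return [{"name": topic, "value": count}, ...], most common first.

    `repos` is assumed to be a list of dicts with a "topics" list
    (a hypothetical shape, not the site's actual data format).
    """
    counts = Counter()
    for repo in repos:
        # Use a set so a topic counts once per repo, even if listed twice.
        counts.update(set(repo.get("topics", [])))
    # Sort by count descending, then alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"name": topic, "value": n} for topic, n in ordered if n >= min_repos]


# Hypothetical sample data, only to show the output shape.
sample_repos = [
    {"name": "repo-a", "topics": ["hpc", "mpi"]},
    {"name": "repo-b", "topics": ["hpc"]},
    {"name": "repo-c", "topics": []},
]

print(topic_counts(sample_repos))
```

Passing `min_repos=4` would keep only topics used by four or more repositories, which is the cutoff used for the stats later in this thread.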
I made an Excel workbook with the stats @gonsie asked for.
Here is the link
For those not traversing the link, these topics are mentioned in 4 or more repositories:
I was hoping that we'd get some topics outside of the typical "hpc" stuff, but I guess not. The language tags are sort of interesting:
Language | Count |
---|---|
Shell | 292 |
Python | 252 |
C | 210 |
C++ | 202 |
Makefile | 174 |
CMake | 113 |
HTML | 85 |
But I'm not sure that's immediately useful. There are 13 repos using AWK... maybe digging into the lesser-used languages would be cool.
What I do think is actually useful are the repos we are pulling from non-LLNL organizations. The top 5 (most repos) come from:
Some of these projects would be very cool to highlight on their own as they sort of represent a whole ecosystem of interrelated repos. These are also the places where we get the most external interaction.
It would be awesome if more repos had topics. I've done a couple of inventories over the last year, and fewer than 10% of repos have them. Maybe this can encourage PIs: our portal (not to mention GitHub) will provide more visibility to repos that have topics.
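If it helps to track that number over time, the inventory boils down to a one-line ratio. This is a rough sketch, assuming each repo is a dict with an optional `topics` list; the `topic_coverage` name and that data shape are hypothetical, not what the site currently uses.

```python
def topic_coverage(repos):
    """Return the fraction of repos (0.0 to 1.0) that have at least one topic."""
    if not repos:
        return 0.0
    # A missing, None, or empty "topics" entry all count as "no topics".
    tagged = sum(1 for repo in repos if repo.get("topics"))
    return tagged / len(repos)
```

Multiplying the result by 100 gives the percentage to compare against the rough figure above.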
See https://github.com/LLNL/llnl.github.io/blob/new-home-page/radiuss/README.md for a list of tags on RADIUSS repos. I will aim to use that list and the notes above as starting points for standardizing tags across other LLNL repos.
@hauten -- Maybe list our standard tags on https://github.com/LLNL/llnl.github.io/blob/master/about/using-github.md ?
Actually, for the docs, we can start the listing here: https://github.com/LLNL/llnl.github.io/tree/master/categories
Tags are a great way to categorize software repos and they're already built into GitHub repos. It would be great to be able to filter on "tag:foo" (and, incidentally, "language:C", but that's probably another issue).
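To make the request concrete, here is a rough Python sketch of how a `tag:foo language:C` style query could be parsed and applied to a list of repos. It is only an illustration under stated assumptions: the function names (`parse_query`, `matches`, `search`) and the repo dictionary shape (`name`, `language`, `topics`) are hypothetical, and the site's real search may work quite differently.

```python
def parse_query(query):
    """Split a query into tag filters, language filters, and free-text words."""
    filters = {"tag": [], "language": [], "text": []}
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key in ("tag", "language") and value:
            filters[key].append(value.lower())
        else:
            # Anything that is not a recognized qualifier is plain text.
            filters["text"].append(token.lower())
    return filters


def matches(repo, filters):
    """Return True if the repo satisfies every filter (logical AND)."""
    topics = {t.lower() for t in repo.get("topics", [])}
    language = (repo.get("language") or "").lower()
    name = repo.get("name", "").lower()
    return (
        all(tag in topics for tag in filters["tag"])
        # Exact match, so "language:C" does not also match "C++".
        and all(lang == language for lang in filters["language"])
        and all(word in name for word in filters["text"])
    )


def search(repos, query):
    """Return the repos that match the query, preserving input order."""
    filters = parse_query(query)
    return [repo for repo in repos if matches(repo, filters)]
```

For example, `search(repos, "tag:foo language:C")` would return only the repos that carry the `foo` topic and whose primary language is exactly C. Free-text words are matched against the repo name only in this sketch.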