Open ajvb opened 8 years ago
We pull the language straight from GitHub's API
GitHub's Linguist library handles the detection. I just checked, and Linguist properly identifies Go as the primary language used in the project. The API reflects this.
@schneems, the GitHub API says that "Go" is the language. From time to time, repositories will switch primary languages if, for example, vendored files are mistakenly added to the tree. It appears that CodeTriage stores the language it receives from GitHub in the database. That value was probably fetched a while ago and just needs to be updated.
I would argue that the language that the GitHub API provides is a volatile value and is subject to change as new commits are made to a project and as Linguist recalculates a project's primary language.
Do you have a proposal?
Not exactly. Here's an idea, however:
If a Repo's updated_at DateTime is far enough from the current time (say, 6 to 24 hours earlier than the time of the request), queue up an UpdateRepoInfoJob for it. The existing language in the database should still be returned in case of API errors or latency, but the update should be triggered.
I'm unsure exactly where best to put this, though. My best guess would be in RepoBasedController#find_repo, which would be the method that gets hit upon request.
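A rough sketch of the idea, in plain Ruby so it runs without Rails. Everything here is an assumption for illustration: the Repo struct, the STALE_AFTER threshold, the REFRESH_QUEUE array (standing in for enqueuing an UpdateRepoInfoJob), and the language_for helper are hypothetical names, not CodeTriage's actual code.

```ruby
# Hypothetical stand-in for the Repo model; only the fields this sketch needs.
Repo = Struct.new(:full_name, :language, :updated_at)

# Assumed staleness threshold in seconds (24 hours, the upper end of the
# 6 to 24 hour window suggested above).
STALE_AFTER = 24 * 60 * 60

# Stand-in for a background job queue; a real app would enqueue a job instead.
REFRESH_QUEUE = []

# True when the stored record is older than the threshold.
def stale?(repo, now: Time.now)
  now - repo.updated_at > STALE_AFTER
end

# Always returns the stored language right away, so API errors or latency
# never block the request. If the record is stale, schedules one refresh.
def language_for(repo, now: Time.now)
  if stale?(repo, now: now) && !REFRESH_QUEUE.include?(repo.full_name)
    REFRESH_QUEUE << repo.full_name
  end
  repo.language
end
```

With this shape, a request for a repo last updated two days ago would still get the stored language back immediately, and the repo's name would be queued once for a refresh. A repo updated an hour ago would not be queued at all.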
I will submit a PR shortly.
Never mind my issue; I had no idea there was already one open.
See my comment on #499.
I'm gonna keep the discussion here.
I'm under the impression that updating every repo's information on a fixed interval of hours is a great way to use up API calls and put a greater burden on GitHub's API than is necessary.
Maybe, but putting an update behind a search really doesn't solve the problem either. Searching for a repo isn't the only way repos are seen.
Yes, but find_repo doesn't only get called when repos are searched for.
Was this ever resolved? I just had to perform an override on our repo, https://github.com/EBWiki/EBWiki, and now it shows up correctly as a Ruby repo on GitHub. However, it has not updated on CodeTriage.
For EBWiki, they updated it on GitHub, but we do not sync. I manually updated their value. We should look into either periodically refreshing the information or implementing some way to trigger a refresh.
I manually updated ebwiki and getlantern/lantern.
We should keep the issue open until we implement a better syncing/refresh feature.
I think it would be good to have some type of sync button that lets a person trigger a pull of the data from GitHub. I think the question is just where on the page you'd want to have it: a message at the top with the button that the user can close if they desire? At the bottom? I'd be interested to know what y'all think.
@schneems I just made linguist ignore some vendored files so that gollum is now correctly displayed as being a ruby project rather than a JS one. Since this issue is still open I assume you still have to manually update this on CodeTriage, or should I do so myself somehow?
/cc @bartkamphorst
ping @schneems
CodeTriage lists Lantern (https://www.codetriage.com/getlantern/lantern) as Java instead of Go.
Repo Link: https://github.com/getlantern/lantern