codetriage / CodeTriage

Discover the best way to get started contributing to Open Source projects
https://www.codetriage.com
MIT License
1.41k stars, 369 forks

GitHub description/language not updated after repository is created #489

Open ajvb opened 8 years ago

ajvb commented 8 years ago

CodeTriage lists Lantern (https://www.codetriage.com/getlantern/lantern) as Java instead of Go.

Repo Link: https://github.com/getlantern/lantern

schneems commented 8 years ago

We pull the language straight from GitHub's API.

rye commented 8 years ago

GitHub's Linguist library is what is used for the detection. I just checked, and Linguist properly identifies Go as being the primary language used in the project. The API reflects this.

@schneems, the GitHub API says that "Go" is the language. From time to time, repositories will switch primary languages if, for example, vendored files are mistakenly added to the tree. It appears that CodeTriage stores the language it receives from GitHub in the database. That probably happened a while ago, so the stored value just needs to be updated.

I would argue that the language that the GitHub API provides is a volatile value and is subject to change as new commits are made to a project and as Linguist recalculates a project's primary language.
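To make the point above concrete, here is a minimal sketch of reading the current value from GitHub. It assumes the public REST endpoint `GET /repos/{owner}/{repo}`, whose JSON response includes a `language` field. The helper names (`primary_language`, `fetch_primary_language`, `language_changed?`) are hypothetical and are not taken from the CodeTriage codebase.

```ruby
require "json"
require "net/http"
require "uri"

# Extracts the primary language from the JSON body returned by
# GET https://api.github.com/repos/{owner}/{repo}.
# The "language" field may be null, in which case this returns nil.
def primary_language(response_body)
  JSON.parse(response_body)["language"]
end

# Hypothetical helper: fetches the current value over the network.
# Unauthenticated requests are rate limited, so a real app would
# send a token and handle errors more carefully.
def fetch_primary_language(full_name)
  uri = URI("https://api.github.com/repos/#{full_name}")
  response = Net::HTTP.get_response(uri)
  return nil unless response.is_a?(Net::HTTPSuccess)

  primary_language(response.body)
end

# Hypothetical helper: true when the stored value no longer matches
# what GitHub currently reports.
def language_changed?(stored_language, response_body)
  stored_language != primary_language(response_body)
end
```

Comparing a stored value against a fresh response is one way to see the drift described in this issue; whether the language differs for a given repository depends on what GitHub reports at the time of the request.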

schneems commented 8 years ago

> I would argue that the language that the GitHub API provides is a volatile value and is subject to change as new commits are made to a project and as Linguist recalculates a project's primary language.

Do you have a proposal?

rye commented 8 years ago

Not exactly. Here's an idea, however:

If a Repo's updated_at DateTime is far enough in the past (say, 6 to 24 hours before the time of the request), queue up an UpdateRepoInfoJob for it. The existing language in the database should still be returned, in case of API errors or latency, but the update should be triggered.

I'm not sure exactly where the best place to put this is, though. My best guess would be RepoBasedController#find_repo, which is the method that gets hit on each request.
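The idea above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the 6-hour threshold is the low end of the range suggested in the comment, `Repo` here is a stand-in Struct rather than the real model, and `find_repo_with_refresh` and the `enqueue` callable are hypothetical names. In the Rails app, `enqueue` would presumably wrap something like queuing an UpdateRepoInfoJob.

```ruby
# Assumed threshold: the low end of the "6 to 24 hours" suggestion.
REFRESH_AFTER_SECONDS = 6 * 60 * 60

# Stand-in for the Repo record (hypothetical; not the real model).
Repo = Struct.new(:full_name, :language, :updated_at)

# True when the record was last updated long enough ago to refresh.
def refresh_due?(repo, now: Time.now)
  now - repo.updated_at >= REFRESH_AFTER_SECONDS
end

# Returns the stored repo immediately, so the request never waits on
# GitHub, and schedules a background refresh when one is due.
# `enqueue` is any callable that queues the update (an assumption).
def find_repo_with_refresh(repo, enqueue:, now: Time.now)
  enqueue.call(repo) if refresh_due?(repo, now: now)
  repo
end
```

Because the stored record is returned either way, an API error or slow response only delays the update and does not affect the page being served.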

rye commented 8 years ago

I will submit a PR shortly.

nateberkopec commented 7 years ago

Never mind my issue, I had no idea there was already one open.

rye commented 7 years ago

See my comment on #499.

nateberkopec commented 7 years ago

I'm gonna keep the discussion here.

> I'm under the impression that updating every repo's information on a fixed interval of hours is a great way to use up API calls and induce a greater burden on GitHub's API than is necessary.

Maybe, but putting an update behind a search really doesn't solve the problem either. Searching for a repo isn't the only way repos are seen.

rye commented 7 years ago

Yes, but find_repo doesn't only get called when repos are searched for.

rlgreen91 commented 6 years ago

Was this ever resolved? I just had to perform an override on our repo, https://github.com/EBWiki/EBWiki, and now it shows up correctly as a Ruby repo on GitHub. However, it has not updated on CodeTriage.

schneems commented 6 years ago

For EBWiki, they updated it on GitHub, but we do not sync. I manually updated their value. We should look into either periodically refreshing the information or implementing some way to trigger a refresh.

schneems commented 6 years ago

I manually updated EBWiki and getlantern/lantern.

We should keep the issue open until we implement a better syncing/refresh feature.
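One possible shape for a periodic refresh that also respects the API-usage concern raised earlier in the thread is to refresh only a bounded batch of the stalest repos per run. The sketch below is an assumption, not something proposed verbatim in this issue; `RepoRow` and `refresh_batch` are hypothetical names.

```ruby
# Stand-in for a stored repo row (hypothetical).
RepoRow = Struct.new(:full_name, :updated_at)

# Picks at most `limit` repos, stalest first, so that each scheduled
# run makes a bounded number of GitHub API calls.
def refresh_batch(repos, limit:)
  repos.sort_by(&:updated_at).first(limit)
end
```

A scheduled task could call `refresh_batch` and queue one update per returned repo; the `limit` would be chosen to fit within the API rate limit available to the app.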

rlgreen91 commented 6 years ago

I think it would be good to have some type of sync button that lets a person trigger pulling data from GitHub. I think the question is just where on the page you'd want to have it: a message at the top with a button that the user can close if they desire? At the bottom? I'd be interested to know what y'all think.

dometto commented 6 years ago

@schneems I just made Linguist ignore some vendored files so that gollum is now correctly displayed as being a Ruby project rather than a JS one. Since this issue is still open, I assume you still have to manually update this on CodeTriage, or should I do so myself somehow?

/cc @bartkamphorst

dometto commented 5 years ago

ping @schneems