techgaun / active-forks

Find active github forks of a repo https://git.io/vSnrC
https://techgaun.github.io/active-forks/index.html
2.28k stars 295 forks

comits/ahead/behind/status #30

Open floatas opened 4 years ago

floatas commented 4 years ago

Initial implementation for #17: it simply makes one request per fork to fetch the information.
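
For context, a per-fork lookup of this kind can be sketched with the v3 compare endpoint, whose response carries `status`, `ahead_by`, `behind_by` and `total_commits`. This is not the PR's actual code: the helper names `compareUrl` and `toColumns` are hypothetical, the `master` branch is an assumption, and the sample response object is hand-written rather than real data.

```javascript
// Hypothetical helpers, not taken from the PR.
// Builds the v3 compare URL for one fork against the original repo.
function compareUrl(repo, forkOwner, branch = 'master') {
  return `https://api.github.com/repos/${repo}/compare/${branch}...${forkOwner}:${branch}`;
}

// Picks the four values shown in the table from a compare response.
function toColumns(compare) {
  return {
    status: compare.status,
    ahead: compare.ahead_by,
    behind: compare.behind_by,
    commits: compare.total_commits,
  };
}

// Hand-written response object for illustration only.
const sample = { status: 'ahead', ahead_by: 3, behind_by: 0, total_commits: 3 };
console.log(compareUrl('rugantio/fbcrawl', 'feedmari'));
console.log(toColumns(sample));
```

Since this costs one request per fork, a repo with more forks than the hourly quota allows will run out of requests partway through.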

techgaun commented 4 years ago

Something is resulting in this error:

DataTables warning: table id=forkTable - Requested unknown parameter '9' for row 14, column 9. For more information about this error, please see http://datatables.net/tn/4

And, there seem to be two status columns.

Also, somewhat related: there's a concern about the API rate limit and how we should handle it. Maybe we add an input box where users can provide a personal access token. Thoughts?
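
A minimal sketch of the token idea, assuming the value comes from an optional input box. `buildHeaders` is a hypothetical helper, not existing code in this repo; it adds the `Authorization` header only when a token was actually entered, so unauthenticated use keeps working.

```javascript
// Hypothetical helper: request headers for the GitHub v3 API.
// The token is optional; without it the request is sent unauthenticated.
function buildHeaders(token) {
  const headers = { Accept: 'application/vnd.github.v3+json' };
  if (token && token.trim() !== '') {
    headers.Authorization = `token ${token.trim()}`;
  }
  return headers;
}

console.log(buildHeaders(''));         // no Authorization header
console.log(buildHeaders(' abc123 ')); // Authorization: 'token abc123'
```

The resulting object can be passed as the `headers` option of `fetch`.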

floatas commented 4 years ago

How did you get that warning? I didn't see it when testing.

techgaun commented 4 years ago

Thanks @floatas for the update. I will see if I can replicate that again.

easoncxz commented 4 years ago

FYI, I tried out this PR locally, and also saw the error techgaun noticed. It came up as a browser-native alert box. Unfortunately I got rate-limited by GitHub before I thought of taking a screenshot or copying down the exact error message. I might try again later.


easoncxz commented 4 years ago

Here's a screenshot of the error message:


DataTables warning: table id=forkTable - Requested unknown parameter '9' for row 82, column 9. For more information about this error, please see http://datatables.net/tn/4

Seems pretty reliably reproducible, as I got it both times I launched the app. However, it seems that with some sorting going on (I clicked to sort by "ahead" in descending order before I clicked "Find"), I get rate-limited from just one click of the "Find" button and nothing more. Perhaps this should be filed as another issue.

floatas commented 4 years ago

Maybe this is related to rate limiting. Can you share which repository you tried?

milahu commented 4 years ago

> Maybe this is related to rate limiting

maybe ask github to extend their API? so the "fork overview" includes the fork status (ahead, behind, same)

maybe use v4 API with graphQL? con: requires authentication so probably leave this optional

code sample with basic auth:

let username = 'your_github_username'
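let password = 'your_github_password' // note: GitHub has since dropped account-password auth for the API; use a personal access token here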
let headers = new Headers({
  'Authorization': 'Basic '+btoa(username+":"+password), // base64
  'Content-Type': 'application/json',
  'Accept': 'application/json',
})
fetch('https://api.github.com/graphql', {
  method: 'POST',
  headers: headers,
  body: JSON.stringify({
    // get schema introspection
    query: "{__schema{types{name,kind,description,fields{name}}}}",
  }),
})
.then(r => r.json())
.then(data => console.log('data returned:', data))

workaround with v3 API: retry failed requests, like with npm/fetch-retry

var fetch = require('fetch-retry');

fetch(url, {
    // retry on status 403 Forbidden
    retryOn: [403],
    // Exponential backoff
    retryDelay: function(attempt, error, response) {
      return Math.pow(2, attempt) * 1000; // 1000, 2000, 4000
    },
  })
  .then(function(response) {
    return response.json();
  })
  .then(function(json) {
    // do something with the result
    console.log(json);
  });
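
One caveat with a fixed backoff: a rate-limit 403 only clears once the limit window resets. GitHub reports the window in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` (epoch seconds) response headers, so the delay could be derived from those instead. A rough sketch, where `retryDelayMs` is a hypothetical helper and the header values in the example are made up:

```javascript
// Hypothetical helper: how long to wait before retrying, in milliseconds.
// `getHeader` is any function mapping a header name to its string value or null,
// e.g. (name) => response.headers.get(name) for a fetch Response.
function retryDelayMs(getHeader, nowMs = Date.now()) {
  const remaining = getHeader('x-ratelimit-remaining');
  const reset = getHeader('x-ratelimit-reset');
  if (remaining === null || reset === null) return 1000; // headers missing: short fixed delay
  if (Number(remaining) > 0) return 0;                   // quota left: no need to wait for the reset
  return Math.max(0, Number(reset) * 1000 - nowMs);      // wait until the window resets
}

// Made-up header values: quota exhausted, reset 60 seconds after "now".
const fakeHeaders = { 'x-ratelimit-remaining': '0', 'x-ratelimit-reset': '1700000060' };
const lookup = (name) => (name in fakeHeaders ? fakeHeaders[name] : null);
console.log(retryDelayMs(lookup, 1700000000 * 1000)); // 60000
```

With an hourly window the computed wait can be long, so in a browser UI it may be better to show the reset time than to retry silently.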

> unknown parameter '9'

same here

original repo is https://github.com/rugantio/fbcrawl with 130 forks

active-forks says "Showing 1 to 10 of 100 entries" but only the first 60 forks have the columns: status, ahead, behind, commits = column index 9, 10, 11, 12

alert message is

DataTables warning: table id=forkTable - Requested unknown parameter '9' for row 60, column 9. For more information about this error, please see http://datatables.net/tn/4

javascript console, same error repeats for 40 forks

Failed to load resource: the server responded with a status of 403 (Forbidden) api.github.com/repos/rugantio/fbcrawl/compare/master...feedmari:master
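
The warning is consistent with rows of different lengths: rows whose compare request failed carry no value at column index 9. One possible guard, sketched below, pads every row to the full column count before it is handed to DataTables. `padRow` is a hypothetical helper, and the column count of 13 (indexes 0 to 12) is an assumption based on the indexes listed above; DataTables' `columns.defaultContent` option is another way to get a similar effect.

```javascript
// Assumed total number of table columns (indexes 0..12).
const COLUMN_COUNT = 13;

// Hypothetical helper: returns a copy of `row` with exactly `width` cells,
// filling missing cells with `filler` so DataTables never sees a short row.
function padRow(row, width = COLUMN_COUNT, filler = '') {
  const padded = row.slice(0, width);
  while (padded.length < width) padded.push(filler);
  return padded;
}

// Placeholder cell values for illustration only.
const complete = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'ahead', 2, 0, 2];
const failed = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']; // compare request returned 403
console.log(padRow(complete).length, padRow(failed).length); // 13 13
```

This only silences the alert; the rate-limited forks would still show empty status cells.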

forks, forks, forks, forks, ....

floatas commented 4 years ago

Even with v4 and authentication you can get only ~500 forks per hour due to the rate limits. I can't find a way to raise that limit, and there don't seem to be any paid options for it either.

milahu commented 4 years ago

> Even with v4 and authentication you can get ~500 forks per hour due to limitations.

in v4 we get 100 forks per query including the fields

we can also get more data on commits, if that helps to compare like author, time, message, additions, deletions, ....

the only downside is, we need authentication

here is my graphQL query string. the full code sample is in github-repo-list-forks.js, tested on node.js

query (
  $repoOwner: String!,
  $repoName: String!,
  $refOrder: RefOrder!,
  $forksPerPage: Int!, # forks per page
  $forksCursor: String, # forks pagination cursor
) {
  repository (
    owner: $repoOwner, # select original repo
    name: $repoName,
  ) {
    nameWithOwner # owner/name
    pushedAt # modify time
    stargazers {
      totalCount # github stars
    }
    forkCount

    refs(refPrefix: "refs/", first: 100) {
      nodes {
        target {
          ... on Commit {
            history(first: 5) {
              totalCount # original commits
    }}}}}

    forks(
      first: $forksPerPage, # select fork repos
      after: $forksCursor,
    ) {
      totalCount
      pageInfo {
        hasNextPage # fork pagination
        endCursor
      }
      edges{
        #cursor # fork cursor
        node{
          ... on Repository {
            nameWithOwner # fork owner/name
            pushedAt # fork modify time
            stargazers {
              totalCount # fork github stars
            }

            refs(refPrefix:"refs/",orderBy:$refOrder,first:1){
              nodes{
                ... on Ref{
                  target{
                    ... on Commit{
                      history(first:10){
                        totalCount # fork commits
}}}}}}}}}}}}
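
To walk through all pages of forks with this query, the `endCursor` of each page is passed back in as `$forksCursor` until `hasNextPage` is false. A sketch of such a loop follows. It is not the code from github-repo-list-forks.js: `fetchAllForks` is a hypothetical name, and `runQuery(variables)` is assumed to POST the query above to the GraphQL endpoint and resolve to the parsed `data` object.

```javascript
// Hypothetical driver for the query above.
// `runQuery(variables)` is assumed to resolve to the `data` part of the GraphQL response.
// `baseVariables` holds repoOwner, repoName, refOrder and forksPerPage.
async function fetchAllForks(runQuery, baseVariables) {
  const forks = [];
  let cursor = null; // null requests the first page
  let hasNext = true;
  while (hasNext) {
    const data = await runQuery({ ...baseVariables, forksCursor: cursor });
    const page = data.repository.forks;
    for (const edge of page.edges) forks.push(edge.node);
    hasNext = page.pageInfo.hasNextPage;
    cursor = page.pageInfo.endCursor;
  }
  return forks;
}
```

Each loop iteration is one request, so with 100 forks per page a repo with 130 forks needs two requests.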

milahu commented 4 years ago

using v3 API we can reduce the number of forks [and follow-up queries] by comparing fork.created_at and fork.pushed_at

when pushed_at < created_at then the fork is "empty" and can be ignored

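  Promise.all(data.map(async (fork) => {
    if (fork.pushed_at < fork.created_at) {
      // fork is empty: nothing was pushed after it was created
      return
    }
    // compare the fork's master branch against the original repo
    const response = await fetch(`https://api.github.com/repos/${repo}/compare/master...${fork.owner.login}:master`)
    return response.json()
  }))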

Justinzobel commented 4 years ago

Any progress on this?

milahu commented 4 years ago

no progress.

the github v4 graphql api is not ideal for this scenario cos ....

the fields ahead/behind are not served in the repo metadata, and must be calculated from commit data. TODO post a feature request

batching multiple queries into one request is messy, cos the github graphql server does not support standard query batching, so we need a workaround with field aliases, like query1 query2 etc, to imitate an sql "in" operator - again, a better serverconfig/api would help. also huge queries run into server limits so we need error handling

also i found no way to sort or filter commits on server side, so you need to paginate through all the commits until you find the "branchoff" commit, which has to be located beforehand by matching the main repo's commits against the fork date
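
The alias workaround mentioned above can be sketched as a query builder that emits one aliased `repository` field per fork, so several forks are looked up in a single request. `buildBatchQuery`, the alias names `fork0`, `fork1`, ... and the owner/name values are all made up for illustration; the selected fields are taken from the query posted earlier.

```javascript
// Hypothetical helper: builds one GraphQL query containing an aliased
// `repository` field per fork (fork0, fork1, ...).
function buildBatchQuery(forks, fields = 'nameWithOwner pushedAt') {
  const parts = forks.map(({ owner, name }, i) =>
    // JSON.stringify quotes and escapes the values as string literals
    `  fork${i}: repository(owner: ${JSON.stringify(owner)}, name: ${JSON.stringify(name)}) { ${fields} }`
  );
  return `query {\n${parts.join('\n')}\n}`;
}

// Placeholder owners and repo name, not real forks.
console.log(buildBatchQuery([
  { owner: 'user-a', name: 'some-repo' },
  { owner: 'user-b', name: 'some-repo' },
]));
```

The response then has one key per alias under `data`. As noted, large batches can hit server limits, so the batch size would need to be capped and errors handled per alias.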

Justinzobel commented 4 years ago

Sounds mighty ugly. So many repositories get abandoned, even popular stuff and it's a pain to sort through dozens if not hundreds of forks to find something that works or is maintained.

haimivan commented 3 years ago

Hi,

has anybody contacted GitHub (or Microsoft) about how useful techgaun/active-forks is for the community (or how useful it would be if everybody knew about it)?

If they want to deliver best service for their customers/users, they should

haimivan commented 3 years ago

I opened a topic at

https://github.community/t/how-to-draw-attention-for-active-forks-functionality/152515

techgaun commented 3 years ago

Sorry for being away from this issue and this PR for such a long time. And thanks @haimivan for creating the topic on the GitHub community forum. I'll try to get back up to speed on this and come up with a possible solution.

RoneoOrg commented 2 years ago

A related implementation is available (Source code)

It asks for a GitHub token, then displays a sortable table with the last commit date and a diff for each fork. Clicking a diff shows the commit names. The GitHub API quota is displayed too.