chaoss / augur

Python library and web service for Open Source Software Health and Sustainability metrics & data collection. You can find our documentation and new contributor information easily here: https://oss-augur.readthedocs.io/en/main/ and learn more about Augur at our website https://augurlabs.io
https://oss-augur.readthedocs.io/en/main/
MIT License
585 stars 844 forks source link

Draft of how to check if we already have messages #2807

Closed sgoggins closed 3 months ago

sgoggins commented 4 months ago

This uses the GraphQL API on GitHub to check if we already have messages for a PR or Issue. Using GraphQL to check this before going and getting all the messages if we already have them will avoid unnecessary REST API call, data marshalling, and database hits.

Notes are below:

https://stackoverflow.com/questions/58678266/is-there-any-quick-way-t…o-get-comment-count-of-all-open-pull-request

Query explorer:

https://docs.github.com/en/graphql/overview/explorer

GraphQL Query to get total PR Comment Count:

{
  repository(name: "material-ui", owner: "mui-org") {
    pullRequest(number: 21214) {
      comments(first: 100) {
        totalCount
      }
    }
  }
}

Graphql for the Issue Comment Count:

{
  repository(name: "deepgnn", owner: "microsoft") {
    issue(number: 193) {
      comments(first: 100) {
        totalCount
      }
    }
  }
}

Graphql for Issue or PR:


query {
  repository(owner: "microsoft", name: "deepgnn") {
    issueOrPullRequest(number: 193) {
      ... on PullRequest {
              comments(first: 100) {
                totalCount
              }
            }
      ... on Issue {
              comments(first: 100) {
                totalCount
              }
      }
    }
  }
}

And corresponding SQL to get the count of comments we have (Example):


SELECT
      x.repo_id,
    comment_url,
    split_part(split_part(comment_url, '/repos/', 2), '/', 1) AS organization,
    split_part(split_part(comment_url, '/repos/', 2), '/', 2) AS repository,
    split_part(split_part(comment_url, '/issues/', 2), '/', 1) AS issue_number,
    theid,
    thenumber,
    COALESCE(COUNT(y.pull_request_id) + COUNT(z.issue_id), 0) AS comments
FROM
(
    SELECT repo_id, pr_comments_url AS comment_url ,pull_request_id AS theid, pr_src_number AS thenumber
    FROM augur_data.pull_requests
    WHERE repo_id=52445 and pr_comments_url = 'https://api.github.com/repos/microsoft/DeepGNN/issues/105/comments'

    UNION

    SELECT repo_id, comments_url as comment_url, issue_id AS theid, gh_issue_number AS thenumber
    FROM augur_data.issues
    WHERE repo_id = 52445 and comments_url = 'https://api.github.com/repos/microsoft/DeepGNN/issues/105/comments'
) x
LEFT JOIN augur_data.pull_request_message_ref y ON x.theid = y.pull_request_id
LEFT JOIN augur_data.issue_message_ref z ON x.theid = z.issue_id
GROUP BY x.repo_id, comment_url, organization, repository, issue_number, theid, thenumber;

Description

This PR fixes #

Notes for Reviewers

Signed commits