maximilien / ghtrack

A Python tool to keep track of GitHub users' stats
Apache License 2.0

Update to GitHub GraphQL API #20

Closed psschwei closed 10 months ago

psschwei commented 1 year ago

The "new" GraphQL API allows pulling commits / PRs by user (and can probably also pull reviews if the right parameters are passed):

https://docs.github.com/en/graphql/reference/objects#user

This should significantly reduce the number of API calls needed, which in theory should also speed up the data pulls.
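As a rough illustration of the point above, a single query against the `user` object can return commit, PR, and review totals in one round trip. The sketch below only builds the request body and reads a raw HTTP response of the expected shape. The helper names (`build_payload`, `summarize`) are hypothetical, and the `contributionsCollection` fields and the time window they cover should be checked against the schema reference linked above before relying on them.

```python
import json

# One query for per-user totals; $login is a GraphQL variable.
USER_STATS_QUERY = """
query($login: String!) {
  user(login: $login) {
    login
    contributionsCollection {
      totalCommitContributions
      totalPullRequestContributions
      totalPullRequestReviewContributions
    }
  }
}
"""


def build_payload(login):
    """Return the JSON body to POST to https://api.github.com/graphql."""
    return json.dumps({"query": USER_STATS_QUERY, "variables": {"login": login}})


def summarize(response):
    """Flatten a decoded raw response (with its "data" wrapper) into a flat dict."""
    user = response["data"]["user"]
    totals = user["contributionsCollection"]
    return {
        "user": user["login"],
        "commits": totals["totalCommitContributions"],
        "prs": totals["totalPullRequestContributions"],
        "reviews": totals["totalPullRequestReviewContributions"],
    }
```

Passing the login as a variable rather than formatting it into the query string keeps the query text constant across users, so the same document can be reused for every user being tracked.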

github-actions[bot] commented 1 year ago

Thanks for opening an issue. Would also love to know how you are using or intend to use this project. Just curious.

psschwei commented 1 year ago

> Thanks for opening an issue. Would also love to know how you are using or intend to use this project. Just curious.

oh, I think you already know :laughing:

psschwei commented 1 year ago

quick example -- pulling PRs:

import os
from gql import gql, Client
from gql.transport.aiohttp import AIOHTTPTransport
import pandas as pd

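# Select your transport with a defined url endpoint.
# The token is read from the GITHUB_TOKEN environment variable; a missing
# variable raises KeyError instead of silently sending "bearer None".
transport = AIOHTTPTransport(
    url="https://api.github.com/graphql",
    headers={"Authorization": f"bearer {os.environ['GITHUB_TOKEN']}"},
)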

# Create a GraphQL client using the defined transport
client = Client(transport=transport, fetch_schema_from_transport=True)

# Provide a GraphQL query
query = gql("""
    query {
        viewer {
            pullRequests(first: 100, orderBy: {field: UPDATED_AT, direction: DESC}) {
                nodes {
                    title
                    merged
                    mergedAt
                    repository {
                        nameWithOwner
                    }
                }
            }
        }
    }
""")

# Execute the query on the transport
result = client.execute(query)

# Build a DataFrame with one row per PR
df = pd.DataFrame(result['viewer']['pullRequests']['nodes'])
# Expand the nested repository dict into its own column, then drop the original
df = df.join(pd.json_normalize(df['repository'])).drop('repository', axis='columns')
df = df.rename(columns={"nameWithOwner": "repo"})
print(df)

snippet of output:

$ python app.py 
                                               title  merged              mergedAt                                     repo
0                      Test serverless on kubernetes   False                  None     Qiskit-Extensions/quantum-serverless
1          Drop program artifact after job scheduled    True  2023-09-01T15:06:02Z     Qiskit-Extensions/quantum-serverless
2                     Add template for django secret    True  2023-08-31T20:21:15Z     Qiskit-Extensions/quantum-serverless
3            Allow creating single-node kind cluster   False                  None                   chainguard-dev/actions
4                  Add support for docker compose v2   False                  None     testcontainers/testcontainers-python
..                                               ...     ...                   ...                                      ...
95                  Add ray system metrics dashboard    True  2023-03-24T19:13:40Z     Qiskit-Extensions/quantum-serverless
96                Output black diff to stdout for CI    True  2023-03-24T14:48:34Z     Qiskit-Extensions/quantum-serverless
97              Adds client-pkg to the release train    True  2023-02-25T14:50:31Z                          knative/release
98              Add monitoring architecture diagrams    True  2023-03-22T19:55:12Z     Qiskit-Extensions/quantum-serverless
99  Disable e2e test due to issue with action runner    True  2023-03-20T17:12:54Z  knative-extensions/kn-plugin-quickstart

[100 rows x 4 columns]
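
The query above stops at the 100 most recently updated PRs, which is the largest page size `first` accepts. Fetching more would need cursor pagination via `pageInfo`. A minimal sketch, assuming an `execute(query, variables)` callable that returns the decoded `data` dict (with the gql client above this could be a thin wrapper around `client.execute(gql(query), variable_values=variables)`); `fetch_all_prs`, `execute`, and `max_pages` are hypothetical names:

```python
# Same fields as the example above, plus a cursor variable and pageInfo.
PAGED_PRS_QUERY = """
query($cursor: String) {
  viewer {
    pullRequests(first: 100, after: $cursor,
                 orderBy: {field: UPDATED_AT, direction: DESC}) {
      nodes {
        title
        merged
        mergedAt
        repository { nameWithOwner }
      }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def fetch_all_prs(execute, max_pages=10):
    """Collect PR nodes page by page until the API reports no next page."""
    nodes, cursor = [], None
    for _ in range(max_pages):  # hard cap so a bug cannot loop forever
        page = execute(PAGED_PRS_QUERY, {"cursor": cursor})["viewer"]["pullRequests"]
        nodes.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            break
        cursor = page["pageInfo"]["endCursor"]
    return nodes
```

The returned list has the same node shape as before, so it can be passed straight to `pd.DataFrame` in place of `result['viewer']['pullRequests']['nodes']`.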

github-actions[bot] commented 10 months ago

This issue appears to be stale. Please update its status and need, or it risks being archived.