Simple Cloud Run service you can configure as a target for GitHub event webhooks to monitor repository (or organization) activity in real time.
Besides capturing event throughput metrics in Stackdriver, this service also normalizes the GitHub activity data and stores the results in an easy-to-query BigQuery table, which can be used in Google Sheets or Data Studio.
The current implementation supports the following event types:
You can customize this service to support additional event types.
Element | Type | Description |
---|---|---|
ID | string | Immutable ID of the specific WebHook delivery (important in case of duplicate WebHook submissions) |
Repo | string | Fully-qualified name of the repository (e.g. mchmarny/github-activity-counter) |
Type | string | The type of GitHub event (see Events for the complete list) |
Actor | string | GitHub username of the user who initiated the event (e.g. the PR author vs. the PR merger, which could be an automation tool like prow) |
EventAt | time | Original event time, not the WebHook processing time (except for push, which can include multiple commits) |
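Based on the elements above and the column names used in the sample query later in this README (`actor`, `type`, `event_time`), the underlying BigQuery schema presumably looks something like the sketch below. The exact field names and modes are assumptions; the real table is created by `bin/setup`.

```shell
# Hypothetical schema for the eventcounter.events table; field names are
# inferred from the element table above and the sample SQL query below.
cat > /tmp/events_schema.json <<'EOF'
[
  {"name": "id",         "type": "STRING",    "mode": "REQUIRED"},
  {"name": "repo",       "type": "STRING",    "mode": "REQUIRED"},
  {"name": "type",       "type": "STRING",    "mode": "REQUIRED"},
  {"name": "actor",      "type": "STRING",    "mode": "REQUIRED"},
  {"name": "event_time", "type": "TIMESTAMP", "mode": "REQUIRED"}
]
EOF

# sanity-check that the sketch is valid JSON (bq would reject it otherwise)
python3 -m json.tool /tmp/events_schema.json > /dev/null && echo "schema OK"
```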
If you don't have one already, start by creating a new project and configuring the Google Cloud SDK. Similarly, if you have not done so already, you will need to set up Cloud Run.
To set up this service you will:
To start, clone this repo:

```shell
git clone https://github.com/mchmarny/github-activity-counter.git
```

And navigate into that directory:

```shell
cd github-activity-counter
```
To work properly, the Cloud Run service requires a few dependencies:

- Cloud PubSub topic (`eventcounter`)
- BigQuery table (`eventcounter.events`)
- BigQuery dataset (`eventcounter`)

To create these dependencies run the `bin/setup` script:

```shell
bin/setup
```
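For reference, the provisioning that `bin/setup` performs presumably amounts to something along these lines. This is a sketch only (printed here rather than executed); the project ID is a placeholder and the exact flags may differ from the real script.

```shell
# Print (don't run) a hypothetical equivalent of bin/setup's provisioning.
PROJECT_ID="my-project"   # placeholder

commands=$(cat <<EOF
gcloud pubsub topics create eventcounter --project ${PROJECT_ID}
bq mk --dataset ${PROJECT_ID}:eventcounter
bq mk --table ${PROJECT_ID}:eventcounter.events
EOF
)
echo "$commands"
```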
In addition to the above dependencies, the `bin/setup` script also creates a dedicated service account which will be used to run the Cloud Run service. To ensure that this service is able to do only the intended tasks and nothing more, we are going to configure it with a few explicit roles:

- `run.invoker` - required to execute the Cloud Run service
- `pubsub.publisher` - required to publish events to Cloud PubSub
- `logging.logWriter` - required for Stackdriver logging
- `cloudtrace.agent` - required for Stackdriver tracing
- `monitoring.metricWriter` - required to write custom metrics to Stackdriver

Finally, to ensure that our service only accepts data from GitHub, we are going to create a secret that will be shared between GitHub and our service:
```shell
export HOOK_SECRET=$(openssl rand -base64 32)
```
The above `openssl` command creates an opaque string. If for some reason you do not have `openssl` configured, you can just set `HOOK_SECRET` to your own secret. Just don't re-use other secrets or make it too easy to guess.
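To see how this secret is used: GitHub computes an HMAC signature over each delivery's raw payload with the shared secret and sends it in the `X-Hub-Signature-256` header (value `sha256=<hex digest>`), so the receiving service can recompute the signature and reject mismatches. A quick local sketch (the payload is made up):

```shell
# Recompute the signature GitHub would send for a given payload.
HOOK_SECRET="local-test-secret"          # stand-in for your real secret
payload='{"zen":"Design for failure."}'  # made-up delivery body

sig="sha256=$(printf '%s' "$payload" | openssl dgst -sha256 -hmac "$HOOK_SECRET" | awk '{print $NF}')"
echo "$sig"
```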
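For reference, binding the roles listed earlier to the service account presumably boils down to something like the following (printed rather than executed; the project ID and service-account name are placeholders, not taken from `bin/setup`):

```shell
PROJECT_ID="my-project"                                  # placeholder
SA="eventcounter@${PROJECT_ID}.iam.gserviceaccount.com"  # hypothetical name

bindings=""
for role in run.invoker pubsub.publisher logging.logWriter cloudtrace.agent monitoring.metricWriter; do
  bindings="${bindings}gcloud projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${SA} --role=roles/${role}
"
done
echo "$bindings"
```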
Cloud Run runs container images. To build one for this service we are going to use the included Dockerfile and submit it along with the source code as a build job to Cloud Build using the `bin/image` script. You should review each of the provided scripts to understand the individual commands.

```shell
bin/image
```
Once you have configured all the service dependencies, you can deploy your Cloud Run service. To do that, run the `bin/service` script:

```shell
bin/service
```

The output of the script will include the URL by which you can access the service.
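Internally, `bin/service` presumably comes down to a `gcloud run deploy` invocation along these lines. This is a sketch (printed here rather than executed); the service name, image path, and region are assumptions, not values from the real script.

```shell
# Print (don't run) a hypothetical equivalent of bin/service's deploy step.
deploy_cmd=$(cat <<'EOF'
gcloud run deploy eventcounter \
  --image gcr.io/my-project/github-activity-counter \
  --allow-unauthenticated \
  --set-env-vars HOOK_SECRET=${HOOK_SECRET} \
  --platform managed \
  --region us-central1
EOF
)
echo "$deploy_cmd"
```

The webhook endpoint has to be reachable by GitHub, hence the unauthenticated invocation; the shared secret (see above) is what actually gates the requests.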
GitHub has good instructions on how to set up your webhook. In short, it amounts to:

1. Select `Webhooks` on the left panel
2. Click the `Add WebHook` button
3. Enter the service URL (run `bin/url` if you can't remember it) and make sure you include `/v1/github` in the POST target
4. Click `Edit` under Secret and paste your secret (run `echo $HOOK_SECRET` to print it)
5. Select `application/json` as the content type
6. Either `Send me everything` or select the individual events you want to count (see supported events)
7. Leave the `Active` checkbox checked
8. Click `Add Webhook` to save your settings

To test the setup, you can create an issue in the repo where you configured the webhook. The webhook log should indicate whether the delivery worked (response 200) or didn't.
Similarly, on the Cloud Run side, you should be able to see the logs generated by the `eventcounter` service using the service logs link, and eventually there should be new data in the BigQuery table.
There are endless ways you could analyze this data (e.g. types of activities per repo, or average activity frequency per user). Here, for example, is a SQL query for types of activities per user over the last 28 days:

```sql
SELECT actor, type, count(1) as activities
FROM eventcounter.events
WHERE event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 28 DAY)
GROUP BY actor, type
ORDER BY 3 desc
```
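The other example mentioned above, types of activities per repo, differs only in the grouping column (assuming the column is named `repo`, mirroring the element table). A sketch, captured as a string here; run it in the BigQuery console or via `bq query`:

```shell
# Variation of the query above: activity counts per repository.
repo_query=$(cat <<'EOF'
SELECT repo, type, count(1) as activities
FROM eventcounter.events
WHERE event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 28 DAY)
GROUP BY repo, type
ORDER BY 3 desc
EOF
)
echo "$repo_query"
```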
You can find a few more query samples in the queries directory.
To clean up all resources created by this sample, execute the `bin/cleanup` script:

```shell
bin/cleanup
```
This is my personal project and it does not represent my employer. I take no responsibility for issues caused by this code. I do my best to ensure that everything works, but if something goes wrong, my apologies are all you will get.