
gitlab-ci-build-statuses


Fetch the current statuses of the latest Gitlab CI pipelines for all default branches in a Gitlab group and show them on an HTML page.
Optionally, collect information about the pipeline jobs that are running at the moment and show them grouped by the runner that executes the job.

Features

UI samples

Current build statuses (GET /statuses, GET /statuses?view=plain)

*(screenshot: current build statuses)*

Current build statuses grouped by subgroups (GET /statuses?view=grouped)

*(screenshot: build statuses grouped by subgroups)*

Current running jobs (GET /jobs, if the jobs view is enabled)

*(screenshot: currently running jobs, grouped by runner)*

Usage

Configuration

The application has to be configured via environment variables; some of them are mandatory, others are optional.

The app won't start unless all mandatory configuration properties are set; a log message with the details is written before the application exits.

Run it

The most straightforward way to use this is to run the Docker image that's provided via Docker Hub.

```sh
docker run -p 8282:8282 \
  -e GCB_GITLAB_API_TOKEN=xyz \
  -e GCB_GITLAB_BASE_URL=https://example.gitlab.com \
  -e GCB_GITLAB_GROUP_ID=1 \
  l7r7/gitlab-ci-build-statuses:latest
```

API

The app exposes the following endpoints (shown in the UI samples above):

- GET /statuses, with the optional view=plain and view=grouped variants
- GET /jobs, if the jobs view is enabled

All endpoints are available under the prefix /builds as well. This is especially helpful when you deploy the app behind something like an ingress proxy where you want to have a clear prefix to do the routing.

Operating showcase

This repository includes a showcase for a docker-compose based deployment in the docker-compose directory. It contains the app itself (which you have to configure in the docker-compose file) as well as a Prometheus and a Grafana instance with a ready-to-go setup, including some dashboards that demonstrate what the app offers.

Grafana

You'll find the Grafana instance at http://localhost:3000. It includes a dashboard that looks like this:

*(screenshot: the build statuses dashboard in Grafana)*

FAQ

My Gitlab group has subgroups. Will the pipelines of projects in there be included?

Yes. Projects in subgroups will be included. The default view shows a flat list of pipeline statuses; the grouped view (see above) groups them by subgroup.

Can I do horizontal scaling by using multiple instances?

Yes. Be aware, though, that there's no shared persistence between the instances. Each instance fetches all the data it needs and stores it in memory. So it's not efficient, but the instances won't step on each other's toes (up until the point where you're essentially DDoS'ing your Gitlab instance).
I don't think it will be necessary to run more than one instance. In my experience, the performance of a single instance and the available options for vertical scaling are more than enough for medium-sized teams with a large number of projects. If you do encounter scalability issues, feel free to open an issue and let me know!

How is it possible for a project to have no default branch?

The corresponding API docs don't mention it, but if a project is empty (e.g. if it was just created), it doesn't have a default branch.

Why do you need an extra call to get the single pipeline to determine the build status?

When I built this, I found that the API is not always ideal for my needs. In my team, we use jobs that are allowed to fail (e.g. because they require manual steps, or because they run checks that shouldn't break the pipeline). I'd like the status page to make it clear whether a pipeline was successful or ended with warnings. However, if a pipeline fails with warnings, the status returned by the List Project Pipelines endpoint will be success. To get the exact status, a request to the Single Pipeline endpoint is necessary for every pipeline that seems to be successful.
There are two open issues that address this inconsistency in the API (see here and here).
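
A minimal sketch of that two-step lookup (the types and the fetch function are illustrative, not the project's actual code):

```haskell
-- Illustrative status and pipeline types
data BuildStatus = Running | Failed | Success | SuccessWithWarnings
  deriving (Eq, Show)

data Pipeline = Pipeline {pipelineId :: Int, listedStatus :: BuildStatus}

-- The first argument stands in for a call to the single-pipeline endpoint.
-- Only pipelines that the list endpoint reports as successful get re-checked;
-- that call can downgrade the result to "success with warnings".
detailedStatus :: (Int -> IO BuildStatus) -> Pipeline -> IO BuildStatus
detailedStatus fetchSinglePipeline p
  | listedStatus p == Success = fetchSinglePipeline (pipelineId p)
  | otherwise                 = pure (listedStatus p)
```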

Isn't this thing slightly over-engineered?

Yes. It is.
To be honest, the intention of this project never was to build something that's the ideal, minimal solution for the problem. I was looking for something to build with Haskell to see if I had learned enough to get this working. I learned a lot while building this, and that's what is important to me in this case.
More on the reason behind the technical details below.

Technical considerations

Here are the answers to the questions you might not have known you wanted to ask about the technical details of this project. You won't need to know any of that to use this app.

Persistence

I went with the simplest way of persisting the build statuses I could think of: it's just an IORef that stores the information in memory.
I wanted some form of persistence to avoid calling the GitLab API every time somebody wants to see the current pipeline statuses. In addition to that, it was important to me to keep the app self-contained, so something like a proper database was no option here (besides maybe SQLite).
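
A minimal sketch of that idea (the names are illustrative; the real store holds the project's own status types): one IORef holds the latest snapshot, a background updater writes it, and the HTTP handlers read it.

```haskell
import Data.IORef

-- Nothing until the first successful fetch, then always the latest snapshot
newtype Cache a = Cache (IORef (Maybe a))

newCache :: IO (Cache a)
newCache = Cache <$> newIORef Nothing

-- Called by the background thread after each fetch from the Gitlab API
writeCache :: Cache a -> a -> IO ()
writeCache (Cache ref) = writeIORef ref . Just

-- Called by the HTTP handlers to render the current statuses
readCache :: Cache a -> IO (Maybe a)
readCache (Cache ref) = readIORef ref
```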

Frontend

My initial goal here was to have a lightweight UI that works without any JS. The only real challenge was automatic refreshing, which I solved without any JavaScript by using a meta HTML tag (e.g. `<meta http-equiv="refresh" content="30">`). I like this solution because it couldn't be any simpler.

The only place that uses JS is the conversion of the timestamp of the last update into the user's browser timezone. I was tired of converting from UTC in my head, so I added the conversion. There was no other way to do it with reasonable effort, so I went for the eleven lines of JS. Fortunately, the UI is still fully functional with JS disabled; the timestamp will simply be shown in UTC.

polysemy

Polysemy is one of the effect system implementations in Haskell. It describes itself as "a library for writing high-power, low-boilerplate domain specific languages" based on "higher-order, low-boilerplate free monads". This project is the first one where I tried polysemy beyond a simple example. I think it's a cool way to abstract over (side) effects in your program, and it helps to switch implementations for testing (e.g. to provide a fixed clock in tests and a "real" one in production). In addition to that, I find it makes it easier to see what a function does: if it doesn't have a dependency on the Time effect, it can't access the clock.
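
A minimal sketch of such a Time effect with two interpreters (illustrative; the project's actual effect may be shaped differently):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeOperators #-}

import Data.Time (UTCTime, getCurrentTime)
import Polysemy

-- An effect describing "things that can read the clock"
data Time m a where
  Now :: Time m UTCTime

makeSem ''Time -- generates: now :: Member Time r => Sem r UTCTime

-- Production interpreter: ask the real clock
timeToIO :: Member (Embed IO) r => Sem (Time ': r) a -> Sem r a
timeToIO = interpret $ \case
  Now -> embed getCurrentTime

-- Test interpreter: a fixed clock, so tests are deterministic
timeToConstant :: UTCTime -> Sem (Time ': r) a -> Sem r a
timeToConstant t = interpret $ \case
  Now -> pure t
```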

If you're interested in why polysemy might be a good idea, I can recommend this talk as a starter.

Do we need that here? Probably not. The complexity of this application is low enough to manage without the help of an effect system. Also, one needs to be aware that polysemy comes at a cost: for most people it's a rather steep learning curve, and there are people who say it's not worth the effort.

servant

Servant is one of the most fascinating libraries in Haskell. The idea is to have a type-safe web API, using some advanced features of the Haskell type system. While I wouldn't consider myself an expert in Haskell's type system, I found this library to be well-documented and easy to use, at least for the simple API in this project.
Since the API is described as a Haskell type, it's well-structured and allows for very interesting things like automated generation of client functions, documentation, OpenAPI spec definitions, and mock servers. This makes the library very interesting and made me want to try it.
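
A minimal sketch of what an API like this one looks like in servant (illustrative: the real app serves HTML pages, not [String] as JSON):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}

import Data.Proxy (Proxy (..))
import Servant

-- GET /statuses (with an optional view query parameter) and GET /jobs
type StatusesApi =
       "statuses" :> QueryParam "view" String :> Get '[JSON] [String]
  :<|> "jobs" :> Get '[JSON] [String]

-- Everything is also reachable under the /builds prefix, as described above
type Api = StatusesApi :<|> "builds" :> StatusesApi

api :: Proxy Api
api = Proxy
```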

Do we need that here? The API is probably simple enough to stay on top of it without the help of a type-level API definition. Nevertheless, it adds so little overhead that it's hard to argue it's overkill. I'd choose it again, even for simple projects.

Hexagonal Architecture

I'm a big fan of this architecture pattern, and I think it works especially well combined with pure functional programming, as described here by Mark Seemann. I wanted to give it a try and implement this in Haskell. My implementation is inspired by this repository, which also has a nice explanation of how it connects to polysemy.
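
As a rough illustration of the pattern in polysemy terms (the effect and function names here are made up, not the project's actual modules): the domain core depends only on effect "ports", and the interpreters are the adapters.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeOperators #-}

import Polysemy

-- Ports: what the domain needs from the outside world
data PipelinesApi m a where
  FetchLatestStatuses :: PipelinesApi m [String]

data StatusesStore m a where
  StoreStatuses :: [String] -> StatusesStore m ()

makeSem ''PipelinesApi
makeSem ''StatusesStore

-- Domain core: knows nothing about HTTP clients or IORefs;
-- the polysemy interpreters are the adapters that fill these ports in.
updateStatuses :: (Member PipelinesApi r, Member StatusesStore r) => Sem r ()
updateStatuses = fetchLatestStatuses >>= storeStatuses
```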

Do we need that here? No, absolutely not. In a more real-world scenario (read: a scenario where a team of people works on the codebase and has to make money with it) I would argue that it's overkill and a simpler architecture is the way to go, because there's not a lot of domain logic worth protecting.
In hindsight, it wasn't a good idea, and it isn't a good example to demonstrate the power of hexagonal architecture. I might refactor towards a simpler approach at some point in the future.

logging

I was looking for a way to implement structured logging with JSON output and ended up using katip with a self-written effect and interpretation. Ideally, there would be a library that provides all of that out of the box, but I haven't found anything that was a good fit for me.
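
A minimal sketch of what such a self-written effect can look like (illustrative; the interpreter here writes to stdout for brevity, whereas the real interpretation targets katip):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeOperators #-}

import Polysemy

-- A logging effect owned by the application, not by a library
data Log m a where
  LogInfo :: String -> Log m ()

makeSem ''Log

-- Toy interpreter writing to stdout; the real one would hand the
-- message (plus structured context) to katip instead.
logToStdout :: Member (Embed IO) r => Sem (Log ': r) a -> Sem r a
logToStdout = interpret $ \case
  LogInfo msg -> embed (putStrLn msg)
```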

Do we need that here? Yes. Logging is important, and structured logging allows me to add information to the logs that helps to understand what's going on.

Configuration parsing using higher kinded data types

I first heard of this concept in this talk by Chris Penner. He also wrote a blog post about the idea, which I used as a starting point for my own implementation. I dedicated a whole Twitter thread to describing my solution.

Do we need that here? I'm not really sure. The approach really is fascinating, but it also comes at a cost: thinking in terms of higher-kinded datatypes and working with higgledy/barbies means a comparatively steep learning curve.
On the other hand, the approach has some nice advantages.

In this particular case, I think it's a nice solution for config parsing, but not the most efficient one. A more efficient approach might be to just use envy.
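
To give a flavour of the idea, here's a minimal sketch using barbies directly (higgledy builds on top of it; the field names are illustrative): every field is wrapped in a functor f, so one record type can represent both a partially parsed config (f = Maybe) and a fully validated one (f = Identity).

```haskell
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}

import Barbies
import Data.Functor.Identity (Identity (..))
import GHC.Generics (Generic)

-- One record type for all stages of config parsing
data Config f = Config
  { apiToken :: f String
  , baseUrl  :: f String
  , groupId  :: f Int
  }
  deriving (Generic, FunctorB, TraversableB)

-- A config read from the environment may have missing fields ...
type PartialConfig = Config Maybe

-- ... and validation succeeds only if every field is present
validate :: PartialConfig -> Maybe (Config Identity)
validate = btraverse (fmap Identity)
```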