stdlib-js / google-summer-of-code

Google Summer of Code resources.
https://github.com/stdlib-js/stdlib
23 stars 5 forks source link

[Idea]: developer dashboard for tracking ecosystem build failures #4

Open kgryte opened 1 year ago

kgryte commented 1 year ago

Idea

The stdlib project encompasses over 3500 repositories which are orchestrated via a centralized repository. While orchestration largely works as intended, build failures do happen, and quickly detecting and resolving build failures in standalone repositories is critical to prevent downstream breakages and ensure ecosystem integrity.

The goal of this idea is to build a developer dashboard to display in real-time standalone repository build failures. We currently have the backend database which collects build results in real-time; however, we have yet to build a frontend for viewing and analyzing such data.

The expected roadmap is as follows:

Expected Outcomes

stdlib developers will be able to navigate to a webpage and see the build status for all repositories at once.

Involved Software

This will involve building a frontend application and interfacing with a backend for querying a PostgreSQL database. We may want to try more "cutting edge" technology here, such as ESBuild, tailwind, etc.

Prerequisite Knowledge

JavaScript, Node.js, CSS, HTML, JSX.

Difficulty

Intermediate. Requires a fair amount of frontend engineering knowledge and modern frontend application development.

Project Length

175/350 hours. A skilled contributor may be able to execute on this faster. If so, scope could be expanded to include analytics and historical overviews.

Potential Mentors

@kgryte @Planeshifter @steff456

kgryte commented 4 months ago

For an idea of what a developer dashboard could look like, see https://npm.github.io/statusboard/. The idea is similar, but we'd want something specifically tailored to the needs and backend database of stdlib.

Repo: https://github.com/npm/statusboard

kgryte commented 3 months ago

Current database schema, although this is subject change:


CREATE SCHEMA stdlib_github;

Tables

Repository

Table for GitHub repository data.

CREATE TABLE stdlib_github.repository (
    id                    BIGSERIAL,
    owner                 TEXT                        NOT NULL,
    name                  TEXT                        NOT NULL,
    description           TEXT,
    url                   TEXT                        NOT NULL,
    public                SMALLINT                    NOT NULL      DEFAULT 0,
    archived              SMALLINT                    NOT NULL      DEFAULT 0,
    created_at            TIMESTAMP WITH TIME ZONE    NOT NULL,
    updated_at            TIMESTAMP WITH TIME ZONE    NOT NULL,
    archived_at           TIMESTAMP WITH TIME ZONE,
    PRIMARY KEY (id),
    CONSTRAINT unique_repository
        UNIQUE (name,owner)
);

Table for storing repository commits.

CREATE TABLE stdlib_github.commit (
    id                BIGSERIAL,
    repository_id     BIGINT                      NOT NULL,
    user_name         TEXT                        NOT NULL,
    message           TEXT                        NOT NULL,
    url               TEXT                        NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE    NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE
);

Table for storing repository tags.

CREATE TABLE stdlib_github.tag (
    id                BIGSERIAL,
    repository_id     BIGINT                      NOT NULL,
    tag               TEXT                        NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE    NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE
);

Events

Table for storing GitHub event types.

CREATE TABLE stdlib_github.event_type (
    id                SERIAL,
    name              TEXT        UNIQUE NOT NULL,
    PRIMARY KEY (id)
);

Table for storing GitHub event data.

CREATE TABLE stdlib_github.event (
    id                BIGSERIAL,
    repository_id     BIGINT                     NOT NULL,
    event_type_id     INTEGER                    NOT NULL,
    user_name         TEXT                       NOT NULL,
    user_url          TEXT                       NOT NULL,
    user_html_url     TEXT                       NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE   NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE,
    CONSTRAINT event_type_fk
        FOREIGN KEY (event_type_id)
        REFERENCES stdlib_github.event_type(id)
);

Workflows

Table for storing the results of workflow runs.

CREATE TABLE stdlib_github.workflow_run (
    id                BIGINT                     NOT NULL,
    repository_id     BIGINT                     NOT NULL,
    name              TEXT                       NOT NULL,
    event             TEXT                       NOT NULL,
    status            TEXT                       NOT NULL,
    conclusion        TEXT,
    run_number        INTEGER                    NOT NULL,
    run_attempt       INTEGER                    NOT NULL,
    url               TEXT                       NOT NULL,
    html_url          TEXT                       NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE   NOT NULL,
    updated_at        TIMESTAMP WITH TIME ZONE   NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE
);

Table for storing GitHub workflow job data.

CREATE TABLE stdlib_github.workflow_job (
    id                BIGINT                     NOT NULL,
    repository_id     BIGINT                     NOT NULL,
    workflow_run_id   BIGINT                     NOT NULL,
    name              TEXT                       NOT NULL,
    run_url           TEXT                       NOT NULL,
    url               TEXT                       NOT NULL,
    html_url          TEXT                       NOT NULL,
    status            TEXT                       NOT NULL,
    conclusion        TEXT,
    started_at        TIMESTAMP WITH TIME ZONE   NOT NULL,
    completed_at      TIMESTAMP WITH TIME ZONE,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE
);

Code Coverage

Table for storing code coverage data for each repository.

CREATE TABLE stdlib_github.coverage (
    id                BIGSERIAL,
    repository_id     BIGINT                      NOT NULL,
    workflow_run_id   BIGINT                      NOT NULL,
    statements        REAL                        NOT NULL,
    lines             REAL                        NOT NULL,
    branches          REAL                        NOT NULL,
    functions         REAL                        NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE    NOT NULL      DEFAULT NOW(),
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE,
    CONSTRAINT workflow_run_fk
        FOREIGN KEY (workflow_run_id)
        REFERENCES stdlib_github.workflow_run(id)
);

Publish

Table for storing npm publish data for each repository.

CREATE TABLE stdlib_github.npm_publish (
    id                BIGSERIAL,
    repository_id     BIGINT                      NOT NULL,
    version           TEXT                        NOT NULL,
    node_version      TEXT                        NOT NULL,
    license           TEXT                        NOT NULL,
    tarball_size      BIGINT                      NOT NULL,   
    published_at      TIMESTAMP WITH TIME ZONE    NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE
);

Download Counts

Table for storing daily npm download counts for each repository package.

CREATE TABLE stdlib_github.npm_download_count (
    id                BIGSERIAL                   NOT NULL,
    repository_id     BIGINT                      NOT NULL,
    count             BIGINT                      NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE    NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE,
    CONSTRAINT unique_created_at
        UNIQUE (repository_id,created_at)
);

Table for storing rolling 7-day average npm download counts for repository package versions.

CREATE TABLE stdlib_github.npm_rolling_version_download_count (
    id                BIGSERIAL                   NOT NULL,
    repository_id     BIGINT                      NOT NULL,
    version           TEXT                        NOT NULL,
    count             DOUBLE PRECISION            NOT NULL,
    created_at        TIMESTAMP WITH TIME ZONE    NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT repository_fk
        FOREIGN KEY (repository_id)
        REFERENCES stdlib_github.repository(id)
        ON DELETE CASCADE,
    CONSTRAINT unique_version_created_at
        UNIQUE (repository_id,version,created_at)
);