MatrixAI / Polykey

Polykey Core Library
https://polykey.com
GNU General Public License v3.0

Polykey Bot on GitHub and GitLab to present a FE to the BE's PK CI/CD usage #375

Open CMCDragonkai opened 2 years ago

CMCDragonkai commented 2 years ago

Is your feature request related to a problem? Please describe.

One of our major use cases is integrating PK into CI/CD. This would be configured as a pull-based integration from the CI/CD to the PK source of truth (SoT). It could also be push-based, but pull-based is the recommended way as per https://docs.gitlab.com/ee/ci/secrets/.

Upon doing so, this CI/CD process can trigger a PK bot to report secret usage on the PR or MR that started the process. This bot can also provide information in response to specific commands, similar to slash commands except that you @mention it.
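
To make the @mention idea concrete, here is a minimal sketch of how a comment body could be parsed into a command. The handle `polykey-bot`, the `BotCommand` shape, and the rule that the mention must start a line are all assumptions for illustration; nothing here is an existing Polykey API.

```typescript
// Hypothetical bot handle; the real handle would depend on the registered app.
const BOT_HANDLE = 'polykey-bot';

// Hypothetical command shape: a name followed by whitespace-separated arguments.
interface BotCommand {
  name: string;
  args: string[];
}

// Returns the first command addressed to the bot in a comment body, or
// undefined when no line starts with a mention followed by a command name.
function parseMention(
  body: string,
  handle: string = BOT_HANDLE,
): BotCommand | undefined {
  const mention = `@${handle}`.toLowerCase();
  for (const line of body.split(/\r?\n/)) {
    const tokens = line.trim().split(/\s+/);
    if (tokens[0]?.toLowerCase() !== mention) continue;
    const [name, ...args] = tokens.slice(1);
    if (name === undefined) continue; // A bare mention carries no command.
    return { name: name.toLowerCase(), args };
  }
  return undefined;
}
```

For example, a comment containing the line `@polykey-bot usage --since 7d` would yield the command name `usage` with arguments `--since` and `7d`.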

The purpose of this bot is to help with debugging secret token usage, and to provide visibility into and auditing of secret usage. It can help developers know which secrets are used, when and how. It can also warn when secret tokens are about to expire and must be refreshed.
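
As a rough illustration of what such a report might look like, the sketch below renders secret usage as a Markdown table for a PR/MR comment and flags secrets that are close to expiry. The `SecretUsage` record, the 7-day warning window, and the column layout are assumptions, not a defined Polykey format. Only secret names and metadata appear in the output, never secret values.

```typescript
// Hypothetical record of one secret access during a pipeline run.
interface SecretUsage {
  secretName: string;
  job: string;
  usedAt: Date;
  expiresAt?: Date; // Absent when the secret has no known expiry.
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Renders a Markdown table for a PR/MR comment. Secrets that expire within
// `warnDays` of `now` are flagged, and already-expired ones are marked.
function renderUsageReport(
  usages: SecretUsage[],
  now: Date,
  warnDays: number = 7,
): string {
  const rows = usages.map((u) => {
    let status = 'ok';
    if (u.expiresAt !== undefined) {
      const msLeft = u.expiresAt.getTime() - now.getTime();
      const daysLeft = Math.floor(msLeft / DAY_MS);
      if (daysLeft < 0) status = 'expired';
      else if (daysLeft <= warnDays) status = `expires in ${daysLeft}d`;
    }
    return `| ${u.secretName} | ${u.job} | ${u.usedAt.toISOString()} | ${status} |`;
  });
  return [
    '| Secret | Job | Used at | Status |',
    '| --- | --- | --- | --- |',
    ...rows,
  ].join('\n');
}
```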

The bot also doubles as a marketing tool. Just look at the codesee bot that is free for open source projects. It allows anybody to see that it is being used by that organisation, and it is a simple word of mouth (or word of bot?). https://github.com/MatrixAI/TypeScript-Demo-Lib-Native/pull/28#issuecomment-1136696986

The bots are basically free advertising. But it's of course useful (and can still be used internally).

Describe the solution you'd like

Such bots would have to be either OAuth apps or "GitHub Apps". Similar concepts exist in GitLab, but GitHub is more popular.

Bots have to be triggered from the project itself. The way codesee works is that it creates a GitHub Actions workflow file; when commits are pushed, the workflow downloads open-source code to execute on the source code, then sends the generated diagram/data to codesee. Alternatively, the diagram creation is done on the codesee server. Either way, this ends up triggering a comment on the PR (which could be posted from the CI/CD action or from the codesee servers).

As you can see here: https://github.com/MatrixAI/TypeScript-Demo-Lib-Native/blob/staging/.github/workflows/codesee-arch-diagram.yml, the map generation occurs on the pipeline but the presentation of the map and generation of thumbnails or smaller pngs is on the codesee server.

With respect to PK, this can be done in a decentralised manner. The PK server itself can "act" as the bot if information is sent to it. In fact we already have an OAuth app on GitHub solely for gestalt identity. It could be extended to also perform these sorts of actions.

Most of the computation would be done on the PK node itself, which would reduce the complexity required to integrate it into the CI/CD pipeline code. Otherwise we would have to give users code to run in their CI/CD, and that's just complicated. Since a pull-based integration would involve the CI/CD calling the PK node, the PK node might as well perform the PK bot's duties.
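
One way the PK node could act as the bot is by posting the comment itself once the CI/CD has called it. The sketch below only builds the HTTP request for GitHub's REST endpoint for creating an issue comment (pull requests share the issue comment endpoint); it does not send it. The `PullRequestRef` shape and how the node obtains `token` are assumptions, and no such function exists in Polykey today.

```typescript
// Hypothetical description of the PR that the CI/CD job reports to the PK node.
interface PullRequestRef {
  owner: string;
  repo: string;
  number: number;
}

// A plain description of the HTTP request, kept separate from sending it
// so that it can be inspected or tested without network access.
interface CommentRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds a request for POST /repos/{owner}/{repo}/issues/{issue_number}/comments.
// `token` is assumed to be an access token the PK node already holds.
function buildCommentRequest(
  pr: PullRequestRef,
  markdown: string,
  token: string,
): CommentRequest {
  const owner = encodeURIComponent(pr.owner);
  const repo = encodeURIComponent(pr.repo);
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/issues/${pr.number}/comments`,
    method: 'POST',
    headers: {
      Accept: 'application/vnd.github+json',
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ body: markdown }),
  };
}
```

On a runtime with a global `fetch`, the result could be sent with `fetch(req.url, req)`. Keeping the builder pure means the CI/CD side only has to tell the PK node which PR it is running for.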

Describe alternatives you've considered

An alternative to decentralisation is to have this be part of the official polykey.io. This can run centralised and then perform the necessary control of the bot. It may give us the ability to provide additional features and more marketing. It might also be easier to set up, because PK nodes are decentralised and can be up or down, so one might need to distribute configuration across a PK gestalt.

Additional context

CMCDragonkai commented 2 years ago

We should incorporate:

  1. Unit test reports that go straight to the PR comment.
  2. Benchmark reports that go straight to the PR comment, while also showing how the benchmark compares with the previous benchmark (a benchmark diff). Even more importantly, compare against all previous benchmarks. As this requires state, it could rely on the git commit history instead of storing the metrics themselves (but this requires the history to maintain the same schema) - https://github.com/MatrixAI/TypeScript-Demo-Lib/issues/54
  3. Triggers directly from comments on the PR, like re-runs or speculatively running jobs. This may require all jobs to be at the very least manual, if not auto-executed (but they will show up on the pipelines). This is a stretch goal and has security implications.
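
The benchmark diff in item 2 could be sketched as below. This compares only two runs; comparing against the full history would repeat the same step per commit. The `BenchmarkRun` shape (operations per second keyed by benchmark name) is an assumption, and reading past results from the git history is left out.

```typescript
// Hypothetical benchmark result: operations per second keyed by benchmark name.
type BenchmarkRun = Record<string, number>;

interface BenchmarkDelta {
  name: string;
  previous: number | undefined;
  current: number;
  // Percentage change relative to the previous run; undefined when there
  // is no usable previous value (missing or zero).
  changePct: number | undefined;
}

// Compares the current run against the previous one, sorted by name so
// that the rendered report is stable between pipeline runs.
function diffBenchmarks(
  previous: BenchmarkRun,
  current: BenchmarkRun,
): BenchmarkDelta[] {
  return Object.entries(current)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, cur]) => {
      const prev = previous[name];
      const changePct =
        prev === undefined || prev === 0
          ? undefined
          : ((cur - prev) / prev) * 100;
      return { name, previous: prev, current: cur, changePct };
    });
}
```

A benchmark that exists only in the current run is still listed, with `changePct` left undefined, so newly added benchmarks show up in the PR comment instead of being silently dropped.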

CMCDragonkai commented 1 year ago

It seems that there are some new services that are primarily accessed through a Discord bot or even a WeChat bot. These bot apps are bootstrapped on top of existing conversational interfaces. We can see that GitHub/GitLab issue threads, Slack and the like are conversational interfaces, and bots enable integration into them.