crate / cratedb-toolkit

CrateDB Toolkit, an SDK for CrateDB and CrateDB Cloud.
https://cratedb-toolkit.readthedocs.io/
GNU Affero General Public License v3.0

Build self-contained native binaries #160

Open amotl opened 3 months ago

amotl commented 3 months ago

About

Should we use traditional PyInstaller to build self-contained native binaries? Alternatively, we could try Briefcase or PyApp.

ctk.exe, anyone?

References

amotl commented 2 months ago

Introduction

We discussed how to split subsets of CrateDB Toolkit's functionality into dedicated release artefacts, which do not necessarily need to follow the development iterations and release cadence of Toolkit.

Most prominently, this use case came up with GH-88 and GH-153. We are sharing our thoughts here about a first approach to the topic, within a monorepo setup.

Proposal

This is our first proposal, based on preliminary discussions around needs, requirements, and their possible benefits or pitfalls.

Driver

Publisher

In order to detach from the regular cadence and relevant build processes, we proposed to:

amotl commented 2 months ago

Hi there,

I appreciate the idea of using setuptools extra labels to slice the distribution by defining a subset of dependencies. I think it is absolutely the right choice to use that mechanism for that very purpose.

On the other hand, as discussed, PyInstaller apparently expects an output name for the binary executable. I strongly believe this name is orthogonal to the dependency selection process, and should be conveyed to the "driver" through a separate variable.

To tie both together, I would be fine if the "driver" synthesized the output name from a single unique label/tag, e.g. by mapping cfr => cratedb-cfr-{version}.exe, or similar.
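To illustrate the idea, here is a minimal sketch of such a mapping. The names `BuildSpec` and `spec_from_label` are hypothetical and not part of CrateDB Toolkit, and the version string in the usage example is a placeholder. It only shows how a single label could yield both the extra for dependency selection and a separate output name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildSpec:
    """Hypothetical container that keeps both concerns apart."""

    extra: str  # setuptools extra that selects the dependency subset
    name: str   # output name of the binary, conveyed separately


def spec_from_label(label: str, version: str, windows: bool = False) -> BuildSpec:
    """Derive extra and output name from a single label, e.g. "cfr"."""
    # Assumption: only Windows builds carry the ".exe" suffix.
    suffix = ".exe" if windows else ""
    return BuildSpec(extra=label, name=f"cratedb-{label}-{version}{suffix}")


# Usage with a placeholder version number.
spec = spec_from_label("cfr", "1.2.3", windows=True)
print(spec.extra, spec.name)
```

The driver would then pass `spec.extra` to the dependency installation step and `spec.name` to the packaging step, without either depending on the other.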

Just sharing my humble thoughts on this matter. Maybe they are worth considering, otherwise please »go ahead« ;].

With kind regards, Andreas.

amotl commented 2 months ago

Status Update: GHCR is not suitable

It looks like GHCR is not suitable for hosting and distributing standalone artefacts of arbitrary nature. GHCR is a registry and provider for OCI images. In the case of Homebrew, for example, the brew installer program apparently unwraps those images in order to extract its "bottle" packages from them again.

-- https://github.blog/2021-06-21-github-packages-container-registry-generally-available/

In that spirit, it is not suitable for our use case, and we need to find a different solution. Maybe JFrog, or maybe we just put the files onto our HTTP server, like we do with the cratedb-prometheus-adapter and others, such as the standalone version of crash? https://build.opensuse.org/ could also be an option, but it might be too focused on building for Linux to be an adequate generic solution for distributing binaries of arbitrary nature.

We will continue on this topic next week, and will be happy about any suggestions.

amotl commented 2 months ago

A Build Matrix and Upload to GitHub Workflow Artifacts

Hi again. As a start, I've expanded @seut's patch (thanks!) with 7fb9df3bb, adding a corresponding GitHub Actions workflow recipe that defines a build matrix and invokes poe build-cfr to build the relevant artifacts and publish them to GitHub Workflow Artifacts.

Example

Workflow: https://github.com/crate-workbench/cratedb-toolkit/actions/runs/9826830191


amotl commented 2 months ago

Backlog

Building upon this, we can think about other build and publishing destinations, procedures, and cycles in subsequent iterations.

@hammerhead was quick to spot that the current procedure is not sufficiently sustainable yet, see https://github.com/crate/cratedb-guide/pull/55#discussion_r1668404605:

What is the approach to keep these links to release bundles up-to-date? I noticed it currently links to a specific GitHub Action run. Is there a possibility to have the artifacts as part of the regular release assets (https://github.com/crate-workbench/cratedb-toolkit/releases), so updating it here is just a matter of keeping it in sync with the latest cratedb-toolkit version number?

Yes, we need to improve the release and publishing procedure. @seut and I discussed it already, but we did not want to block the current minimal implementation iteration because of other obligations.
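As a rough sketch of what @hammerhead suggests: if the binaries were attached to regular releases, a download link could be derived from the version number alone. The helper `release_asset_url` is hypothetical, and both the "v"-prefixed tag name and the asset file name are assumptions. Only GitHub's general `releases/download/<tag>/<asset>` URL layout is taken as given.

```python
def release_asset_url(
    version: str,
    asset: str,
    repo: str = "crate-workbench/cratedb-toolkit",
) -> str:
    """Build a download URL for a release asset from a version number."""
    # Assumption: release tags are the version prefixed with "v".
    tag = f"v{version}"
    return f"https://github.com/{repo}/releases/download/{tag}/{asset}"


# Usage with placeholder version and asset name.
print(release_asset_url("1.2.3", "cratedb-cfr-1.2.3.exe"))
```

With such a scheme, keeping documentation links current would only require bumping the version number.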

NB: By default, GitHub Workflow Artifacts are retained for 90 days. So, we need to improve this within the next three months. I think it is feasible.