dogecoin / docker

Dogecoin Core on Docker
MIT License

Repository Architecture : image base distribution and architecture #7

Closed AbcSxyZ closed 2 years ago

AbcSxyZ commented 2 years ago

Following up on the comments in PR https://github.com/dogecoin/docker/pull/2#discussion_r748547687 about which distribution and architecture the Dockerfile should use.

I suggest we discuss the architecture of the repository to see how we could handle versions, cross-platform portability and distributions, so that we clarify things and move in a common direction.

Cross-platform portability

I wasn't sure at first, but Docker does not abstract the hardware: containers share the host kernel, and Docker only sets up a predefined software layer on top of it.

ARM and other specific architectures need some tricks to be handled. Docker and Docker Hub provide two ways to manage cross-platform images: the simplest is docker buildx; the alternative is to manually create a manifest from multiple published images/tags using docker manifest. See the Docker blog.

docker buildx does not seem to work for me: the x86_64 executable cannot launch on ARM, even with qemu emulation, and fails with an exec format error. I'm not sure why.

Alternatively, it is possible to build a root manifest from multiple tags, which gives a single version endpoint for all architectures. One tag is pushed per architecture, each with the appropriate executable, and then everything is grouped under the same tag. For now, this is what I have.
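
To make the manifest approach concrete, here is a minimal dry-run sketch that only prints the docker manifest commands involved, assuming one tag per architecture has already been pushed. The image name example/dogecoin, the tag suffixes and the architecture list are hypothetical placeholders, not names used in this repository.

```shell
#!/bin/sh
# Dry-run sketch: print the docker manifest commands for one version.
# The image name and the architecture list are hypothetical placeholders.
manifest_commands() {
    image="$1"; version="$2"; shift 2
    refs=""
    for arch in "$@"; do
        refs="$refs $image:$version-$arch"
    done
    # Group the per-architecture tags under a single multi-arch tag.
    echo "docker manifest create $image:$version$refs"
    for arch in "$@"; do
        # Record which platform each per-architecture image targets.
        echo "docker manifest annotate $image:$version $image:$version-$arch --os linux --arch $arch"
    done
    # Publish the manifest list to the registry.
    echo "docker manifest push $image:$version"
}

manifest_commands example/dogecoin 1.14.5 amd64 arm64
```

Piping the output to sh would run the commands for real; printing them first makes it easy to review what would be published.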

You can find the images created with both methods on my Docker Hub.

So I'm not certain which path is right, but docker manifest may be worth exploring. For now, I don't know whether we have to create a folder for each architecture, or how to publish all the Dockerfiles. What is your current opinion?

Providing multiple distributions

I'm not clear about the benefit of providing multiple distros. I know about alpine for its size, and @patricklodder was talking about upgrades:

Even if we were to choose to just support 1 distro (eg: debian) then there's still the issue that the Dogecoin Core release cycles are not at all in sync with distro release cycles. Since the goal must be to make production images (otherwise we may as well just keep all this in user repos and not care) we will have to think a bit about upgrade policies. You don't want to force updates cross distro release but instead enable custom security policies, the former will just lead to no one using the images we are proposing here at some point. I think that we'll have to maintain, say, dogecoin:1.14.6-bullseye-slim and dogecoin:1.14.6-bookworm-slim concurrently in the most minimal of supported cases.

Do you want to support at least two successive versions of a distribution, so that users can select an image for a potential migration? I agree that offering alpine or others could be great, but it may be wise to limit ourselves, at least for maintenance reasons, and expand later depending on demand.
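
As a small illustration of what maintaining two distro releases concurrently means for tags, here is a sketch that composes one tag per distro release for a given Dogecoin Core version. The function name image_tags is hypothetical; the tag format follows the dogecoin:1.14.6-bullseye-slim example quoted above.

```shell
#!/bin/sh
# Sketch: list the image tags to maintain for one Dogecoin Core version.
# image_tags is a hypothetical helper, not part of this repository.
image_tags() {
    version="$1"; shift
    for distro in "$@"; do
        echo "dogecoin:$version-$distro"
    done
}

image_tags 1.14.6 bullseye-slim bookworm-slim
```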

xanimo commented 2 years ago

With regard to alpine, I already have most if not all of the development done and just need to organize it. I would prefer moving in this direction, as the snyk vuln scanner (i know, i know lol) shows no known CVEs, whereas ubuntu/debian et al are chock-full of them.

And just a comment on buildx: it's a pain to get set up, but it does have some pretty nice features.

patricklodder commented 2 years ago

@AbcSxyZ Thanks for this!

Re: buildx or manifest.

I agree that buildx is simple. The simplicity will come at the cost of having to do custom binaries rather than the gitian-built ones and I feel that that's risky to publish, even if we use depends, because it'll be hard to make it reproducible. Scanners will help with system libraries but not with Dogecoin Core itself unless someone publishes a CVE (and then it's too late.) Maybe we can do custom gitian builds with our own descriptors but I feel a bit that that beats the "simplicity" argument.

I'll also spend some play time with manifest. Let's compare notes after we got some play time in.

Re: Providing multiple distributions

The biggest difference between releasing a piece of software and releasing a container is that the latter includes system software. In many of the organizations I've worked with, system software has policies applied to it in terms of update frequency, but also of when to start using a version. Many orgs do not want to be on the bleeding edge. This is especially important to take note of in containerized environments because it's super easy to create a container from ubuntu:latest or, on the other side of the spectrum, ubuntu:trusty. Therefore, I think that if you want to do a serious production release, it benefits from supporting multiple policies (this is also why in the past I have warned people that maintaining a production docker image is hard work, and why we now have this separate repository.)

Long story short: we cannot just pick the latest and run with only that. I think it's important to have some (minimal) process around lifecycle management. Since the amount of code managed in this repo should be significantly less than dogecoin/dogecoin, we could probably pull that off with relative ease, at least as it pertains to distro versions.

Re: alpine / debian.

I definitely - long-term - see the need for having at least a debian and an alpine version, but that doesn't have to be realized right this moment. We don't have precompiled musl releases (yet), so there's time to figure this out. Personally though, I run zero debian-based containers in production and only use alpine, because less stuff in the image means less vulnerability monitoring, but there's also the case to make that glibc has more eyes on it than musl... so it's really a matter of preference.

Proposal

Right now, let's make one really good debian image for x86_64 (with 1.14.5, because vulns), but plan ahead a bit on the tooling and structure that we use. It's easier to refactor back from multi-image/multi-version/multi-arch than to refactor into it. But we can tune.

xanimo commented 2 years ago

I agree that we should start small and focus on one image build atm. However, for the future, I propose that if we target multi-arch, we stay in line with docker-library's bashbrew oci-platform.go and use https://github.com/opencontainers/image-spec/blob/v1.0.0/image-index.md as the standard:

    "amd64":    {OS: "linux", Architecture: "amd64"},
    "arm32v5":  {OS: "linux", Architecture: "arm", Variant: "v5"},
    "arm32v6":  {OS: "linux", Architecture: "arm", Variant: "v6"},
    "arm32v7":  {OS: "linux", Architecture: "arm", Variant: "v7"},
    "arm64v8":  {OS: "linux", Architecture: "arm64", Variant: "v8"},
    "i386":     {OS: "linux", Architecture: "386"},
    "mips64le": {OS: "linux", Architecture: "mips64le"},
    "ppc64le":  {OS: "linux", Architecture: "ppc64le"},
    "riscv64":  {OS: "linux", Architecture: "riscv64"},
    "s390x":    {OS: "linux", Architecture: "s390x"},

    "windows-amd64": {OS: "windows", Architecture: "amd64"},

This can be scripted as follows, depending of course on which platforms dogecoin supports and has valid releases for:

# BuildKit/buildx sets TARGETPLATFORM; declare it so RUN can read it.
# DOGECOIN_VERSION is assumed to be set earlier via ARG or ENV.
ARG TARGETPLATFORM
# Map the Docker platform to the triple used in release file names, then download.
RUN set -ex \
  && if [ "${TARGETPLATFORM}" = "linux/amd64" ]; then export TARGETPLATFORM=x86_64-linux-gnu; fi \
  && if [ "${TARGETPLATFORM}" = "linux/arm64" ]; then export TARGETPLATFORM=aarch64-linux-gnu; fi \
  && if [ "${TARGETPLATFORM}" = "linux/arm/v7" ]; then export TARGETPLATFORM=arm-linux-gnueabihf; fi \
  && wget https://github.com/dogecoin/dogecoin/releases/download/v${DOGECOIN_VERSION}/dogecoin-${DOGECOIN_VERSION}-${TARGETPLATFORM}.tar.gz
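
The same mapping can be sketched as a standalone shell function that fails loudly on a platform without a matching release, instead of attempting a download with an untranslated name. This is only a sketch: platform_to_triple is a hypothetical helper, the three triples are the ones listed above, and accepting linux/arm64/v8 as an alias is an assumption.

```shell
#!/bin/sh
# Sketch: translate a Docker TARGETPLATFORM value into a release triple.
# platform_to_triple is a hypothetical helper name.
platform_to_triple() {
    case "$1" in
        linux/amd64)                echo "x86_64-linux-gnu" ;;
        linux/arm64|linux/arm64/v8) echo "aarch64-linux-gnu" ;;
        linux/arm/v7)               echo "arm-linux-gnueabihf" ;;
        *)
            # No known release for this platform: report it and fail the build.
            echo "unsupported platform: $1" >&2
            return 1
            ;;
    esac
}

platform_to_triple linux/arm/v7
```

Failing on unknown platforms keeps a multi-arch build from silently publishing an image without a usable binary.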

To clarify further, since I'm thinking about directory layout: I think we should have dogecoin release versions as top-level directories, with each supported architecture inside them, and the supported operating systems inside each architecture. Let me know if this sounds reasonable when the time comes.

With regard to maintenance, I think it's obviously daunting since this is the beginning, but ultimately all of this should be fully automated. For instance, each push to dogecoin/dogecoin.git could trigger a CI pipeline to docker hub that automatically builds and releases new images for the specified branch. Of course, setting up this repository first is more pressing, but you get the idea. Anyway, just some thoughts.

AbcSxyZ commented 2 years ago

I was looking into how other C/C++ software is published on Docker Hub. I went through the images maintained by the Docker community, php in particular.

What if the answer isn't to use executables from the release, but to build from sources and to use buildx to manage architectures ?

That could let us end up with a single folder per version, with subfolders for distributions, plus an extra script to generate the files for each release. It would give a more common structure like this:

.
└── version
    ├── distro1
    │   └── Dockerfile
    └── distro2
        └── Dockerfile
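
The "extra script" that generates the files for each release could look roughly like the sketch below. It writes a placeholder Dockerfile under version/distro for each distro given. The function name generate_dockerfiles, the two-line Dockerfile template and the distro list are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: generate <root>/<version>/<distro>/Dockerfile for each distro.
# generate_dockerfiles and the Dockerfile template are hypothetical.
generate_dockerfiles() {
    root="$1"; version="$2"; shift 2
    for distro in "$@"; do
        dir="$root/$version/$distro"
        mkdir -p "$dir"
        # Placeholder template: only the base image and the version differ.
        cat > "$dir/Dockerfile" <<EOF
FROM debian:$distro
ENV DOGECOIN_VERSION=$version
EOF
    done
}

# Example run in a scratch directory.
out="$(mktemp -d)"
generate_dockerfiles "$out" 1.14.5 bullseye-slim bookworm-slim
find "$out" -name Dockerfile | sort
```

A real template would carry the download and verification steps; the point here is only that one script can produce every version/distro folder.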

I pushed a cross-platform image built from an amd64 machine using buildx, available here, with the following Dockerfile:

FROM debian:buster

RUN apt update && apt install -y dpkg-dev

RUN dpkg-architecture --query DEB_BUILD_GNU_TYPE > /architecture

ENTRYPOINT ["bash"]

By downloading the image on my laptop and on a Raspberry Pi, I get the following results:

AMD output for architecture file: x86_64-linux-gnu
ARM output: arm-linux-gnueabihf

We could use depends and normal build step to manage cross-platform configuration. Feasible, right ?
The build can be long, but people will mainly use the pre-built image downloaded from Docker Hub, so they would only have to wait for the download of an image of X MB. And GitHub Actions would take care of the build.

patricklodder commented 2 years ago

What if the answer isn't to use executables from the release, but to build from sources and to use buildx to manage architectures ?

We could use depends and normal build step to manage cross-platform configuration. Feasible, right ?

Hard no. You're proposing to circumvent all binary security and replace it with trust. We have a trustless build process for a reason: so that no one can cheat people out of their money. This is important because people will run wallets on these containers.

However: I still like your structure! If a configure script can figure out the host and then compile, so can we, and then download the binary. I'll work on this - we can have security AND multi-platform Dockerfiles ❤️
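
One way to picture "download the binary, but keep it verifiable" is the sketch below: build the release URL from the version and the detected triple, then refuse to continue unless the downloaded file matches a pinned SHA-256 digest. The helper names release_url and verify_sha256 are hypothetical, the URL pattern is the one quoted earlier in this thread, and where the expected digest comes from is left open. A checksum check alone does not replace verifying the gitian signatures.

```shell
#!/bin/sh
# Sketch: build a release URL and verify a downloaded file against a pinned digest.
# release_url and verify_sha256 are hypothetical helper names.
release_url() {
    # $1 = Dogecoin Core version, $2 = release triple (e.g. x86_64-linux-gnu)
    echo "https://github.com/dogecoin/dogecoin/releases/download/v$1/dogecoin-$1-$2.tar.gz"
}

verify_sha256() {
    # $1 = file to check, $2 = expected hex digest; succeeds only on an exact match.
    file="$1"; expected="$2"
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

release_url 1.14.5 aarch64-linux-gnu
```

In a Dockerfile RUN step this would be chained as download, then verify_sha256, so that a mismatch stops the build before anything is unpacked.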

AbcSxyZ commented 2 years ago

Fine, it should be doable to download the right release during this step. Looks promising!

I think once we have done the following, it can be pretty nice:

Personally, I think I will work on documentation next while you're working on architectures.

AbcSxyZ commented 2 years ago

This bitcoin image, from ruimarinho/bitcoin-core, does this kind of thing. I finally understand the trick with TARGETPLATFORM mentioned by @xanimo...

AbcSxyZ commented 2 years ago

Somehow achieved. Cross-architecture builds are handled by https://github.com/dogecoin/docker/pull/14; distribution choice is in progress and should go smoothly.