cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0

Docker in the Cosmos SDK #11688

Closed faddat closed 1 year ago

faddat commented 2 years ago

Hi, I've been working on the SDK directly lately, and it's been fun. But the Docker situation has not been fun at all.

I shall opine now.

A few years back, it became fashionable, though not effective, to minimize Docker containers in every possible case, culminating in the use of so-called distroless containers.

Maybe the distroless containers are fine as execution environments, but I'm not even too sure about that.

I'd like to get feedback on several options, and suggest the one that I prefer:

1) Remove all Dockerfiles: this is mean to users and we should not do it

2) Have a single Dockerfile: This is my preference. I've surveyed the containers and we could easily have one Dockerfile. This dockerfile could work for external repositories, like IAVL and tm-db, too. It could have cleveldb and rocksdb preinstalled, and we could generate it frequently.

3) Have many Dockerfiles and dependencies on Dockerfiles that aren't in this repo: this is mean to users, and we should not do it

Let's do versioning

We should always use the latest versions of Go and the container OS (we mainly use Alpine). For example, if we wanted to start a Docker container, the Dockerfile would begin:

FROM golang:alpine

RUN apk add --no-cache rocksdb leveldb git python3

...and so on

This way, the OS in our containers would always be up to date, as would the Go programming language. If a build breaks because of versioning, like RocksDB would have when v7 was released, that should be a signal to us that we need to fix something, instead of specifying potentially obsolete software.

The container image

I figure that we should default to either Alpine (this isn't my preference because no one uses musl in real life) or https://hub.docker.com/r/faddat/archlinux, which is my preference: Arch is a rolling-release distribution that has the latest security and stability patches applied almost immediately, almost all the time, and it uses glibc, which is what people actually use.

The disadvantage of faddat/archlinux is that I made it, which is somewhat sus, compared to alpine or ubuntu.

we should not use ubuntu for containers

The fact that Ubuntu thinks it's alright to ship years-old releases of Go in its package manager says it all.

If colleagues could please opine-- I'd love to reduce the SDK to a single Dockerfile located in the repository root, with all relevant dependencies in it, and it is fine by me if that's alpine based or based on faddat/archlinux.

but sir, why not just use arch

Sadly, Arch does not ship ARM builds in any official manner. faddat/archlinux combines Arch Linux and Arch Linux ARM to provide a multiarch build environment.

Let's drop docker hub

Using GHCR (ghcr.io, the GitHub Container Registry) would allow us to have the images right here, next to the code, and I think that is better.
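For anyone who hasn't pushed to GHCR before, here is a rough sketch of what it involves. The helper name `ghcr_ref`, the `cosmos/cosmos-sdk` image path, and the `GHCR_USER` / `GHCR_TOKEN` variables are all placeholders made up for illustration, not something that exists in the repo. The only real constraint encoded here is that image repository names must be lowercase.

```shell
#!/bin/sh
# Hypothetical helper: build a ghcr.io image reference from owner, name, and tag.
# Image repository names must be lowercase, so owner and name are lowercased.
ghcr_ref() {
  owner=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  name=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  # The tag defaults to "latest" when no third argument is given.
  printf 'ghcr.io/%s/%s:%s\n' "$owner" "$name" "${3:-latest}"
}

# Usage (needs Docker and a token allowed to write packages; the variables
# GHCR_USER and GHCR_TOKEN are placeholders):
#   echo "$GHCR_TOKEN" | docker login ghcr.io -u "$GHCR_USER" --password-stdin
#   docker build -t "$(ghcr_ref cosmos cosmos-sdk)" .
#   docker push "$(ghcr_ref cosmos cosmos-sdk)"
ghcr_ref Cosmos Cosmos-SDK
```

Inside a GitHub Actions workflow the login step would normally use the workflow's own token rather than a personal one, but that depends on how the repo's package permissions are set up.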

faddat commented 2 years ago

Another example of using arch to save time and effort while testing against the latest versions of relevant software

https://github.com/osmosis-labs/tm-db/blob/e201b403aaafc222d69c92a4d22c9dfd193ae251/.github/workflows/ci.yml#L26

faddat commented 2 years ago

Dockerfile reduced to a single line:

https://github.com/osmosis-labs/tm-db/blob/e5f705e62dc2ffbc98dca34df86b36445b07220b/tools/Dockerfile

alexanderbez commented 2 years ago

ACK to:

faddat commented 2 years ago

Ack your ack on 2/3 items. Let's compare an Alpine Dockerfile to an Arch Dockerfile to a Debian Dockerfile, keeping in mind that no one uses musl, and that musl does cause problems.

alpine

# Simple usage with a mounted data directory:
# > docker build -t simapp .
#
# Server:
# > docker run -it -p 26657:26657 -p 26656:26656 -v ~/.simapp:/root/.simapp simapp simd init test-chain
# TODO: need to set validator in genesis so start runs
# > docker run -it -p 26657:26657 -p 26656:26656 -v ~/.simapp:/root/.simapp simapp simd start
#
# Client: (Note the simapp binary always looks at ~/.simapp we can bind to different local storage)
# > docker run -it -p 26657:26657 -p 26656:26656 -v ~/.simappcli:/root/.simapp simapp simd keys add foo
# > docker run -it -p 26657:26657 -p 26656:26656 -v ~/.simappcli:/root/.simapp simapp simd keys list
# TODO: demo connecting rest-server (or is this in server now?)
FROM golang:alpine AS build-env

# Install minimum necessary dependencies
ENV PACKAGES curl make git libc-dev bash gcc linux-headers eudev-dev python3
RUN apk add --no-cache $PACKAGES

# Set working directory for the build
WORKDIR /go/src/github.com/cosmos/cosmos-sdk

# Add source files
COPY . .

# install simapp, remove packages
RUN make build-linux

# Final image
FROM alpine:edge

# Install ca-certificates
RUN apk add --update ca-certificates
WORKDIR /root

# Copy over binaries from the build-env
COPY --from=build-env /go/src/github.com/cosmos/cosmos-sdk/build/simd /usr/bin/simd

EXPOSE 26656 26657 1317 9090

# Run simd by default, omit entrypoint to ease using container with simcli
CMD ["simd"]

debian

# This file defines the container image used to build and test tm-db in CI.
# The CI workflows use the latest tag of tendermintdev/docker-tm-db-testing
# built from these settings.
#
# The jobs defined in the Build & Push workflow will build and update the image
# when changes to this file are merged.  If you have other changes that require
# updates here, merge the changes here first and let the image get updated (or
# push a new version manually) before PRs that depend on them.

FROM golang:1.17-bullseye AS build

ENV LD_LIBRARY_PATH=/usr/local/lib

RUN apt-get update && apt-get install -y --no-install-recommends \
    libbz2-dev libgflags-dev libsnappy-dev libzstd-dev zlib1g-dev \
    make tar wget

FROM build AS install
ARG LEVELDB=1.20
ARG ROCKSDB=7.0.3

# Install cleveldb
RUN \
  wget -q https://github.com/google/leveldb/archive/v${LEVELDB}.tar.gz \
  && tar xvf v${LEVELDB}.tar.gz \
  && cd leveldb-${LEVELDB} \
  && make \
  && cp -a out-static/lib* out-shared/lib* /usr/local/lib \
  && cd include \
  && cp -a leveldb /usr/local/include \
  && ldconfig \
  && cd ../.. \
  && rm -rf v${LEVELDB}.tar.gz leveldb-${LEVELDB}

# Install Rocksdb
RUN \
  wget -q https://github.com/facebook/rocksdb/archive/v${ROCKSDB}.tar.gz \
  && tar -zxf v${ROCKSDB}.tar.gz \
  && cd rocksdb-${ROCKSDB} \
  && DEBUG_LEVEL=0 make -j4 shared_lib \
  && make install-shared \
  && ldconfig \
  && cd .. \
  && rm -rf v${ROCKSDB}.tar.gz rocksdb-${ROCKSDB}

Arch

# This file defines the container image used to build and test tm-db in CI.
# The CI workflows use the latest tag of tendermintdev/docker-tm-db-testing
# built from these settings.
#
# The jobs defined in the Build & Push workflow will build and update the image
# when changes to this file are merged.  If you have other changes that require
# updates here, merge the changes here first and let the image get updated (or
# push a new version manually) before PRs that depend on them.

FROM faddat/archlinux

RUN pacman -Syyu --noconfirm rocksdb leveldb go git base-devel

These do the same thing, except that the Arch image will stay up to date and has all of the tooling needed to use any database we support. If we wanted to add Rust for some reason, it would look like:

# This file defines the container image used to build and test tm-db in CI.
# The CI workflows use the latest tag of tendermintdev/docker-tm-db-testing
# built from these settings.
#
# The jobs defined in the Build & Push workflow will build and update the image
# when changes to this file are merged.  If you have other changes that require
# updates here, merge the changes here first and let the image get updated (or
# push a new version manually) before PRs that depend on them.

FROM faddat/archlinux

RUN pacman -Syyu --noconfirm rocksdb leveldb go git base-devel rust

I did remove cruft from both Arch examples, like:

ENV LD_LIBRARY_PATH=/usr/local/lib

because it is not necessary. And when Go upgrades, the container does, too.

alexanderbez commented 2 years ago

I'm fine with using Arch, tbh this isn't my area of expertise. But is there a reason why we'd depend on a fork or custom image (yours)? Is there not a canonical or official arch distro image?

faddat commented 2 years ago

Awesome-- ok so, yeah this part is less than ideal:

Arch Linux is two separate projects:

https://archlinuxarm.org - ARM
https://archlinux.org/ - AMD64

and this repo:

https://github.com/faddat/archlinux-docker

grabs both the AMD64 and ARM filesystem snapshots, and then combines them into a multiarch docker manifest so that when we're using it, we do not need to do gymnastics like:

https://github.com/CosmWasm/wasmd/blob/57ead1ade3d4ce7c6c510977ea6dfdbab0e6c970/Dockerfile#L4

Instead you just tell which system architectures you're targeting:

https://github.com/cosmos/iavl/blob/2f3f512a36bb075cdc28b6ab7999b5c5b9c24bfe/.github/workflows/docker.yml#L39
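For anyone not using a GitHub workflow like the one linked above, roughly the same idea from a plain shell might look like the sketch below. It assumes Docker with the buildx plugin is installed; the helper name `buildx_cmd` and the image name are made up for illustration. The helper only prints the command so it can be inspected before running it.

```shell
#!/bin/sh
# Hypothetical helper: print a multi-platform `docker buildx build` command.
# First argument is the image reference; the rest are target platforms.
buildx_cmd() {
  image="$1"
  shift
  # Join the remaining arguments with commas, as --platform expects.
  platforms=$(IFS=,; echo "$*")
  printf 'docker buildx build --platform %s -t %s --push .\n' "$platforms" "$image"
}

# Example: one build targeting both amd64 and arm64 (image name is a placeholder).
buildx_cmd ghcr.io/example/build-env:latest linux/amd64 linux/arm64
```

This only works without per-architecture branching if the base image in the Dockerfile is itself published as a multiarch manifest, which is the point of combining the two Arch snapshots.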

The Docker image and build env can stay exactly the same from cloud to metal to embedded.

https://hub.docker.com/r/faddat/archlinux

It's proven fairly popular, and the image builds every hour in Github:

https://github.com/faddat/archlinux-docker/actions

If there are single-point-of-failure (me) maintenance concerns, we could fork it into either the Notional or Cosmos org, too.

If it weren't for the musl issue, I'd likely prefer Alpine because of the larger user base, but I've seen musl cause unpleasantness time and again so I'd rather just avoid it.
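If it helps when comparing images, here is a small sketch for checking which libc a given container actually ships. It relies on the first lines of `ldd --version` output, which mention "musl" on musl systems and "GNU libc" or "GLIBC" on glibc systems; the function names are made up for illustration, and anything unrecognized is reported as unknown rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: classify a libc from the text printed by `ldd --version`.
classify_libc() {
  case "$1" in
    *musl*) echo musl ;;
    *GLIBC*|*"GNU libc"*) echo glibc ;;
    *) echo unknown ;;
  esac
}

# Run ldd inside the current environment and classify its output.
# musl's ldd prints its banner to stderr, so stderr is captured as well.
detect_libc() {
  classify_libc "$(ldd --version 2>&1 | head -n 2)"
}

detect_libc
```

Running this inside an Alpine-based image and inside an Arch-based or Debian-based image should report different results, which makes it a quick sanity check when a binary built in one image misbehaves in another.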

alexanderbez commented 2 years ago

SGTM!

faddat commented 2 years ago

Stage two

Decide whether or not to keep all of base-devel:

db-5.3.28-5  diffutils-3.8-1  gc-8.2.0-3  gdbm-1.23-1  guile-2.2.7-2  hwloc-2.7.1-1  icu-71.1-1  jemalloc-1:5.2.1-6  libelf-0.186-5  libisl-0.24-4  libmpc-1.2.1-2  libnsl-2.0.0-2  libpciaccess-0.16-2  libseccomp-2.5.4-1  liburing-2.1-1  libxml2-2.9.13-2  mpfr-4.1.0.p13-2  perl-5.34.1-1  perl-error-0.17029-3  perl-mailtools-2.21-5  perl-timedate-2.33-3  shadow-4.11.1-1  snappy-1.1.9-2  tar-1.34-1  tbb-2021.5.0-1  util-linux-2.38-1  autoconf-2.71-1  automake-1.16.5-1  binutils-2.38-4  bison-3.8.2-4  fakeroot-1.28-1  file-5.41-1  findutils-4.9.0-1  flex-2.6.4-3  gawk-5.1.1-1  gcc-11.2.0-4  gettext-0.21-2  git-2.36.0-1  go-2:1.18.1-1  grep-3.7-1  groff-1.22.4-7  gzip-1.12-1  leveldb-1.23-3  libtool-2.4.7-1  m4-1.4.19-1  make-4.3-3  pacman-6.0.1-4  patch-2.7.6-8  pkgconf-1.8.0-1  python-3.10.4-1  rocksdb-7.0.4-1  sed-4.8-1  sudo-1.9.10-1  texinfo-6.8-2  which-2.21-5

tac0turtle commented 2 years ago

I'd prefer not to merge into one Docker image. This will cause lots of confusion, and in the near future we will be deprecating some images. Let's use the default image provided; if Arch Linux has one, let's use that.

faddat commented 2 years ago

Marko,

yeah....

So, I've scrapped arch and reduced everything to a single docker image.

One thing to note though, my environment is still a rolling release.

Arch does have a default image, but that doesn't natively support arm. Here's what the new dockerfile looks like:

# This dockerfile will be used to bootstrap the docker changes for the cosmos SDK, and will be built hourly.

FROM manjarolinux/base

ENV GOPATH=/go
ENV PATH=$PATH:/go/bin

# everything needed for cosmos development
RUN pacman -Syyu --noconfirm go base-devel git leveldb rocksdb make python snappy git-lfs jq wget curl protobuf yarn rustup npm unzip docker

# Useful tools
# concept: we want every tool frequently used when building in cosmos.
RUN go install mvdan.cc/gofumpt@latest && \
      go install github.com/cweill/gotests/gotests@latest && \
      go install github.com/fatih/gomodifytags@latest && \
      go install github.com/josharian/impl@latest && \
      go install github.com/haya14busa/goplay/cmd/goplay@latest && \
      go install github.com/go-delve/delve/cmd/dlv@latest && \
      go install honnef.co/go/tools/cmd/staticcheck@latest && \
      go install golang.org/x/tools/gopls@latest && \
      go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest && \
      go install google.golang.org/protobuf/cmd/protoc-gen-go@latest && \
      go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest && \
      go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.16.0 && \
      go install github.com/cosmos/cosmos-proto/cmd/protoc-gen-go-pulsar@latest && \
      go install github.com/bufbuild/buf/cmd/buf@latest

RUN rustup toolchain install stable

# Install slightly more complicated but still useful tools
RUN git clone https://github.com/cosmos/cosmos-proto && \
      cd cosmos-proto && \
      go install ./... && \
      cp /go/bin/* /usr/bin && \
      rm -rf /go && \
      mkdir /go && \
      chmod -R 777 /go

This natively supports arm so we can build for arm and x86 without all the crazy stuff in my former PR.

tac0turtle commented 1 year ago

@faddat are you still open to working on this?

mridhul commented 1 year ago

@tac0turtle is this still needed? I can take a look.

tac0turtle commented 1 year ago

Yes, this is still open, but I think there are two Dockerfiles that may be able to be combined; otherwise the other ones should stay separate. A single Docker image doesn't make sense here.

faddat commented 1 year ago

@tac0turtle - I am open to working on it, and open to working on it with @mridhul

(does anyone have any sane way to manage gh notifications? this kills me)

@mridhul - if you want to, feel free to ping me on twitter (twitter.com/gadikian)