tendermint / tm-db

Common database interface for various database backends for Tendermint Core and Cosmos SDK
Apache License 2.0

Update docker.yml #240

Closed faddat closed 2 years ago

faddat commented 2 years ago

only login if it's been accepted

faddat commented 2 years ago

This is aiming at something more:

we should never use alpine.

When we use alpine, we take a lot of risk

creachadair commented 2 years ago

> we should never use alpine.
>
> When we use alpine, we take a lot of risk

Sorry, I don't understand the reasoning behind this. Can you say more?

One of the main points of Alpine is to reduce the surface area of the base image. That generally both improves performance and reduces exposure.

I usually prefer Alpine base images unless for some reason the build won't work. (Here, as far as I can tell, there's no reason it shouldn't.)

faddat commented 2 years ago

I am of mixed minds here, and I'll jump into this in some detail now:

To the best of my knowledge, only a minority of operators (5%?) run containerized. At Notional we run containers, but only when we are running with cosmosia: https://github.com/notional-labs/cosmosia

So when we make musl-influenced build assumptions because our build systems use musl libc, we introduce another variable that wouldn't otherwise exist.

You're not precisely wrong here, but I think the ops story differs from the vision you've got.

Our validators all run on the plainest, most boring metal we can find-- typically (like the container in question) arch linux, and nothing else.

creachadair commented 2 years ago

> You're not precisely wrong here, but I think the ops story differs from the vision you've got.
>
> Our validators all run on the plainest, most boring metal we can find-- typically (like the container in question) arch linux, and nothing else.

I take your point. I wouldn't expect a generic image from Docker Hub to meet the needs of sophisticated networks with a complex deployment story, though. That's true regardless of what we choose for a base image, and I think that's probably fine.

For a network with very specific security, architecture, or compatibility requirements, I'd expect the operators of that network will want to define their own images. (In practice, Docker Hub is not the origin of choice for production networks, simply on availability grounds.)

I think the main goal for this image definition is to provide users with a serviceable baseline to get started, try things out, maybe spin up a little testnet. It's as much a tutorial as anything else, and I think it's fine if it does not cover every conceivable use case.

By the time we get to the point of worrying about the size and ABI compatibility of static binaries, I think we're squarely outside the user base for the image defined here.

On that basis we should be using ubuntu:latest. 🙂

faddat commented 2 years ago

You'd be right!

And I tried. So shall we go all the way?

You see, ubuntu:latest does not keep dependencies up to date: Go is out of date, RocksDB is out of date... everything is. So in the end, if you're aiming for performance, it is pretty hard to work with Ubuntu.

It does tick the multiarch boxes, but the code updates just aren't there, and (opinion) that's how we end up with cosmos/gorocksdb in our lives. We are making it tough to update the docker image, and typically we don't build it.

So, stuff changes upstream and we can't keep up. Arch is a rolling release distro, so if we were to use it, the build would break from time to time (though in my experience this hasn't actually happened yet), and at that point we'd get a chance to bump dependencies and the like.

I figure that ol' tm-db has about another year in it.

So I'm figuring to get various updates to it landed, and then work up the stack towards the places where greater performance boosts are possible.

A little context, for me personally:

validating osmosis -> relaying -> validating many chains -> relaying more -> helping to provide rpc services to a number of chains

And while the performance definitely affects relaying, it is in fact the rpcs that really suffer most.

So when I get state sync working with xyz db, or eke out a 5% reduction in sync time, I'm picturing that across 100+ engineers.

creachadair commented 2 years ago

> And I tried. So shall we go all the way?

Sorry, all the way to which?

My general view is that Alpine is a good happy medium between the antediluvian default dependency management of Ubuntu and more fringe distros that are likely to complicate life for folks trying to set up and get running.

If we had to pick which direction to go, I'd rather move toward Arch than Ubuntu, but that's mostly hypothetical: I think our goal here should be light touch, minimum disruption of existing use.

faddat commented 2 years ago

In that case maybe we leave it alone?

The reason this repo is hard to work on is that the development environment needs rocksdb and cleveldb.

When I use Arch, that defaults to the latest and greatest, and that causes problems with cosmos/gorocksdb.

github-actions[bot] commented 2 years ago

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.