cosmos / chain-registry

Infrastructure Automation #214

Open faddat opened 2 years ago

faddat commented 2 years ago

I was putting together a table with all the edge cases and then noticed that there were 50 chains. So I'm going to summarize:

User Stories

- As a scaled relayer, I want others to be able to begin relaying with ease.
- As a scaled relayer, I want to be able to run more nodes in a stable manner and to scale my system by adding either bare metal or virtual machines to my network.
- As a validator, I want to be able to retrieve chain state rapidly.

Blockers

A Heartwarming open source story

Chain developers benefit from systems that allow communities to provide needed infrastructure with ease. A good example of this is the block explorer for Dig. Chill Validation used ping.pub's excellent open source explorer to stand up an explorer for Dig on day one. We did not have to build that infrastructure, and it was a huge load off our shoulders.

Later, I made a personal donation to Ping, and our team delegates to both Ping and Chill, who are advancing the same codebase in different ways, kind of like the Tenderseed1 -> Tenderseed2 -> Tinyseed lineage (it really forked a lot, and I can't track all the names the Terra teams have given it).

Open source infrastructure automation for Cosmos

To drill in on infrastructure for a moment: we need it very badly, and we have some conditions for it:

Possible solutions for chain images

My opinion on the images is that we should ship multiarch wherever possible, publish a distroless container because some people love Kube, and build the binaries on Arch Linux while also shipping an Arch-based Docker image that includes the binary. That OS cruft can be very helpful sometimes, and it doesn't weigh much or pose much of a security risk.
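A minimal sketch of the multiarch build via the Compose build spec, assuming docker compose v2 with BuildKit and a hypothetical multi-stage Dockerfile (Arch Linux builder stage, distroless or Arch runtime stage); the image name and tag are placeholders:

```yaml
# docker-compose.build.yml (hypothetical): build a multiarch chain image.
# Requires docker compose v2 with BuildKit for the `platforms` list.
services:
  gaiad:
    build:
      context: .
      dockerfile: Dockerfile              # multi-stage: Arch Linux builder -> distroless or Arch runtime
      platforms:
        - linux/amd64
        - linux/arm64
    image: ghcr.io/example/gaiad:latest   # placeholder registry/repository
```

Built with `docker compose -f docker-compose.build.yml build`, or equivalently with `docker buildx build --platform linux/amd64,linux/arm64` directly.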

Solutions for getting chain state

My opinion is that the Cosmos community needs a publicly accessible Docker registry that serves images which include chain state. We can put it somewhere with an unmetered 10 Gbps line and equip it with plenty of storage. Users would download images weighing between 1 and 15 GB, and they can use volume mounts to ensure persistence if they'd like to.
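As a sketch of the consumer side, assuming a hypothetical shared registry at registry.cosmos.example and a state-bearing Cosmos Hub image (names, tags, and the home directory path are placeholders):

```yaml
# Hypothetical compose file: run a node from a state-bearing image.
services:
  cosmoshub:
    image: registry.cosmos.example/cosmoshub/node-with-state:latest  # placeholder
    volumes:
      - cosmoshub-data:/root/.gaia   # optional named volume so state outlives the container
    ports:
      - "26656:26656"                # p2p
      - "26657:26657"                # rpc
volumes:
  cosmoshub-data:
```

The volume mount is what provides the "persistence if they'd like" part; leaving it out means each fresh container starts from whatever height was baked into the image.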

Infrastructure Automation tooling

Both Swarm and Kube have relatively easy-to-use bare metal and scalable cloud versions. They both support x86 and ARM, though neither will satisfy my desire to learn more of the ways of HashiCorp. It's my opinion that Kube treats every problem like it is a Google-scale problem, and that Swarm is more accessible to new users while still scaling admirably.

User flow

The user flow should be the same anywhere: you provision one or more servers and write a compose file or Helm chart (or whatever is in vogue with Kube these days), and in that compose file you tell your system how many replicas you want. The state will already be in the images, and you'll be able to re-sync frequently, so you can save on the need for large, high-IOPS disks.
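On Swarm that could look like the stack file below (a sketch, reusing the placeholder image from above); `deploy.replicas` is what `docker stack deploy` uses to decide how many copies to run:

```yaml
# docker-compose.yml (hypothetical), deployed with:
#   docker stack deploy -c docker-compose.yml cosmoshub
services:
  node:
    image: registry.cosmos.example/cosmoshub/node-with-state:latest  # placeholder
    deploy:
      replicas: 3                    # scale by editing this and re-deploying
      restart_policy:
        condition: on-failure
    volumes:
      - node-data:/root/.gaia
volumes:
  node-data:
```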

Endpoints

It should be easy for users to configure both HTTPS and WSS for their chain endpoints, and as we progress toward removing the traditional REST endpoints, we should make sure the gRPC gateway is well documented.
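For reference, a sketch of the port layout using the standard Cosmos SDK / Tendermint defaults; TLS and WSS termination is assumed to happen in a reverse proxy (nginx, Traefik, etc.) sitting in front of these ports:

```yaml
# Hypothetical endpoint node; the proxy in front of it is not shown.
services:
  node:
    image: registry.cosmos.example/cosmoshub/node-with-state:latest  # placeholder
    ports:
      - "26657:26657"   # Tendermint RPC, also serves the websocket (wss once proxied)
      - "1317:1317"     # REST / gRPC-gateway (app.toml: [api] enable = true)
      - "9090:9090"     # gRPC (app.toml: [grpc] enable = true)
```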

How to do it

I am fiddling with

https://github.com/ovrclk/cosmos-omnibus

and currently run some Cosmos infrastructure on Akash. I'd like to run much more of it there, but I often want a greater degree of control than it offers. Today I am going to see whether I can make some docker-compose.yml files that use omnibus to either state sync or ship truncated chain state inside the images.
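As a sketch of what one of those files might look like, using the cosmos-omnibus image with state sync; the image tag, environment variable names, and RPC servers below are assumptions to be checked against the omnibus README, not verified values:

```yaml
# Hypothetical docker-compose.yml for a state-synced Cosmos Hub node via omnibus.
services:
  cosmoshub:
    image: ghcr.io/ovrclk/cosmos-omnibus:v0.3.0-cosmoshub-v7   # placeholder tag
    environment:
      - MONIKER=relayer-backend-1                              # assumed variable name
      - CHAIN_JSON=https://raw.githubusercontent.com/cosmos/chain-registry/master/cosmoshub/chain.json
      - STATESYNC_RPC_SERVERS=rpc-1.example:26657,rpc-2.example:26657  # placeholders
    volumes:
      - cosmoshub-data:/root/.gaia
    ports:
      - "26656:26656"
      - "26657:26657"
volumes:
  cosmoshub-data:
```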

The eventual solution must be triggered when pull requests hit this repository, so that images stay up to date. Yet another reason to avoid closed-source "blockchains".
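A sketch of that trigger as a GitHub Actions workflow, assuming the repository's chainname/chain.json layout; the build script is a placeholder that does not exist in this repo:

```yaml
# .github/workflows/build-images.yml (hypothetical): rebuild chain images
# whenever chain metadata in this repository changes.
name: build-chain-images
on:
  push:
    branches: [master]
    paths: ["*/chain.json"]
  pull_request:
    paths: ["*/chain.json"]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - name: Build multiarch image for the changed chain
        run: ./scripts/build-image.sh   # placeholder script
```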

baabeetaa commented 1 year ago

https://github.com/notional-labs/cosmosia