status-im / infra-status-legacy

Infrastructure for old Status fleet
https://github.com/status-im/nim-waku

Create new main Status fleet #1

Closed jakubgs closed 2 years ago

jakubgs commented 2 years ago

As requested by @John-44 and @iurimatias, we need a new nim-waku fleet, separate from the infra-nim-waku fleet. It will first be used in a new Desktop release, and ultimately by Mobile and other Status applications as well.

Requirements:

jakubgs commented 2 years ago

I've deployed the first 3 hosts for the status.test fleet: https://github.com/status-im/infra-status/commit/343c6a5f https://github.com/status-im/infra-status/blob/343c6a5fc070a4d2adcc8a8546703052ab1df80f/ansible/inventory/test#L4-L6

jakubgs commented 2 years ago

I've added configuration of nim-waku nodes and deployed them: https://github.com/status-im/infra-status/commit/380ca2ef

 > a all --become-user=admin -a 'docker ps' 
node-01.do-ams3.status.test | CHANGED | rc=0 >>
CONTAINER ID   NAMES      IMAGE                                CREATED       STATUS
18706db249c5   nim-waku   statusteam/nim-waku:deploy-v2-prod   2 hours ago   Up 2 hours (healthy)
node-01.gc-us-central1-a.status.test | CHANGED | rc=0 >>
CONTAINER ID   NAMES      IMAGE                                CREATED       STATUS
265dcd5b9a80   nim-waku   statusteam/nim-waku:deploy-v2-prod   2 hours ago   Up 2 hours (healthy)
node-01.ac-cn-hongkong-c.status.test | CHANGED | rc=0 >>
CONTAINER ID   NAMES      IMAGE                                CREATED       STATUS
08a19c1476e9   nim-waku   statusteam/nim-waku:deploy-v2-prod   2 hours ago   Up 2 hours (healthy)

I found some bugs with how we connect to fleet peers in the process:

jakubgs commented 2 years ago

And I've added the fleet so it's visible on https://fleets.status.im/: https://github.com/status-im/infra-misc/commit/30f54ab6

 > c fleets.status.im | jq '.fleets."status.test"'
{
  "tcp/p2p/waku": {
    "node-01.ac-cn-hongkong-c.status.test": "/ip4/47.242.233.36/tcp/30303/p2p/16Uiu2HAm2BjXxCp1sYFJQKpLLbPbwd5juxbsYofu3TsS3auvT9Yi",
    "node-01.do-ams3.status.test": "/ip4/64.225.81.237/tcp/30303/p2p/16Uiu2HAkukebeXjTQ9QDBeNDWuGfbaSg79wkkhK4vPocLgR6QFDf",
    "node-01.gc-us-central1-a.status.test": "/ip4/34.122.252.118/tcp/30303/p2p/16Uiu2HAmGDX3iAFox93PupVYaHa88kULGqMpJ7AEHGwj3jbMtt76"
  }
}
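For anyone consuming this list from a client, here is a minimal sketch of splitting one of these multiaddrs into IP, port and peer ID. It assumes the plain `/ip4/<ip>/tcp/<port>/p2p/<peer-id>` shape shown above and nothing else; `WakuPeer` and `parse_multiaddr` are hypothetical names for illustration, not part of nim-waku or any libp2p library.

```python
from typing import NamedTuple


class WakuPeer(NamedTuple):
    ip: str
    port: int
    peer_id: str


def parse_multiaddr(addr: str) -> WakuPeer:
    """Split '/ip4/<ip>/tcp/<port>/p2p/<peer-id>' into its parts.

    Only this one shape is handled (an assumption based on the
    output above); anything else raises ValueError.
    """
    parts = addr.strip("/").split("/")
    if (
        len(parts) != 6
        or parts[0] != "ip4"
        or parts[2] != "tcp"
        or parts[4] != "p2p"
    ):
        raise ValueError(f"unsupported multiaddr: {addr!r}")
    return WakuPeer(ip=parts[1], port=int(parts[3]), peer_id=parts[5])


# Example using the do-ams3 entry from the output above.
peer = parse_multiaddr(
    "/ip4/64.225.81.237/tcp/30303/p2p/"
    "16Uiu2HAkukebeXjTQ9QDBeNDWuGfbaSg79wkkhK4vPocLgR6QFDf"
)
print(peer.ip, peer.port, peer.peer_id)
```

A real client would more likely hand the full multiaddr string to its libp2p implementation; this is only useful for quick scripting against the fleets JSON.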

@iurimatias could you check whether it's what you want before I deploy a prod fleet?

jakubgs commented 2 years ago

One thing that we'll need to really start using nim-waku in production is a canary to monitor those nodes:

@jm-clius can we please prioritize a CLI canary tool so we can check availability of our nodes?
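Until a proper canary exists, a very rough stand-in could be a plain TCP reachability check like the sketch below. It only confirms that the port accepts connections; it does not perform a libp2p handshake or exercise any Waku protocol, so it cannot replace a real canary. `tcp_reachable` is a hypothetical helper name.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This checks only the transport layer. A node can accept TCP
    connections and still be unhealthy at the libp2p/Waku level.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

It could be called as `tcp_reachable("node-01.do-ams3.status.test", 30303)`, assuming the node listens on the port shown in the fleet listing.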

jakubgs commented 2 years ago

@iurimatias can you check the test fleet please?

iurimatias commented 2 years ago

will do

richard-ramos commented 2 years ago

The following PRs were created to use status.test in Desktop:

I was able to use relay protocol with no issues, but it seems that store protocol is missing --persist-messages=true? I was not able to retrieve messages.

richard-ramos commented 2 years ago

What's the nim-waku commit used for this fleet?

jakubgs commented 2 years ago

Same as for wakuv2.prod: https://github.com/status-im/infra-status/blob/380ca2ef46e3b99a12acd8d9df04d7eb12c6c443/ansible/group_vars/status.test.yml#L2

jakubgs commented 2 years ago

We do have --persist-messages=true enabled:

 > a all -a 'grep persist /docker/nim-waku/docker-compose.yml' 
node-01.do-ams3.status.test | CHANGED | rc=0 >>
      --persist-peers=true
      --persist-messages=true
node-01.gc-us-central1-a.status.test | CHANGED | rc=0 >>
      --persist-peers=true
      --persist-messages=true
node-01.ac-cn-hongkong-c.status.test | CHANGED | rc=0 >>
      --persist-peers=true
      --persist-messages=true

jakubgs commented 2 years ago

Looks like the time has come for the status.prod fleet to be deployed. I'm thinking 3x3 hosts, or at least 2x3.

jakubgs commented 2 years ago

Deployed two hosts in each DC for status.prod fleet: https://github.com/status-im/infra-status/commit/64abaa95 https://github.com/status-im/infra-status/blob/64abaa95f81efada7c95c26817f04f11c496a1fe/ansible/inventory/prod#L3-L9
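To make the resulting layout concrete, here is a small sketch that enumerates hostnames following the `node-NN.<dc>.<fleet>` pattern seen on the test fleet. The DC names are copied from the status.test output and are only assumed to be the same for status.prod; `fleet_hostnames` is a hypothetical helper, and the linked inventory remains the source of truth.

```python
from typing import List


def fleet_hostnames(fleet: str, datacenters: List[str], hosts_per_dc: int) -> List[str]:
    """Build hostnames as node-NN.<dc>.<fleet>, numbering from 01 in each DC."""
    return [
        f"node-{index:02d}.{dc}.{fleet}"
        for dc in datacenters
        for index in range(1, hosts_per_dc + 1)
    ]


# Assumption: status.prod uses the same three DCs as status.test.
DATACENTERS = ["do-ams3", "gc-us-central1-a", "ac-cn-hongkong-c"]

for name in fleet_hostnames("status.prod", DATACENTERS, hosts_per_dc=2):
    print(name)
```

With two hosts in each of three DCs this gives six hosts in total.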

jakubgs commented 2 years ago

And configured the nodes on the fleet: https://github.com/status-im/infra-status/commit/a0886ef8

They are using the deploy-status-prod tag built from this CI job: https://ci.status.im/job/nim-waku/job/deploy-status-prod/

jakubgs commented 2 years ago

I still need to add a mechanism to control node keys, so we can restore the same full address even after a host is replaced.

jakubgs commented 2 years ago

Added support for specifying node keys, and added them to BitWarden:

I consider this done. We can scale it up in separate issues.
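On the node keys mentioned above: the reason a fixed key restores the same address is that the peer ID at the end of the multiaddr is derived from the node's key, so a replaced host with the same key and IP advertises the same full address. Below is a minimal sketch of generating such a key, assuming it is a secp256k1 private key stored as a 64-character hex string. This is not necessarily how the keys in this fleet were created, and `generate_node_key` is a hypothetical helper.

```python
import secrets

# Order of the secp256k1 group. A valid private key k satisfies 1 <= k < N.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def generate_node_key() -> str:
    """Return a random secp256k1 private key as a 64-character hex string."""
    while True:
        candidate = secrets.randbits(256)
        # Out-of-range values are astronomically rare, but check anyway.
        if 1 <= candidate < SECP256K1_N:
            return f"{candidate:064x}"
```

The generated value would then be kept in a secret store and passed to the node at deploy time, rather than letting each new host generate its own key.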