status-im / infra-shards

Infrastructure for Status fleets
https://github.com/status-im/nim-waku

Static sharding fleets for Status Communities #2

Closed: jm-clius closed this issue 8 months ago

jm-clius commented 10 months ago

Requirements

Configuration common to both fleets:

Pubsub Topics

--pubsub-topic=/waku/2/rs/16/128
--pubsub-topic=/waku/2/rs/16/256

Protected Topics

--protected-topic=/waku/2/rs/16/128:045ced3b90fabf7673c5165f9cc3a038fd2cfeb96946538089c310b5eaa3a611094b27d8216d9ec8110bd0e0e9fa7a7b5a66e86a27954c9d88ebd41d0ab6cfbb91
--protected-topic=/waku/2/rs/16/256:049022b33f7583f34463f5b7622e5da29f99f993e6858a478465c68ee114ccf142204eff285ed922349c4b71b178a2e1a2154b99bcc2d8e91b3994626ffa9f1a6c
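
For illustration only, a minimal sketch of how the common flags above might be combined in a single nwaku invocation. The binary path and the extra `--relay` flag are assumptions, not the actual fleet deployment command:

```bash
# Sketch: combining the common configuration above on one wakunode2 node.
# Binary path and additional flags are assumptions, not the fleet's real command.
./build/wakunode2 \
  --relay=true \
  --pubsub-topic=/waku/2/rs/16/128 \
  --pubsub-topic=/waku/2/rs/16/256 \
  --protected-topic=/waku/2/rs/16/128:045ced3b90fabf7673c5165f9cc3a038fd2cfeb96946538089c310b5eaa3a611094b27d8216d9ec8110bd0e0e9fa7a7b5a66e86a27954c9d88ebd41d0ab6cfbb91 \
  --protected-topic=/waku/2/rs/16/256:049022b33f7583f34463f5b7622e5da29f99f993e6858a478465c68ee114ccf142204eff285ed922349c4b71b178a2e1a2154b99bcc2d8e91b3994626ffa9f1a6c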

Configuration for status.sharding.bootstrap

Configuration for status.sharding.store

jm-clius commented 10 months ago

@richard-ramos @chaitanyaprem can you check that the guidelines here are clear enough, given our discussion on https://github.com/waku-org/nwaku/issues/1914? With the idea that go-waku drives Status dogfooding, I think it would be best for go-waku to also ensure that this fleet gets set up, so feel free to let me know if anything is unclear or if you have other suggestions!

@Ivansete-status do we have a guide/tutorial/example somewhere on how the PostgreSQL configuration is done on nwaku?

Ivansete-status commented 10 months ago

Hey @jm-clius! We don't have an explicit doc. The following repo has an example of how an nwaku node is connected to a Postgres db, all in a docker-compose file --> https://github.com/waku-org/nwaku-compose/blob/master/docker-compose.yml
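
A hedged sketch of the relevant wiring from that compose setup: the key part is pointing nwaku's store at Postgres via its database URL flag. The credentials, host, and database name below are placeholders, not the values from the linked file:

```bash
# Sketch only (placeholder credentials/host): how an nwaku store node is pointed
# at a Postgres database, as done in the linked docker-compose file.
./build/wakunode2 \
  --store=true \
  --store-message-db-url="postgres://wakuuser:wakupassword@postgres:5432/wakumessages"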

richard-ramos commented 10 months ago

Thank you for these guidelines. I looked at them and they do look clear to me! Just to confirm: status.sharding.store will have DiscV5 enabled to find peers? Or will it only rely on the initial connection to the nodes returned by the status.sharding.bootstrap DNS discovery URL?

chaitanyaprem commented 10 months ago

Thanks for the guidelines @jm-clius. So far, these are clear to me!

jm-clius commented 10 months ago

Just to confirm: status.sharding.store will have DiscV5 enabled to find peers?

Yes, I think they should have discv5 enabled.
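
As a sketch, the store nodes could then run with something like the following discovery flags; the enrtree URL below is a placeholder, not the bootstrap fleet's actual record:

```bash
# Sketch: discovery flags for a status.sharding.store node. The enrtree URL is
# a placeholder; use the bootstrap fleet's published DNS discovery record.
./build/wakunode2 \
  --discv5-discovery=true \
  --dns-discovery=true \
  --dns-discovery-url="enrtree://PUBLICKEY@bootstrap.fleet.example.org"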

jakubgs commented 10 months ago

Some questions:

richard-ramos commented 10 months ago

Do you actually want two fleets or are you talking about two types of nodes within one fleet?

Based on the description, I understand it's two separate fleets, each with its own DNS discovery URL and settings, but interconnected by having status.sharding.store use the status.sharding.bootstrap DNS discovery URL + discv5.

richard-ramos commented 10 months ago

Why is this fleet called sharding and not shards as is the wakuv2 one?

I guess the name can be changed to match wakuv2 :thinking: no preference here

jm-clius commented 10 months ago

Answering:

Why is this fleet called sharding and not shards as is the wakuv2 one?

Ah, oversight/assumption. It should be shards. Actually the naming itself is a suggestion from my side, but I think status.shards.store and status.shards.bootstrap make sense.

How is this different from the wakuv2.shards fleet?

wakuv2.shards is indeed the blueprint that the Waku team use to harden most of the concepts we need for these status.shards.* fleets. The main differences are:

Is this intended for public use, and if not and it's intended for development

It's intended for public use in future. For now it is intended for Status Community dogfooding and must belong to Status.

Do you actually want two fleets or are you talking about two types of nodes within one fleet?

Two separate fleets with separate DNS node lists. But these fleets are closely associated in that the status.shards.store fleet must connect to the status.shards.bootstrap fleet. They will also both serve the same communities, so from the communities' perspective they form a unit.

What does "shared PostgreSQL backend" mean?

My suggestion for a starting point: "one database host shared across multiple data centers with different latency"

Previously (with SQLite) each store node would store its own duplicate copy of the entire history locally. With PostgreSQL we have the opportunity to store fewer copies of history in a shared database, with multiple producers/writers. Deduplication happens on write to the database. This should improve data reliability. We can of course play with redundancy parameters here and have e.g. a PostgreSQL db per data center. I think for dogfooding a single PostgreSQL instance shared between all store nodes is a reasonable starting point. One of the aims of wakuv2.shards is to benchmark this approach more thoroughly. cc @Ivansete-status
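
To illustrate the "deduplication on write" idea, a minimal sketch in PostgreSQL. This is purely an illustration: the table and column names below are hypothetical and not nwaku's actual message store schema.

```bash
# Illustration only: deduplication on write in PostgreSQL. Table/column names
# are hypothetical, not nwaku's actual schema.
psql "$DB_URL" <<'SQL'
CREATE TABLE IF NOT EXISTS messages_example (
  id        TEXT PRIMARY KEY,          -- message hash acts as the dedup key
  payload   BYTEA,
  stored_at TIMESTAMPTZ DEFAULT now()
);
-- Several store nodes may write the same message; later duplicates are dropped.
INSERT INTO messages_example (id, payload)
VALUES ('example-message-hash-1', '\xdeadbeef')
ON CONFLICT (id) DO NOTHING;
SQL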

jakubgs commented 10 months ago

Thanks for answering. One thing I want to make clear from the start is that there can't be something like status.shards.store. Our fleet names have two elements, env and stage, like so: ${env}.${stage}. For example wakuv2.prod. There is nothing in any of our automations or naming conventions that allows for a 3-segment fleet name.

If you need both words in the name then it would have to be status.shards-store and status.shards-boot.

jakubgs commented 10 months ago

My main worry with a shared DB across different data centers is that, due to higher network latency, the DB will spend more time than necessary locked during writes because of delays on various control messages.

jm-clius commented 10 months ago

If You need both words in the name then it would have to be status.shards-store and status.shards-boot.

I'm happy with this naming convention.

My main worry with a shared DB across different data centers

Right! I think this is a good point. I have no real intuition here, and this setup would need to be properly dogfooded in any case (we have no PostgreSQL DBs in production yet). Perhaps a better solution is to use only two data centers for status.shards-store for now (3 store nodes in one, 2 in the other), with each having its own shared PostgreSQL database (2 in total).

TL;DR - let's opt for one postgresql instance per data center.

jm-clius commented 10 months ago

Sidenote: 5 was a number pulled out of a hat. We could of course also do 6 store nodes, which divides more easily between 2 or 3 data centers.

jakubgs commented 10 months ago

Okay, last question: what is the deadline for this? I would like to assign this task to @yakimant to get him acquainted with infrastructure work, especially in relation to Waku fleets. But that will naturally mean work will progress slower.

jm-clius commented 10 months ago

I think this is up to @richard-ramos and @chaitanyaprem in terms of driving Status Community dogfooding here. It's probably possible to do some limited initial dogfooding here using wakuv2.shards instead, which would give us some more time.

Ivansete-status commented 10 months ago

let's opt for one postgresql instance per data center.

I like this approach a lot. We will implement Store synchronization among data centers in the future.

If possible, we should have high availability and replication, but I think this can be added in later stages. For now it is fine to have one Postgres per data center.


Re the shard names, neither status.shards-store nor status.shards-boot seems to follow the pattern ${env}.${stage}. What about status-shards-store.prod or status-shards-boot.prod? (cc @jakubgs @jm-clius)

jakubgs commented 10 months ago

What about status-shards-store.prod or status-shards-boot.prod?

But those are not prod, so why would we call them prod?

Ivansete-status commented 10 months ago

What about status-shards-store.prod or status-shards-boot.prod?

But those are not prod, so why would we call them prod?

I set it as prod because <<It's intended for public use in future>>. In any case, I think we need to add a valid stage label (test or staging or prod) so that we remember in the future the kind of traffic and data the fleet is supporting.

jakubgs commented 10 months ago

I think we shouldn't pretend a non-prod fleet is prod because it might be prod in the future. If we stabilize this setup then we'll probably create separate new fleet(s) that apply this setup.

And honestly, then the proper naming would be more like status-store.prod and status-boot.prod. Though I still don't fully get why you guys think those should be separate fleets. We already have heterogeneous fleets that contain different types of hosts in them. You can see that in infra-eth-cluster, which deploys eth.staging or eth.prod, each containing 3 types of hosts: boot, node, mail.

richard-ramos commented 10 months ago

It's probably possible to do some limited initial dogfooding here using wakuv2.shards instead, which would give us some more time.

I see no problems with using that fleet if it can provide topic protection for /waku/2/rs/16/128 and /waku/2/rs/16/256, store, and discv5! Early-stage dogfooding will be mostly focused on testing the behavior of status-go when sharding is applied to communities.

Ivansete-status commented 10 months ago

I agree that we shouldn't name them prod then. I vote for: status-store.test and status-boot.prod

Having that kind of segregation is good, I think, because:

- Clear picture of what a fleet contains
- Different fleets might have different update requirements
- Different fleets might have different scaling requirements

jakubgs commented 10 months ago

Clear picture of what a fleet contains

That already exists in infra-eth-cluster, see different files in group_vars.

Different fleets might have different update requirements

I don't see how that's relevant. If they work in tandem as production, they are essentially part of the main Status fleet, just different components of it. Think of it this way: if an application A requires a database D and a cache C, you don't deploy A, D, and C as separate fleets; you deploy them all as one fleet that contains different groups of servers: A, D, and C.

Different fleets might have different scaling requirements

Also irrelevant. You can scale them separately if you want to. See infra-eth-cluster.

jakubgs commented 10 months ago

Correct me if I'm wrong, but it seems to me like you are considering bootstrap and store as separate "fleets" purely because they have different utility, but I think that's wrong. I think those are tied together and are part of the same fleet.

It also doesn't make sense infrastructure-config-wise: a fleet repo like infra-status configures a single type of fleet, and that fleet can have multiple stages, but those stages are mostly the same in layout and configuration, just with a different type of usage. Considering the VERY different fleet layout, I'd think using a separate fleet repo like infra-status-shards would make more sense, since trying to square the circle of two very different types of fleets being deployed by the same repo would only make both the Terraform and Ansible configuration needlessly complex.

What do you think?

jm-clius commented 10 months ago

I think those are tied together and are part of the same fleet.

Agreed. I think from my perspective it's about whatever makes the most sense from an infra perspective in terms of management. Indeed, these two "subfleets" are connected into a single service fleet that acts as a unit to provide services to Status Communities, will most probably always run the same version, be interconnected, etc.

The differences are:

If these differences can be managed from within a single fleet with different node "types", I think it will indeed be much cleaner!

I'd think using a separate fleet repo

Yeah, I'm viewing this as building on the wakuv2.shards example, but with more complexity. Since we are now introducing different node types, I would agree that it's diverged enough to warrant a separate repo.

yakimant commented 10 months ago

We had a call with @jakubgs, @jm-clius and @Ivansete-status, here are the notes:

Stakeholders

Hosts

Timeline

DB

Code organisation

Metrics

Later

Ivansete-status commented 10 months ago

Thanks for the notes @yakimant ! A couple of comments:

yakimant commented 10 months ago

Yes, the fleet name will include the stage name too, don't worry!

yakimant commented 9 months ago

@richard-ramos, @jm-clius, quick question: why do we need an enrtree DNS record for storage nodes if it is not referenced? Both bootstrap and storage nodes reference the bootstrap node list, as I understand it.

jm-clius commented 9 months ago

why do we need an enrtree DNS record for storage nodes if it is not referenced? Both bootstrap and storage nodes reference the bootstrap node list, as I understand it.

Correct. I think it is so that Community Nodes can reference the store nodes separately from the bootstrap nodes.

jakubgs commented 9 months ago

Hanno adds:

My understanding from Richard is: they bootstrap against a list of nodes and they create a set of store nodes from a different list (in fact, they hard-code this store list currently, but should do a separate lookup to populate it dynamically). If the store nodes are mixed into the bootstrap list, they will automatically form part of the bootstrap process, which we'd like to avoid in order to create a better separation of interests in the fleet services.
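
For what it's worth, one quick way to sanity-check that the two lists really are separate is to look at the enrtree TXT records directly. The domains below are placeholders, not the fleet's actual enrtree domains (those live in the fleet README):

```bash
# Placeholders only: substitute the enrtree domains published in the fleet README.
BOOT_TREE="boot.example.shards.test.statusim.net"
STORE_TREE="store.example.shards.test.statusim.net"

# EIP-1459 DNS discovery trees are plain TXT records, so dig can list them.
dig TXT "$BOOT_TREE"  +short
dig TXT "$STORE_TREE" +short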

yakimant commented 9 months ago

@jm-clius, @Ivansete-status, please have a look and see if you can start using the fleet for testing already.

What is missing:

richard-ramos commented 9 months ago

@yakimant what are the DNS discovery addresses for this fleet?

richard-ramos commented 9 months ago

nvm. Saw the README.md :)

Ivansete-status commented 9 months ago

I validated that the following hosts have both relay and store properly configured. Amazing work, @yakimant!

/dns4/store-01.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT
/dns4/store-02.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R
/dns4/store-01.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ
/dns4/store-02.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM
/dns4/store-01.ac-cn-hongkong-c.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT
/dns4/store-02.ac-cn-hongkong-c.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P

To test it, I followed these steps:

  1. Start an nwaku node locally, setting staticnode and storenode to the first node from the previous list. With that, I established the remote relay and store peer, respectively (see the sketch after these steps).

    i. Then, I made my local node publish a few messages by sending JSON-RPC requests to it.

    ii. Then, I made a store request to my local node through JSON-RPC, and in turn my local node made a Store request to the remote node. The JSON-RPC command then returned the messages stored by the *F1EfT peer.

  2. I repeated step 1.ii with the rest of the nodes. In all cases, the stored messages were properly received, indicating that all the nodes are relay-connected to each other and that all of them properly handle Store requests.
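
A rough reproduction sketch of step 1 and step 1.ii. The CLI flags and the JSON-RPC method/parameter shapes are assumptions based on the nwaku version in use at the time; treat them as illustrative rather than exact:

```bash
# Sketch of step 1 and 1.ii. Flag names and JSON-RPC method/params are assumptions
# for the nwaku version of that period; adjust to your node's API.
STORE_PEER="/dns4/store-01.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT"

# 1. Local node with the remote peer as both relay (staticnode) and store peer.
#    (Run this in one terminal; it stays in the foreground.)
./build/wakunode2 \
  --pubsub-topic=/waku/2/rs/16/128 \
  --staticnode="$STORE_PEER" \
  --storenode="$STORE_PEER" \
  --rpc=true --rpc-port=8545

# 1.ii. In a second terminal: store query via the local node's JSON-RPC endpoint.
curl -s -X POST http://127.0.0.1:8545 \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,
       "method":"get_waku_v2_store_v1_messages",
       "params":["/waku/2/rs/16/128", [], null, null, {"pageSize": 10, "forward": true}]}'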

Notes:

cc: @jm-clius

yakimant commented 9 months ago

I can see messages in db:

nim-waku=> select id from messages;
                                id
------------------------------------------------------------------
 2be984aa463c976553b7ab089c6708fa97509449f7359693ae98665fecb1a652
 2be984aa463c976553b7ab089c6708fa97509449f7359693ae98665fecb1a652
 2be984aa463c976553b7ab089c6708fa97509449f7359693ae98665fecb1a652
 f0f81dd5f875ebac8402617ac0c3cd0e7b0c72de4e23cacd23fc7362aacd8b77
(4 rows)
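A quick follow-up check on the same database, comparing total rows against distinct message ids (connection details assumed; the table and column names come from the query above):

```bash
# Compare total rows vs distinct ids in the messages table shown above.
psql nim-waku -c "SELECT count(*) AS total, count(DISTINCT id) AS distinct_ids FROM messages;"
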

jm-clius commented 9 months ago

Great, I think further dogfooding on this fleet can continue. @richard-ramos

yakimant commented 9 months ago

@jm-clius shall we close the issue or wait for testing results?

jm-clius commented 8 months ago

@yakimant I think we can close this issue and raise any further problems/issues separately. The bulk of the work here has been done. Thanks for your excellent work here!