threefoldtech / grid_deployment

Deploy a full Grid backend with docker-compose and snapshots
Apache License 2.0

Validator Code Deployment #52

Closed by Mik-TF 1 month ago

Mik-TF commented 6 months ago

Todo

Related issue

coesensbert commented 6 months ago

so here we want to be able to deploy a tfchain validator with docker compose and a script (if needed)?

Mik-TF commented 6 months ago

I think you got it right. If I get more info I'll let you know.

Mik-TF commented 6 months ago

So in short:

coesensbert commented 5 months ago

https://github.com/threefoldtech/grid_deployment/commit/c002bc4f9beb06d699b7e5710100fbb47515e404

needs to be checked by dev for validator flags and tested by ops. Will probably also need some more logic in the script

coesensbert commented 5 months ago

as a test, I was able to fully sync a new tfchain node from 0 with these flags. https://github.com/threefoldtech/grid_deployment/blob/development/tfchain-validator/mainnet/docker-compose.yml

verifying with dev if we use the correct flags and if our old procedures are fine: https://github.com/threefoldtech/tfchain/issues/981

coesensbert commented 5 months ago

made something that can make inserting keys a bit simpler while setting up a validator, need to confirm with dev if this could work. https://github.com/threefoldtech/grid_deployment/commit/146364d1c8d6462305d925a12ab94677decd8b05

coesensbert commented 5 months ago

added required files to test adding/removing a validator on devnet https://github.com/threefoldtech/grid_deployment/commit/1ddd96479a07c0fd75d9cf88765ff567d6af9a6f

Mik-TF commented 5 months ago

@xmonader Could you have the dev team check this and this, to see if it works as Bert explained above?

Thanks!

xmonader commented 4 months ago

@sameh-farouk please follow up to make sure the validator is packaged/configured and deployed correctly

sameh-farouk commented 4 months ago

@xmonader @Mik-TF @coesensbert Could you please clarify what you mean by "TFChain validator"? Are you referring to a TFChain stack (including RPC node, GraphQL API, Dashboard, etc.), a TFChain public RPC node (used to access the underlying network), or a TFChain validator node (a block author)? It is not reasonable or even feasible to include the latter.

Mik-TF commented 4 months ago

I think the answer to your question is:

The idea is that a validator will run the full TFGrid stack (tfhub, tfbootstrap, dashboard, graphql api, etc.).

So far we have the procedures and documentation to run the basic grid stack (see manual here), and now Bert is working on adding TFHub and TFBootstrap.

I think for now Bert wants the dev team to have a look at the tf-chain-validator so far, for both main and dev nets:

I hope it helps!

sameh-farouk commented 4 months ago

This is still unclear as the TFChain stack, including the RPC node, was mentioned, as well as links for running the chain validator node.

It is important to note that automating the addition of a new validator node in PoA (Proof of Authority) networks is not common, as governance is required for this task. It cannot be fully automated because a council flow is involved. Unlike PoS (Proof of Stake) networks, PoA networks lack bonding requirements that can protect them from malicious actors. Therefore, allowing anyone to start a validator node in a PoA network is not considered reasonable.

Let's clarify the difference between an RPC node and a validator node to clear any confusion between these two types of nodes.

RPC Node: An RPC (Remote Procedure Call) node in a Substrate-based blockchain network primarily serves as an interface for clients to interact with the blockchain. It provides endpoints for querying blockchain data and submitting transactions. RPC nodes participate in the network by maintaining a copy of the blockchain, and relaying information to other nodes, but do not partake in block production or consensus. These nodes are typically used for decentralized applications, wallets, and services needing to interact with the blockchain.

RPC nodes can be added or removed with minimal impact on the network. Their primary role is to facilitate interaction with the blockchain rather than maintaining its security or producing blocks.
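To make the "interface for clients" role concrete, here is a minimal sketch of the JSON-RPC 2.0 request bodies a client could send to a Substrate RPC node. `system_health` and `chain_getHeader` are standard Substrate RPC methods; the helper name `rpc_request` is made up for illustration, and nothing is actually sent, since the endpoint to use depends on the deployment.

```python
import json
from itertools import count

# Hypothetical helper: request ids only need to be unique per connection.
_ids = count(1)

def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request body as a string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params or [],
    })

# Read-only queries: node health and the header of the best block.
health_body = rpc_request("system_health")
header_body = rpc_request("chain_getHeader")

print(health_body)
print(header_body)
```

Both calls only read data, which is why adding or removing a node that serves them has little effect on the chain itself.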

Validator Node: A Validator node is responsible for participating in the consensus process of the blockchain. In a Substrate network using Aura (Authority Round) consensus (like our chain), validator nodes produce and propose new blocks in a round-robin fashion to the network. Validators are crucial for maintaining the security and integrity of the network. They validate transactions, produce blocks, and participate in consensus. Validator nodes have a low tolerance for downtime due to their role in block production and consensus. Adding unreliable validator nodes can significantly affect block production, consensus stability, and network security.
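As a rough illustration of the round-robin behaviour, the sketch below assigns each slot to the authority at index `slot % len(authorities)`, which is how Aura picks a slot author. The validator names and the `missed_slots` helper are made up for the example; a real node derives the slot number from the timestamp and the slot duration.

```python
from typing import List, Set

def aura_slot_author(slot: int, authorities: List[str]) -> str:
    """Return the authority expected to author the block for a given slot."""
    if not authorities:
        raise ValueError("authority set must not be empty")
    return authorities[slot % len(authorities)]

def missed_slots(first_slot: int, n_slots: int,
                 authorities: List[str], offline: Set[str]) -> List[int]:
    """List the slots in which no block is produced because the author is offline."""
    return [
        slot
        for slot in range(first_slot, first_slot + n_slots)
        if aura_slot_author(slot, authorities) in offline
    ]

# Hypothetical authority set of three validators.
authorities = ["validator-a", "validator-b", "validator-c"]

schedule = [aura_slot_author(slot, authorities) for slot in range(6)]
missed = missed_slots(0, 6, authorities, offline={"validator-b"})

print(schedule)
print(missed)
```

With three authorities, one unreliable validator leaves every third slot empty, which is why downtime on a validator matters much more than downtime on an RPC node.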

TFChain stack: In terms of the TFChain stack, including RPC nodes makes sense, as they can be added and removed with minimal impact on core network operations; they mainly affect the availability and performance of client services connected to the same stack. However, this is not the case for validator nodes, which are critical for the security and block production of the whole network. Unmoderated changes in the validator set can disrupt the consensus process, lead to missed blocks and degraded performance, and open the network to security issues.

sameh-farouk commented 4 months ago

Another important point to remember is that you cannot use existing author/validator keys on the chain to start a new validator instance. While this would skip the need for council approvals (because the keys are already on-chain), it will result in conflicting blocks with existing validators that use the same keys.
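A small sketch of why reused keys are a problem: if two instances hold the same author key, both will produce a block when that key's slot comes up, so the network sees two different blocks for the same slot and author. The block tuples and the `find_conflicts` helper below are made up for illustration and are not taken from TFChain code.

```python
from collections import defaultdict

def find_conflicts(blocks):
    """Group observed blocks by (slot, author) and keep groups with more than one hash."""
    seen = defaultdict(set)
    for slot, author, block_hash in blocks:
        seen[(slot, author)].add(block_hash)
    return {key: sorted(hashes) for key, hashes in seen.items() if len(hashes) > 1}

# Hypothetical observations: two instances share "key-1" and both author slot 10.
observed = [
    (10, "key-1", "0xaaa"),  # block from the original validator
    (10, "key-1", "0xbbb"),  # block from the new instance reusing the same key
    (11, "key-2", "0xccc"),  # a validator with its own key, no conflict
]

conflicts = find_conflicts(observed)
print(conflicts)
```

A new validator therefore needs its own freshly generated keys, which in turn have to go through the council approval described above.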

coesensbert commented 3 months ago

thx @sameh-farouk, I agree we need a better-defined naming convention for all our components.

Did someone have a look at what was proposed to let someone deploy a validator with compose? No feedback was received. I think a dev will understand just by looking at it. With this we need to make and test an easy procedure for someone to deploy a tfchain validator and get approved by the council.

xmonader commented 3 months ago

Sameh is off this week; he can take a look at it next week, or @renauter can if he has some availability.

Mik-TF commented 3 months ago

Thanks for the info @xmonader.

@sameh-farouk do you have any questions to go forward? Thanks!

Mik-TF commented 2 months ago

@coesensbert can you give us a status update on this? @sameh-farouk do you have some time to check this? Or perhaps @renauter?

Thanks!

coesensbert commented 2 months ago

update: created and tested all validator compose stacks. Needs a quick re-test with snapshot data once ready: https://github.com/threefoldtech/grid_deployment/tree/development/tfchain-validator/

snapshot creation ongoing:

procedure and docs ongoing based on this pr: https://github.com/threefoldtech/tfchain/pull/1007

coesensbert commented 1 month ago

wip

Mik-TF commented 1 month ago

Great! Let us know if things are blocking or if you need help troubleshooting/testing stuff.

Mik-TF commented 1 month ago

Update from Bert

coesensbert commented 1 month ago

docs finished: https://github.com/threefoldtech/grid_deployment/tree/development/tfchain-validator

Mik-TF commented 1 month ago

Looks great! Let us know how the testing goes.

coesensbert commented 1 month ago

tested with a real validator on mainnet, seems to work: https://github.com/threefoldtech/tf_operations/issues/2812

Maybe we should make the docs clearer about which keys operators have to keep: they do not need every key they create in order to set up the validator, but for safety they should keep them all. This is probably confusing currently.

Mik-TF commented 1 month ago

Nice! Good to know.

I made an issue about key management so we can update the docs as you proposed: https://github.com/threefoldtech/grid_deployment/issues/80

coesensbert commented 1 month ago

Final docs finished: https://github.com/threefoldtech/grid_deployment/tree/development/tfchain-validator

tested and done

Mik-TF commented 1 month ago

Amazing. Then I am closing the issue! We can create new issues if adjustments are needed. Thanks for the great work.