hyperledger / indy-node

The server portion of a distributed ledger purpose-built for decentralized identity.
https://wiki.hyperledger.org/display/indy
Apache License 2.0

Using docker swarm #1614

Closed SalimiHabib closed 1 year ago

SalimiHabib commented 4 years ago

Is it possible to run Indy nodes in Docker Swarm for a multi-host pool?

dhh1128 commented 4 years ago

Salimi: Indy nodes are run with plain Docker all the time. Instructions on how to do this are found in the INDY SDK, here: https://github.com/hyperledger/indy-sdk#how-to-start-local-nodes-pool-with-docker
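
For reference, the core of those instructions boils down to two commands (assuming the indy-sdk repo is checked out and `ci/indy-pool.dockerfile` is still the path its README uses):

```sh
# Build an image that runs a local sandbox pool of Indy nodes
docker build -f ci/indy-pool.dockerfile -t indy_pool .

# Start it, exposing the node ports (9701-9708) the sandbox pool listens on
docker run -itd -p 9701-9708:9701-9708 indy_pool
```

That gives you a development pool on a single host; it is not the same thing as distributing nodes across a Swarm.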

However, we have not attempted to run a pool using Docker Swarm. The reason is that the pool has its own high-availability logic, and trying to combine it with another HA strategy is problematic. The two tools would interfere with one another. (Indy pools assume a stable IP address+port for each node, and assume once they have received an ACK of a received message that a node has persisted the state it is responsible for. This makes hot swapping of nodes a bit problematic.)

SalimiHabib commented 4 years ago

Daniel Hardman: Based on the points you mentioned, it seems that, in addition to Swarm, running Indy behind a load balancer (such as Envoy) may also cause problems, ambiguities, or drawbacks (because of Indy's HA strategy). If so, is there any solution?

Thank you for your time

dhh1128 commented 4 years ago

An individual node of Indy isn't supposed to be highly available, any more than an individual stripe of RAID-ed storage is supposed to be highly available. The entire Indy pool is highly available as a unit. That is what the byzantine fault tolerant consensus algorithm of the pool guarantees. It is a very strong high-availability guarantee; it is highly available not just in the face of network downtime and brownouts, but also in the face of hacking, malicious sysadmins, and so forth. In a given pool, if there are 3*N+1 nodes, then up to N can be down or malicious and the pool will remain usable and tamper-proof. So for a pool of 25 nodes, for example, 8 can be down at a time. Up to 2N (2/3 of the entire pool) can be down or hacked and remain tamper-proof. And even if all but one node is hacked, tampering will still produce evidence and warnings. Additionally, all responses from the pool contain a state proof that prevents a malicious node from lying about the state of the blockchain, ever.
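
In arithmetic terms (a quick illustrative sketch, not code from Indy itself), the number of faulty nodes a pool tolerates is just the pool size minus one, divided by three, rounded down:

```python
def fault_tolerance(pool_size: int) -> int:
    """Maximum number of down or malicious nodes a 3N+1 BFT pool can tolerate."""
    return (pool_size - 1) // 3

for n in (4, 13, 25):
    print(f"{n} nodes -> tolerates {fault_tolerance(n)} faulty nodes")
# 4 nodes -> 1, 13 nodes -> 4, 25 nodes -> 8 (the example above)
```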

In order to achieve this guarantee of byzantine fault tolerance, the nodes in the pool must talk to one another, and they must know what every other node's committed state is. Putting an individual node behind a load balancer would make that node more highly available, but it would subvert the knowledge that the rest of the pool has about the state of that node, because the state of the node behind the load balancer could change at any time due to failover. It is better to let the node fail than to try to keep it up all the time. When a node fails, all the other nodes in the system discover the failure, and compensate for it. When a node comes back online after an outage, Indy has a catchup mechanism that brings it back into synchronization automatically. The rest of the world (the clients of the blockchain) doesn't ever need to know or care about any of this.

Putting the pool as a whole behind a load balancer would not be helpful, either, because clients of a pool must talk to more than one node in the pool to submit a write transaction. This happens automatically and transparently if you are using libindy in your client software; if you submit a transaction to a pool with, say, 50 nodes, libindy automatically submits the transaction to 18 different nodes to guarantee that at least one healthy node sees it. The transaction request is then propagated in a tamper-resistant way to all the other nodes of the cluster, such that all the nodes (even potentially malicious ones) must acknowledge its legitimacy and certify that they've made the correct update to the blockchain.
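
To make that concrete, here is a minimal sketch of a write using libindy's Python wrapper (python3-indy). The pool name, wallet id/key, genesis file path, and the conventional test-pool trustee seed are all placeholders, but the point stands: the client hands the request to libindy and never addresses an individual node or a load balancer.

```python
import asyncio
import json

from indy import did, ledger, pool, wallet


async def write_nym():
    # Point libindy at the pool via its genesis transactions file.
    await pool.set_protocol_version(2)
    await pool.create_pool_ledger_config(
        "sandbox", json.dumps({"genesis_txn": "./pool_transactions_genesis"}))
    pool_handle = await pool.open_pool_ledger("sandbox", None)

    wallet_config = json.dumps({"id": "demo_wallet"})
    wallet_creds = json.dumps({"key": "demo_wallet_key"})
    await wallet.create_wallet(wallet_config, wallet_creds)
    wallet_handle = await wallet.open_wallet(wallet_config, wallet_creds)

    # A DID with write permission (the seed below is the usual sandbox trustee seed).
    trustee_did, _ = await did.create_and_store_my_did(
        wallet_handle, json.dumps({"seed": "000000000000000000000000Trustee1"}))
    new_did, new_verkey = await did.create_and_store_my_did(wallet_handle, "{}")

    # Build the NYM transaction and submit it; libindy chooses which nodes to
    # contact and collects the nodes' replies before returning.
    request = await ledger.build_nym_request(trustee_did, new_did, new_verkey, None, None)
    response = await ledger.sign_and_submit_request(
        pool_handle, wallet_handle, trustee_did, request)
    print(response)

    await wallet.close_wallet(wallet_handle)
    await pool.close_pool_ledger(pool_handle)


asyncio.run(write_nym())
```

Because libindy reads the pool's genesis file and talks to the nodes directly, there is nothing useful for an external load balancer to add in front of the pool.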

This is a long way of saying, "I don't understand what problem you are trying to solve when you ask, 'is there any solution?'" To the best of my knowledge, all meaningful problems of high availability and high security are already solved by Indy's design, so combining it with generic solutions to either of these problems seems like A) a waste of effort; B) unnecessary complexity; and C) likely to degrade Indy's performance or complicate its deployment.

SalimiHabib commented 4 years ago

Daniel Hardman: It is all about money. Currently we are using Microsoft Orleans (as the back-end database, with its own HA) and SignalR as the front-end; this package runs behind Envoy proxy servers. Orleans does not have any public access and relies completely on SignalR (which adapts well to Envoy). But we have new players (Indy in the server farm, and Aries on the server and in public), and we have to think about how their bolts fit into the current structure. There are a lot of questions about this modification, and we are trying to set up a lab to simulate situations that can happen during the service; we hope we will not need extra hardware and maintenance for the new players.

You guys have done a great job in the industry; it is essential.

We have started to analyze the environment and do not have any specific question yet, but my request about a solution was for anything that has been done before, or any generic guide that could save us time (we have just started).