bartossh / Computantis

Computantis is a backbone service for creating secure, reliable and performant solutions for transaction exchange and Byzantine fault-tolerant systems.
https://bartossh.github.io/Computantis/
GNU General Public License v3.0

Increase transaction processing throughput. #200

Closed bartossh closed 11 months ago

bartossh commented 1 year ago

For comparison, transactions per second by blockchain implementation:

We need to:

Meeting required.

bartossh commented 1 year ago

Overcoming the transaction throughput bottleneck.

We are processing around 750 transactions per second with two nodes (1 CPU, 2 GB RAM for the DB), and it is clear the DB is the bottleneck. To prove the point, run 2 nodes and then increase their number: you will notice that the nodes still have resources available, while the DB starts to use a lot of CPU and more RAM. We have a few options to overcome that restriction:

  1. Migrate from PostgreSQL to CockroachDB (written in Go) - simple, as the syntax is the same, but the DB runs in the cloud. The pros are that:
    • It scales automatically.
    • Migrations, replication, location-aware replication etc. are included and managed.
    • It has ACID.
    • It is resilient to failure and data loss.
    • It can separate out some parts of the DB, like node synchronization.
  2. Embed SQLite into the binary as a persistent local DB and keep PostgreSQL as our main blockchain DB:
    • If the service goes down, the data persist locally, and after it is back up the transactions do not need to be retrieved again.
    • It is a local persistent cache that allows for separate queries for transactions calculated by the single node.
    • It is embedded, so it can be moved with the binary elsewhere, and shall not add more than 10 MB with the default config. A 10-hour run at 1 TRX per second takes around 27 MB, so 10 MB is exaggerated a lot.
    • It will be cleaned after the transactions become part of the blockchain.
  3. Embedded BadgerDB - a very fast persistent key-value database (written in Go).
    • All of the advantages of (2.).
    • It is faster and even lighter.
    • The downside is that we have to write additional queries.
  4. A new service running alongside the node, serving as an embedded database (2. or 3.).
    • Guarantees storage independent of the client node in case of a failure that makes the service inaccessible, e.g. someone taking over the server.
    • Allows for adjusting to the client's needs.
    • Enhances local node scalability; maybe a few nodes can share a local DB.
    • Adds complexity.
  5. My favourite is a composite: CockroachDB and BadgerDB.
    • Both are solid, concurrent and fast - written in Go.
    • Let CockroachDB scale itself according to the network challenges (location, collections, data backup, resilience, reads, writes etc.).
    • Embed the database into the node so data are persistent, but add node redundancy using Kubernetes, so that one node is the main node serving the traffic and the other assists (saving local transactions). That kind of architecture allows for:
    • Fast response when a node goes down.
    • Data redundancy locally on the node.
    • Can be an opt-in solution that the client needs to pay for.
    • Has a fairly low resource cost: the Go binary uses 8 - 15 MB of RAM on idle.
    • Each node queries the main database (still redundant, sharded, fast and location-aware) only when transactions are forged into a block and added to the blockchain.
    • The sync database can be a redundant cache, speeding the process up a lot.
    • The main gains are:
      • Local redundancy
      • Node performance independent of the network and main database (if main db scaling is done properly)
      • Cheaper and less resource-hungry main database.
      • Very powerful for things like encrypted, MAC-authenticated pub-sub.
      • Offering independence from the network for client privacy.
      • Challenging, and something that is not offered on the market (in that form), which can create market demand.
    • The main drawbacks are:
      • More complex local node repository access.
      • More complex networking.
      • More complex network setup.
      • More complex validator - central node communication (we have to indicate which node owns a transaction while it is in the awaiting state).
      • Large refactor from the point of view of the central node, less on the validator, and a little on the client node.
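The embedded-cache flow shared by options (2.), (3.) and (5.) could be sketched roughly like this; a map-backed store stands in for BadgerDB/SQLite, and the `Tx`/`LocalCache` names are illustrative, not part of the codebase:

```go
package main

import (
	"fmt"
	"sync"
)

// Tx is an awaiting transaction, keyed by its hash.
type Tx struct {
	Hash string
	Data []byte
}

// LocalCache stands in for the embedded store (BadgerDB / SQLite).
// Awaiting transactions live here; the main DB is touched only
// when a block is forged.
type LocalCache struct {
	mu  sync.Mutex
	txs map[string]Tx
}

func NewLocalCache() *LocalCache {
	return &LocalCache{txs: make(map[string]Tx)}
}

// Put stores an awaiting transaction locally - no round trip to the main DB.
func (c *LocalCache) Put(t Tx) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.txs[t.Hash] = t
}

// Forge drains the given hashes into a batch destined for the main DB
// and removes them from the local cache, mirroring "cleaned after the
// transactions become part of the blockchain".
func (c *LocalCache) Forge(hashes []string) []Tx {
	c.mu.Lock()
	defer c.mu.Unlock()
	batch := make([]Tx, 0, len(hashes))
	for _, h := range hashes {
		if t, ok := c.txs[h]; ok {
			batch = append(batch, t)
			delete(c.txs, h)
		}
	}
	return batch
}

func main() {
	cache := NewLocalCache()
	cache.Put(Tx{Hash: "a1", Data: []byte("transfer 1")})
	cache.Put(Tx{Hash: "b2", Data: []byte("transfer 2")})
	block := cache.Forge([]string{"a1", "b2"})
	fmt.Printf("forged %d transactions into the block\n", len(block))
}
```

The point of the sketch is the access pattern, not the store: every awaiting transaction is absorbed locally, and the expensive main-DB write happens once per forged block instead of once per transaction.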
bartossh commented 1 year ago

@kubagruszka @kmroz @dmatusiewicz-consult-red What do you think about some proxy repo replication:

SQL LEDGE replication package

bartossh commented 1 year ago

What if we use Redis with high redundancy for all ephemeral data such as:

The high redundancy is offered by the Kubernetes cluster securing the Redis processes (we can write to all the Redis nodes and read from the least used one).
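The write-to-all, read-from-the-least-used pattern boils down to a pure replica-selection step, which could be sketched like this (the `Replica` type and `PickLeastUsed` name are illustrative, and the load metric is assumed to come from monitoring, e.g. connected clients or ops/sec):

```go
package main

import "fmt"

// Replica describes one Redis process in the Kubernetes cluster,
// with a rough load metric reported by monitoring.
type Replica struct {
	Addr string
	Load int
}

// PickLeastUsed returns the address of the least-loaded replica to
// read from; writes still go to every replica for redundancy.
func PickLeastUsed(replicas []Replica) (string, bool) {
	if len(replicas) == 0 {
		return "", false
	}
	best := replicas[0]
	for _, r := range replicas[1:] {
		if r.Load < best.Load {
			best = r
		}
	}
	return best.Addr, true
}

func main() {
	rs := []Replica{
		{Addr: "redis-0:6379", Load: 42},
		{Addr: "redis-1:6379", Load: 7},
		{Addr: "redis-2:6379", Load: 19},
	}
	addr, _ := PickLeastUsed(rs)
	fmt.Println("read from", addr)
}
```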

bartossh commented 1 year ago

Let's not rule out Postgres as a single-store solution. We can use unlogged tables in Postgres, which generate no WAL and are faster to update.

  1. Postgres unlogged table advantages:
    • Massive improvements to write performance (as seen below).
    • Less vacuum impact (because vacuum changes are writes that also end up in the WAL stream).
    • Less total WAL (leading to less traffic, smaller backups, etc.)
  2. Postgres unlogged table disadvantages:
    • Tables are truncated on crash recovery. No durability.
    • Unlogged tables can only be accessed on the primary, not on the replicas.
    • Unlogged tables can NOT be used in logical replication or physical backups.

Unlogged tables may be a valuable use case for:

We will still have tables separated:

bartossh commented 11 months ago

Irrelevant, as we are transitioning to a DAG protocol that will solve most of the issues regarding transaction throughput.