IntersectMBO / cardano-db-sync

A component that follows the Cardano chain and stores blocks and transactions in PostgreSQL
Apache License 2.0

Cardano DB Sync

Note: Anyone wishing to build and run anything in this repository should avoid the master branch and build/run from the latest release tag.

Purpose

Cardano DB Sync follows the Cardano chain, extracts information from the chain and from an internally maintained copy of the ledger state, and inserts it into a PostgreSQL database. SQL queries can then be written directly against the database schema or embedded in any language that has libraries for interacting with an SQL database.

Examples of what someone can do via an SQL query against a Cardano DB Sync instance fully synced to a specific network are:

Example SQL queries are available at Example Queries. You can also find some DB Sync best practices here.
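As a rough illustration of a query embedded in a host language, the sketch below sums transaction fees per epoch from Python. It is not taken from the project's Example Queries. An in-memory SQLite database stands in for PostgreSQL so the snippet runs without a synced instance, and the two tables are heavily reduced stand-ins modelled on the `block` and `tx` tables; the column subset, the sample rows and the helper name `fees_per_epoch` are assumptions made for the example. Against a real instance the same SQL would be sent through a PostgreSQL driver instead.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the db-sync PostgreSQL database.
conn = sqlite3.connect(":memory:")

# Assumption: reduced stand-ins for the `block` and `tx` tables with made-up rows.
conn.executescript("""
    CREATE TABLE block (id INTEGER PRIMARY KEY, epoch_no INTEGER, block_no INTEGER);
    CREATE TABLE tx (id INTEGER PRIMARY KEY, block_id INTEGER REFERENCES block(id), fee INTEGER);
    INSERT INTO block VALUES (1, 0, 1), (2, 0, 2), (3, 1, 3);
    INSERT INTO tx VALUES (1, 1, 170000), (2, 2, 180000), (3, 3, 200000);
""")


def fees_per_epoch(connection):
    """Return (epoch_no, total_fee) pairs, ordered by epoch (hypothetical helper)."""
    return connection.execute("""
        SELECT block.epoch_no, SUM(tx.fee)
        FROM tx
        JOIN block ON tx.block_id = block.id
        GROUP BY block.epoch_no
        ORDER BY block.epoch_no
    """).fetchall()


print(fees_per_epoch(conn))
```

With the sample rows above, the two epoch 0 transactions are summed together and the epoch 1 transaction is reported on its own.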

Architecture

The cardano-db-sync component consists of a set of components:

The db-sync node is written in a highly modular fashion to allow it to be as flexible as possible.

The cardano-db-sync node connects to a locally running cardano-node (i.e. one connected to other nodes in the Cardano network over the internet with TCP/IP) using a Unix domain socket. It retrieves blocks, updates its internal ledger state and stores parts of each block in a local PostgreSQL database. The database does not store things like cryptographic signatures, but it does store enough information to follow the chain of blocks and look at the transactions within blocks.

The PostgreSQL database is designed to be accessed in a read-only fashion from other applications. The database schema is highly normalised which helps prevent data inconsistencies (specifically with the use of foreign keys from one table to another). More user friendly database queries can be implemented using Postgres Views to implement joins between tables.
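The two points in the paragraph above, foreign keys guarding consistency and views hiding joins, can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite stands in for PostgreSQL, the tables are reduced stand-ins modelled on `block` and `tx`, and the view name `tx_with_block` is hypothetical rather than something the project ships.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the db-sync PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE block (id INTEGER PRIMARY KEY, block_no INTEGER NOT NULL);
    CREATE TABLE tx (
        id INTEGER PRIMARY KEY,
        block_id INTEGER NOT NULL REFERENCES block(id),
        fee INTEGER NOT NULL
    );
    -- Hypothetical view: callers read one flat relation instead of writing the join.
    CREATE VIEW tx_with_block AS
        SELECT tx.id AS tx_id, block.block_no, tx.fee
        FROM tx JOIN block ON tx.block_id = block.id;
    INSERT INTO block VALUES (1, 100);
    INSERT INTO tx VALUES (1, 1, 170000);
""")

# Reading through the view needs no knowledge of how the tables are linked.
view_rows = conn.execute("SELECT tx_id, block_no, fee FROM tx_with_block").fetchall()

# A transaction pointing at a block that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO tx VALUES (2, 999, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(view_rows, orphan_rejected)
```

Because applications are expected to access the database read-only, a view like this is something an operator would add alongside the schema, leaving the tables that db-sync writes to untouched.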

System Requirements

The system requirements for cardano-db-sync (with both db-sync and the node running on the same machine) are:

The recommended configuration is to run db-sync and the PostgreSQL server on the same machine. During syncing (getting historical data from the blockchain) there is a HUGE amount of data traffic between db-sync and the database. Traffic to a local database is significantly faster than traffic to a database on the LAN or at a remote location.

When building an application that will query the database, remember that for fast queries, low-latency disk access is far more important than high throughput (assuming the minimum IOPS above is met).

How to Contact the Cardano DB Sync Team

You can discuss development or find help at the following places:

Installation

Install db-sync with one of the following methods:

Once installed, start db-sync by following the Running Guide.

Troubleshooting

If you have any issues with this project, consult the Troubleshooting page for possible solutions.

Further Reading