ethereum / go-ethereum

Go implementation of the Ethereum protocol
https://geth.ethereum.org
GNU Lesser General Public License v3.0

Research transactional database support #22143

Open ligi opened 3 years ago

ligi commented 3 years ago

This issue is about exploring databases with a transactional API. Go-ethereum uses leveldb, which does not have a transactional API, and almost all database accesses are direct. This is a simple programming model, but it comes with a downside: if geth crashes between writes, the database may be left in an inconsistent state. This is usually not a problem because the writes are ordered correctly, but we have occasionally seen corruption issues that might be related to this.

Restructuring the database accesses around a transactional API would also allow us to check out other embedded databases such as boltdb, badger or LMDB.

DGKSK8LIFE commented 3 years ago

I know this is far out, but relational db support also interests me. Maybe we can take a look at MySQL?

harkal commented 3 years ago

A good candidate to consider is probably EbakusDB. It's a very lightweight and performant DB that I implemented for Ebakus (which is based on go-ethereum code). From the readme file:

_Each smart contract in ebakus has its own schema defined database (ESDD). This database can support any number of tables with typed fields and indexes. A smart contract is able to perform the following operations on the data:

- Create/Drop tables
- Create/Drop indexes on specific fields
- Retrieve/update/delete single or multiple rows of data.
- Do ordered range queries on these data.

The ebakus software makes sure that the data are stored in such a way in order to support the above operations in the most efficient way. The smart contract should not need to implement most common query types by itself.

The EbakusDB layer is providing to the ebakus blockchain a very fast database layer that supports O(1) time and space complexity snapshots. This is essential to the operation of a blockchain system that has requirements for querying old block states. The database achieves high performance by being aware of the transactional log functionality that the layer above it is using and not reimplementing it itself. Therefore achieving ACID compliance without sacrificing performance.

Smart contracts deployed in Ethereum compatibility mode will not be able to make use of the ESDD, hence will not be able to benefit from the extra functionality and performance._

The introduction post I did at an early stage of the project: https://harkal.medium.com/ebakusdb-a-database-for-blockchain-systems-168339d5010c

DGKSK8LIFE commented 3 years ago

If you want a scalable key-value store, maybe etcd is a good option. Kubernetes is built on top of it, and it's purpose-built for distributed systems.

ligi commented 3 years ago

related #15717

fjl commented 3 years ago

Every time we talk about this, @karalabe voices concerns regarding two things:

It would be nice to narrow the scope of the initial prototype to something smaller than 'all DB operations in Geth'. The part of core where safety matters the most is the state committer (in https://github.com/ethereum/go-ethereum/blob/master/trie/database.go). We could try to create a prototype where DB transactions are only used in this part, for example.
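A rough sketch of how such a narrowed prototype might be shaped: only the committer depends on a transactional interface, while every other caller keeps using direct writes. All names here (`Writer`, `Transactor`, `memDB`, `commitNodes`) are hypothetical and do not correspond to anything in `trie/database.go`; the backend is an in-memory stand-in, so it shows the all-or-nothing behaviour but not on-disk atomicity.

```go
package main

import "fmt"

// Writer is the only capability the committer needs inside a transaction.
// Hypothetical interface, defined here purely for illustration.
type Writer interface {
	Put(key, value []byte) error
}

// Transactor is a hypothetical interface that only the state committer
// would depend on; the rest of the code base would not need it.
type Transactor interface {
	Update(fn func(w Writer) error) error
}

// memDB is an in-memory stand-in for a real backend.
type memDB struct {
	data map[string][]byte
}

// Put is the existing direct-write path, left unchanged for other callers.
func (db *memDB) Put(key, value []byte) error {
	db.data[string(key)] = append([]byte(nil), value...)
	return nil
}

// txBuffer collects the writes made inside Update.
type txBuffer struct {
	pending map[string][]byte
}

func (b *txBuffer) Put(key, value []byte) error {
	b.pending[string(key)] = append([]byte(nil), value...)
	return nil
}

// Update applies the buffered writes only if fn returns nil.
func (db *memDB) Update(fn func(w Writer) error) error {
	buf := &txBuffer{pending: make(map[string][]byte)}
	if err := fn(buf); err != nil {
		return err
	}
	for k, v := range buf.pending {
		db.data[k] = v
	}
	return nil
}

// commitNodes writes all dirty nodes and the root marker in one transaction,
// so a failure partway through leaves no partial state behind.
func commitNodes(db Transactor, root []byte, dirty map[string][]byte) error {
	return db.Update(func(w Writer) error {
		for hash, blob := range dirty {
			if len(blob) == 0 {
				return fmt.Errorf("empty node %x", hash)
			}
			if err := w.Put([]byte(hash), blob); err != nil {
				return err
			}
		}
		return w.Put([]byte("root"), root)
	})
}

func main() {
	db := &memDB{data: make(map[string][]byte)}
	_ = db.Put([]byte("unrelated"), []byte("direct write")) // non-transactional caller

	// One node is invalid, so the whole commit is rejected.
	err := commitNodes(db, []byte("n1"), map[string][]byte{"n1": []byte("a"), "n2": nil})
	fmt.Println(err != nil, len(db.data)) // prints "true 1"
}
```

The point of the narrow `Transactor` interface is that the prototype could be evaluated against one backend without touching the remaining direct database accesses.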