MinaProtocol / mina

Mina is a cryptocurrency protocol with a constant size blockchain, improving scaling while maintaining decentralization and security.
https://minaprotocol.com
Apache License 2.0

Archive Resilience #7111

Closed · figitaki closed 2 years ago

figitaki commented 3 years ago

Background

Traditional blockchain projects carry their history around with them; that history is the blockchain. Any fully synced node has the information necessary to audit balance transfers on the protocol. The drawback is that most clients don't care about this history. Mina uses zk-SNARKs to compress the blockchain down to a constant size, effectively forgetting the specific balance transfers that occurred at any given time. This is great for the common case, but bad in the special cases. We created the (optional) archive node process to preserve this history for the Mina protocol: the archive node attempts to store every block that Mina sees as valid in a PostgreSQL database.

This is great when things are working, but sometimes things go wrong: nodes can go down, and bugs can creep in that break our integration. The protocol itself is resilient to such scenarios, but our archival infrastructure currently is not.

This is unacceptable for mainnet launch. Full archive data is important for:

  1. Clients to audit the balance transfers on chain — this is important for custodians, exchanges, and professional node operators (via Rosetta or manually)
  2. Hard forks — We take advantage of the archive node to mold our new forked genesis ledger into a usable state

We must have a good story for bootstrapping, maintaining, and recovering archive data for these purposes.

Prior Art

  1. In the past, we deployed a "points-hack" service that dumped GraphQL block JSON to cloud storage.
  2. Matthew created a tool a while back that can replay blocks from a log file.
  3. Luckily, with the help of community members such as @Gareth, we have been able, with some effort, to recover most data on testnets when we noticed that some of our storage had failed.

Outstanding Problems

Proposal

Tackle resiliency with redundancy. Specifically, we should be redundant across two dimensions: (1) horizontal scaling of archive processors and (2) additional sources for recovering data into the database:

  1. Scale up the existing archive node processors in our infrastructure. This amounts to just running more than one archive node processor on more than one node in our cluster. PostgreSQL will handle concurrent duplicate writes idempotently for us (see the sketch after this list). We should also detect missing subchains.
  2. Add support for recovering block data from both GraphQL data and logs.
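
A minimal sketch of the idempotent-write pattern from item (1), assuming a hypothetical `blocks` table keyed by `state_hash` (the real archive schema may differ):

```python
# Sketch of idempotent block writes from concurrent archive processors.
# The `blocks` table, `state_hash` key, and DSN are illustrative
# assumptions, not the real archive schema.
import json

import psycopg2

def write_block(conn, block: dict) -> None:
    """Insert a block; a duplicate write from another processor is a no-op."""
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO blocks (state_hash, block_json)
            VALUES (%s, %s)
            ON CONFLICT (state_hash) DO NOTHING
            """,
            (block["state_hash"], json.dumps(block)),
        )

conn = psycopg2.connect("dbname=archive")  # placeholder DSN
write_block(conn, {"state_hash": "example-hash", "height": 1})
```

The `ON CONFLICT ... DO NOTHING` clause turns a duplicate insert from a second processor into a no-op instead of a unique-constraint error, which is what makes running several processors against one database safe.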

Regarding the additional sources of block data, we have already done work to recover node data from GraphQL: see this PR. We have also built out support for logging block data in a form that is sufficient to recover from: see this PR.

Since both the GraphQL data and the new block logging format are JSON, the simplest and most resilient way for us to store them would be to pipe them into MongoDB (sketched below). This way we don't need to merge the data together or deal with converting from JSON into SQL or any other typed format.
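
A minimal sketch of that ingestion path using pymongo; the database, collection, and field names are illustrative assumptions:

```python
# Sketch of piping raw JSON block data into MongoDB as a backup store.
# Database, collection, and field names here are illustrative assumptions.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
backups = client["archive_backups"]["blocks"]

def store_block_json(block: dict, source: str) -> None:
    """Upsert one raw block document, tagged with where it came from."""
    backups.update_one(
        {"state_hash": block["state_hash"], "source": source},  # e.g. "graphql" or "block_log"
        {"$set": {"block": block}},
        upsert=True,
    )
```

Upserting on `state_hash` plus `source` keeps the two backup streams from clobbering each other, so merging is deferred until recovery time.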

Then, if we reach a point of catastrophic failure of the primary archive node, we can fall back to our backup data sources and recover from them in the following order:

  1. Archive Node
  2. Block logs
  3. GraphQL dump
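
A minimal sketch of that fallback order; each `recover_from_*` helper is a hypothetical stand-in for a real per-source recovery tool:

```python
# Sketch of the fallback order above; each recovery function is a
# hypothetical stand-in for a real per-source recovery tool.
from typing import Callable, List, Tuple

def recover(sources: List[Tuple[str, Callable[[], bool]]]) -> str:
    """Try each backup source in order of fidelity until one succeeds."""
    for name, attempt in sources:
        if attempt():
            return name
    raise RuntimeError("all backup data sources exhausted")

# Ordering mirrors the list above:
# recover([
#     ("archive node", recover_from_archive_replica),
#     ("block logs", recover_from_block_logs),
#     ("GraphQL dump", recover_from_graphql_dump),
# ])
```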

To address the issue of changing the schema alongside a hard fork, we should always decouple schema migrations from hard forks. This allows us to migrate all existing archive databases well before a hard fork (see the sketch below).
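
As one possible shape for that decoupling (a sketch; the `schema_version` table and `migrations/*.sql` layout are assumptions, not existing tooling):

```python
# Sketch of running versioned migrations independently of any fork.
# The `schema_version` table, migrations/*.sql layout, and DSN are
# assumptions for illustration only.
import pathlib

import psycopg2

def apply_pending_migrations(conn, migrations_dir: str = "migrations") -> None:
    """Apply numbered .sql migrations not yet recorded as applied."""
    with conn, conn.cursor() as cur:
        cur.execute(
            "CREATE TABLE IF NOT EXISTS schema_version (version text PRIMARY KEY)"
        )
        cur.execute("SELECT version FROM schema_version")
        applied = {row[0] for row in cur.fetchall()}
    for path in sorted(pathlib.Path(migrations_dir).glob("*.sql")):
        if path.stem in applied:
            continue
        # Each migration commits on its own, well ahead of the fork itself.
        with conn, conn.cursor() as cur:
            cur.execute(path.read_text())
            cur.execute("INSERT INTO schema_version VALUES (%s)", (path.stem,))

apply_pending_migrations(psycopg2.connect("dbname=archive"))  # placeholder DSN
```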

Tasks and Projects

Open Questions

bkase commented 3 years ago

How should we handle migrations to the archive schema?

We mentioned this during our discussion, but we should always migrate before a hard fork. I think it's worth capturing that explicitly. We can share the migration scripts before we release a hard fork as well.

What level of ongoing support & documentation do we need to provide for the block logging format?

I think we should share a sample and explain the purpose of this redundant data, but make clear that the representations here map to our implementation details and are subject to change at any time without notice. For a more backwards-compatible and reliable source, use GraphQL.

More tasks to add:

nholland94 commented 3 years ago

I think this is a great solution to storing our archive backups.

One note: I think we should explicitly deploy separate database instances for each of the resiliency sources. As in, one for the points hack, and one for the block logs. That way, if one goes down for whatever reason, it does not directly compromise the integrity of the other backup system.

garethtdavies commented 3 years ago

I would add to the above that having others running an archive node would obviously help, as this would be a simpler recovery source? The information needed to do this is currently really hard to find.

bkase commented 3 years ago

TODO: Add details around our temporary Google Cloud Storage writing solution and start a discussion around the tradeoffs between keeping that and using MongoDB as specified.

psteckler commented 3 years ago

The block logging format has much more detail than the output of the missing-subchains tool, so there isn't really commonality to exploit for the purposes of getting block information into an archive database.

We'll need separate mechanisms to ingest that information from these sources into an archive db.

/ping @bkase

psteckler commented 3 years ago

I think the phrase "belt-and-suspenders" should appear in this RFC somewhere.

yourbuddyconner commented 3 years ago

Just to weigh in here from the professional operator perspective: it is less than optimal to require two different databases (or, more generally, multiple sources of block data) to provide redundancy for the Archive process. This also adds complexity if the archive and/or JSON block schema is ever mutated down the line and multiple migrations (one for PostgreSQL, one for Mongo) have to be managed.

I really liked what I was reading from Deepthi here, and was hoping to see more development on making the Archive process more resilient to downtime as opposed to adding ways to reconcile the data later when you inevitably have downtime.

Either way, good RFC describing the tradeoffs @figitaki.

p-shahi commented 2 years ago

Most of these items have been addressed in recent tooling.