Block Log Storage Improvements

figitaki commented 3 years ago

We're currently using a simple script, deployed alongside each node in our infra, that does the following:

Outputs block logs to a separate stream.
Breaks out each block into a new file named by the state hash of the block.
Attempts to upload to Google Cloud Storage if that state hash has not been uploaded yet.
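
For reference, the steps above can be sketched roughly as follows. This is a minimal illustration and not the deployed script: the `state_hash` field name, the one-JSON-object-per-line format, and the `InMemoryStore` class (a stand-in for the bucket) are all assumptions.

```python
import json
from pathlib import Path


class InMemoryStore:
    """Hypothetical stand-in for the bucket; a real script would call the GCS client."""

    def __init__(self):
        self.objects = {}

    def exists(self, name):
        return name in self.objects

    def upload(self, name, data):
        self.objects[name] = data


def process_block_line(line, out_dir, store):
    """Write one block to <state_hash>.json and upload it if not stored yet.

    Returns True if this call uploaded the block, False if the state hash
    was already present in the store.
    """
    block = json.loads(line)
    state_hash = block["state_hash"]  # assumed field name
    name = f"{state_hash}.json"
    path = Path(out_dir) / name
    path.write_text(line)
    if store.exists(name):
        # Another node (or an earlier run) already uploaded this state hash.
        return False
    store.upload(name, path.read_bytes())
    return True
```

Note that whichever node uploads a state hash first determines what is kept; later uploads of the same hash are skipped.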

This has a few issues:

If the first node to upload a block has a serialization error, we lose that particular block.
Traversing this data is tedious and requires lots of manual data digging.
Not very observable in its current iteration: there are no alerts if uploading the block logs fails.

We should come up with improvements or alternatives to address these issues.

aneesharaines commented 3 years ago

@figitaki to add some more details so we can decide on a direction

figitaki commented 3 years ago

Here are the proposed directions:

Google Cloud Storage v2

The simplest option to implement, though it would not cover all the issues, is to update our current solution so that each node stores its own block logs, which prevents data loss. We should also add more logging and reporting to the block log script to improve our observability into the health of the data.

This does not solve our problems for observability and recoverability.
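
One possible reading of "each node stores its own block logs" is to put a node identifier in the object name, so a bad upload from one node cannot shadow a good one from another. A rough sketch, where the `node_id/state_hash.json` layout, the logger name, and the dict-like `store` are all hypothetical:

```python
import logging

logger = logging.getLogger("block_log_uploader")  # hypothetical logger name


def object_name(node_id, state_hash):
    """Per-node object name so uploads from different nodes never collide."""
    return f"{node_id}/{state_hash}.json"


def upload_block(store, node_id, state_hash, data):
    """Store one block under a per-node name and log the outcome.

    `store` is any dict-like object standing in for the bucket.
    Returns True on success, False if the upload raised.
    """
    name = object_name(node_id, state_hash)
    try:
        store[name] = data
    except Exception:
        # Logged with a traceback so an alert can be attached to these lines.
        logger.exception("block log upload failed: %s", name)
        return False
    logger.info("block log uploaded: %s", name)
    return True
```

With this layout two nodes uploading the same state hash produce two objects, and a failed upload leaves a log line instead of failing silently.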

Volume Mount

To make loading the block logs into an Archive node instance easier, we could write the files to a separate volume that could be attached to an archive container when we need to recover from this data. This would extend the previous solution and would still not provide much improvement in usability for checking data integrity / hygiene.

Postgres Database

Since we already have an existing Postgres instance for the archive node, we could move the block logs data into a separate database in that Postgres instance, which would allow us to query this data using SQL. While this would provide the greatest improvement to the usability of this data, it would also require the greatest lift, since we would need to update the recovery path in the actual archive node to support Postgres.

The block logs table would consist of a few metadata fields (stateHash, hash, createdAt) and the contents as a string.
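
As a rough sketch of what that table and a typical query could look like: the table name `block_logs`, the snake_case column names, and the absence of a primary key are assumptions, and `sqlite3` from the Python standard library stands in for Postgres so the example runs without a server.

```python
import sqlite3

# Assumed schema: the four fields from the proposal, all stored as text here.
# In Postgres, created_at would more naturally be a TIMESTAMPTZ column.
SCHEMA = """
CREATE TABLE block_logs (
    state_hash TEXT NOT NULL,
    hash       TEXT NOT NULL,
    created_at TEXT NOT NULL,
    contents   TEXT NOT NULL
)
"""


def open_block_logs_db():
    """Create an in-memory database with the block_logs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def insert_block_log(conn, state_hash, block_hash, created_at, contents):
    """Insert one block log row using a parameterized statement."""
    conn.execute(
        "INSERT INTO block_logs (state_hash, hash, created_at, contents) "
        "VALUES (?, ?, ?, ?)",
        (state_hash, block_hash, created_at, contents),
    )


def copies_per_state_hash(conn):
    """Return (state_hash, row_count) pairs, ordered by state hash."""
    return conn.execute(
        "SELECT state_hash, COUNT(*) FROM block_logs "
        "GROUP BY state_hash ORDER BY state_hash"
    ).fetchall()
```

A query like `copies_per_state_hash` is the kind of integrity check that is tedious against loose files in a bucket but a single statement in SQL.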

shimkiv commented 1 year ago

@deepthiskumar can you please make a proper assignment when time allows?

robinbb commented 1 year ago

@deepthiskumar Mina Foundation has arranged to have developers from Granola Systems (https://github.com/orgs/Granola-Team/) help with the archive node. Can @mxnkarou take this issue?

MinaProtocol / mina