rpcpool / yellowstone-faithful

Project Yellowstone Old Faithful is the project to make all of Solana's history accessible, content addressable and available via a variety of means.
https://old-faithful.net/
GNU Affero General Public License v3.0

Change metadata storage #144

Open linuskendall opened 1 month ago

linuskendall commented 1 month ago

Currently we store metadata as part of the main DAG. This has a few issues:

  1. The metadata is not strictly validated data, so it may differ between two Solana versions even when they process the same slots (in theory?)
  2. If the metadata is corrupted, as is the case in #129, we would need to re-generate the whole tree with all the block data.

The impact of storing the metadata in the DAG could potentially be an issue for a future proof-of-storage setup?

Proposal is:

Separate out the metadata from the current DAG. See #143 for an example idea of how this could be done.

We would move the non-validated metadata to a separate struct, as in the changed structure in #143.

This would mean that you would need to fetch at least two CIDs (transactionmeta and transaction) to fetch the full transaction.

tasks

impacts

fixing missing meta for old epochs

for older epochs we could generate just the transactionmeta data objects we need. alternatively we could generate

backwards compatibility

we could opt to just keep the same objects we have now, but when generating we would simply leave the rewards/metadata fields empty. if the field is empty then we know that the metadata exists in a separate tree.

when reading we could just ignore the rewards/metadata fields of existing objects and instead read the metadata from the special metadata CID.

this would allow us to be completely backwards compatible. we could then proceed to generate metadata trees for all past epochs and upload these separately.
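A rough sketch of what the backwards-compatible read path could look like in Go. The types and names (`Transaction`, `MetaStore`, `ResolveMeta`) are hypothetical and much simpler than the real schema. One assumption goes beyond the text above: when no separate metadata object exists yet (an old epoch whose metadata tree has not been generated), the sketch falls back to the inline field instead of ignoring it.

```go
package main

import "fmt"

// Transaction is a simplified, hypothetical version of the existing
// object. The new generator would leave Metadata empty.
type Transaction struct {
	Data     []byte
	Metadata []byte
}

// MetaStore is a hypothetical lookup from a metadata CID to its bytes.
type MetaStore map[string][]byte

// ResolveMeta prefers the separate metadata tree. Falling back to the
// inline field is an assumption of this sketch, meant for old objects
// whose metadata tree has not been uploaded yet.
func ResolveMeta(tx Transaction, metaCID string, s MetaStore) ([]byte, bool) {
	if meta, ok := s[metaCID]; ok {
		return meta, true
	}
	if len(tx.Metadata) > 0 {
		return tx.Metadata, true
	}
	return nil, false
}

func main() {
	store := MetaStore{"cid-meta-new": []byte("separate meta")}

	newTx := Transaction{Data: []byte("tx")} // metadata field left empty
	oldTx := Transaction{Data: []byte("tx"), Metadata: []byte("inline meta")}

	m1, _ := ResolveMeta(newTx, "cid-meta-new", store)
	m2, _ := ResolveMeta(oldTx, "cid-meta-old", store)
	fmt.Printf("%s | %s\n", m1, m2)
}
```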

Another option would be to create a new Block type which excludes the rewards field and a new Transaction type which excludes the metadata field. This would be a bit neater because we wouldn't have the old type around. However, the tooling would then need to support the two different "kind"s of Transactions and Blocks and use the kind field to decide which one to parse.
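Dispatching on the kind field could look roughly like the sketch below. It uses JSON purely as a stand-in encoding so the example runs with the standard library; the kind values, type names and field names are all hypothetical and are not taken from the existing schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical kind values, chosen only for this sketch.
const (
	KindLegacyTransaction = 0   // metadata stored inline
	KindSlimTransaction   = 100 // no metadata field at all
)

// envelope reads just the kind so we can pick the concrete type.
type envelope struct {
	Kind int `json:"kind"`
}

type LegacyTransaction struct {
	Kind     int    `json:"kind"`
	Data     string `json:"data"`
	Metadata string `json:"metadata"`
}

type SlimTransaction struct {
	Kind int    `json:"kind"`
	Data string `json:"data"`
}

// DecodeTransaction peeks at the kind field, then decodes into the
// matching type. Unknown kinds are an error.
func DecodeTransaction(raw []byte) (any, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return nil, err
	}
	switch env.Kind {
	case KindLegacyTransaction:
		var tx LegacyTransaction
		if err := json.Unmarshal(raw, &tx); err != nil {
			return nil, err
		}
		return tx, nil
	case KindSlimTransaction:
		var tx SlimTransaction
		if err := json.Unmarshal(raw, &tx); err != nil {
			return nil, err
		}
		return tx, nil
	default:
		return nil, fmt.Errorf("unknown kind %d", env.Kind)
	}
}

func main() {
	v, err := DecodeTransaction([]byte(`{"kind":100,"data":"abc"}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", v)
}
```

The cost the paragraph above points at is visible here: every reader needs both branches for as long as old objects exist.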

unresolved

how to use this with the standard Filecoin tooling to fetch a transaction with its transaction meta?

we could use just the transaction meta CID and then inside the TransactionMeta object either have a link (is this even allowed?) to the transaction object OR have a text string of the CID.

in this way someone could fetch the TransactionMeta object and then continue to fetch the transaction data too.
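A small sketch of the "text string of the CID" variant, where a reader starts from the TransactionMeta CID and follows the stored reference. The `Store` type, the `TransactionCID` field name and `FetchViaMeta` are hypothetical; whether a real IPLD link could be used instead is the open question above and is not answered here.

```go
package main

import "fmt"

// TransactionMeta carries the metadata plus a textual reference to the
// transaction it belongs to. TransactionCID is a hypothetical field.
type TransactionMeta struct {
	TransactionCID string
	Meta           []byte
}

// Store is a hypothetical in-memory stand-in for CID-addressed storage.
type Store struct {
	Metas map[string]TransactionMeta
	Txs   map[string][]byte
}

// FetchViaMeta starts from the meta CID only, then follows the stored
// transaction CID to get the transaction bytes.
func FetchViaMeta(s Store, metaCID string) (TransactionMeta, []byte, error) {
	meta, ok := s.Metas[metaCID]
	if !ok {
		return TransactionMeta{}, nil, fmt.Errorf("transactionmeta %s not found", metaCID)
	}
	tx, ok := s.Txs[meta.TransactionCID]
	if !ok {
		return meta, nil, fmt.Errorf("transaction %s not found", meta.TransactionCID)
	}
	return meta, tx, nil
}

func main() {
	s := Store{
		Metas: map[string]TransactionMeta{
			"cid-meta-1": {TransactionCID: "cid-tx-1", Meta: []byte("meta")},
		},
		Txs: map[string][]byte{"cid-tx-1": []byte("tx")},
	}
	meta, tx, err := FetchViaMeta(s, "cid-meta-1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s -> %s\n", meta.Meta, tx)
}
```

Note the direction of the reference: the non-validated object points at the validated one, so the transaction's own CID does not depend on the metadata.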

subset object being meta dependent?

if we want to create some kind of PoS setup, then would it be better if the Subset object was replicable?

I guess maybe it doesn't matter, since we would probably need something merklized anyway to be able to quickly verify whether a specific TX was stored or not.
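As a generic illustration of what "something merklized" buys, here is a plain SHA-256 Merkle inclusion proof in Go. This is a textbook sketch, not a proposal for the actual format: the names are hypothetical, odd nodes are paired with themselves, and there is no domain separation between leaf and inner hashes, which a real design would want.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

func leafHash(b []byte) []byte {
	s := sha256.Sum256(b)
	return s[:]
}

func hashPair(l, r []byte) []byte {
	h := sha256.New()
	h.Write(l)
	h.Write(r)
	return h.Sum(nil)
}

// ProofStep is one sibling hash and the side it sits on.
type ProofStep struct {
	Sibling []byte
	Left    bool // true if the sibling is the left-hand node
}

// BuildProof returns the Merkle root over leaves and the inclusion
// proof for leaves[idx]. It assumes len(leaves) >= 1 and a valid idx.
func BuildProof(leaves [][]byte, idx int) ([]byte, []ProofStep) {
	level := make([][]byte, len(leaves))
	for i, l := range leaves {
		level[i] = leafHash(l)
	}
	var proof []ProofStep
	for len(level) > 1 {
		if len(level)%2 == 1 {
			// Odd count: pair the last node with itself.
			level = append(level, level[len(level)-1])
		}
		sib := idx ^ 1
		proof = append(proof, ProofStep{Sibling: level[sib], Left: sib < idx})
		next := make([][]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, hashPair(level[i], level[i+1]))
		}
		level = next
		idx /= 2
	}
	return level[0], proof
}

// Verify recomputes the root from a leaf and its proof.
func Verify(root, leaf []byte, proof []ProofStep) bool {
	h := leafHash(leaf)
	for _, s := range proof {
		if s.Left {
			h = hashPair(s.Sibling, h)
		} else {
			h = hashPair(h, s.Sibling)
		}
	}
	return bytes.Equal(h, root)
}

func main() {
	txs := [][]byte{[]byte("tx-a"), []byte("tx-b"), []byte("tx-c")}
	root, proof := BuildProof(txs, 2)
	fmt.Println(Verify(root, []byte("tx-c"), proof))
	fmt.Println(Verify(root, []byte("tx-x"), proof))
}
```

The proof length grows with the logarithm of the number of transactions, which is what makes checking a single TX cheap without replicating the whole Subset.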

linuskendall commented 1 month ago

image