Add missing data

tdroxler commented 4 months ago

Big PR, but don't freak out: lots of it is moving functions around and adding fields to tables.

There are 3 topics in this PR:

Remove the `block_deps` table

When we first started implementing the explorer-backend, the idea for storing the `deps: ArraySeq[BlockHash]` was to use the traditional SQL way: a separate table with one row per element of the array, so we could query and join on it. It turns out we only use that table to get the deps when fetching a block, so the deps can simply be stored as serialized data in the block table. First, fetching a block is more efficient, since there is no need to query another table. But moreover... the `block_deps` table was actually our biggest table :scream: :scream: :scream:

I'm still syncing a new db from scratch with this PR to see the difference, but the serialized data in the block table should certainly take much less space than the `block_deps` table.
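To illustrate the idea of keeping the deps as serialized data in the block row, here is a minimal sketch. It is not the explorer-backend implementation: the `blocks` table, the fixed 32-byte hash size, the concatenation encoding, and the use of SQLite are all assumptions made to keep the example self-contained.

```python
import sqlite3
from typing import List, Optional

HASH_SIZE = 32  # assumption: every block hash is 32 bytes


def serialize_deps(deps: List[bytes]) -> bytes:
    """Pack fixed-size hashes back to back into a single blob."""
    if any(len(dep) != HASH_SIZE for dep in deps):
        raise ValueError(f"every dep must be a {HASH_SIZE}-byte hash")
    return b"".join(deps)


def deserialize_deps(blob: bytes) -> List[bytes]:
    """Split the blob back into the original ordered list of hashes."""
    if len(blob) % HASH_SIZE != 0:
        raise ValueError("blob length is not a multiple of the hash size")
    return [blob[i:i + HASH_SIZE] for i in range(0, len(blob), HASH_SIZE)]


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: the deps live in a column of the block row itself,
    # so there is no separate block_deps table to join against.
    conn.execute(
        "CREATE TABLE blocks (hash BLOB PRIMARY KEY, deps BLOB NOT NULL)"
    )


def insert_block(conn: sqlite3.Connection, block_hash: bytes, deps: List[bytes]) -> None:
    conn.execute(
        "INSERT INTO blocks (hash, deps) VALUES (?, ?)",
        (block_hash, serialize_deps(deps)),
    )


def get_deps(conn: sqlite3.Connection, block_hash: bytes) -> Optional[List[bytes]]:
    """Fetch a block's deps with a single-row lookup; None if the block is unknown."""
    row = conn.execute(
        "SELECT deps FROM blocks WHERE hash = ?", (block_hash,)
    ).fetchone()
    return None if row is None else deserialize_deps(row[0])
```

The trade-off is the one described above: the deps can no longer be joined on in SQL, but reading a block touches one row in one table.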

Rework model of `/block/<hash>` endpoint

To make block pages more efficient, `/block/<hash>` was returning the `BlockEntryLite` version, and the transactions were then fetched with `/block/<hash>/transactions`, which has pagination. I think it's good to keep it that way, as querying transactions is a heavy task. In this PR the endpoint now returns `BlockEntry`. I added all the missing information, like `ghostUncles`, `txsHash`, `nonce`, etc., except the transactions, so they are still fetched with the dedicated paginated endpoint. This is all backward compatible as it only adds fields; none are removed.

So if a user wants some info on a block but doesn't care about txs, they can efficiently query all the information with that endpoint.
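From a client's point of view, the split between the block endpoint and the paginated transactions endpoint could be used roughly as follows. This is only a sketch: the `fetch` callable stands in for whatever HTTP client is used, and the `page` and `limit` query parameter names are assumptions, not something stated in this PR.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: takes a URL path, returns the decoded JSON body.
Fetch = Callable[[str], Any]


def get_block(fetch: Fetch, block_hash: str) -> Dict[str, Any]:
    """One request returns the block data (ghostUncles, txsHash, nonce, ...)
    without pulling any transactions."""
    return fetch(f"/block/{block_hash}")


def iter_block_transactions(
    fetch: Fetch, block_hash: str, limit: int = 20
) -> Iterator[Dict[str, Any]]:
    """Lazily walk the paginated transactions endpoint, one page at a time."""
    page = 1
    while True:
        # Assumed query parameters: `page` (1-based) and `limit`.
        txs = fetch(f"/block/{block_hash}/transactions?page={page}&limit={limit}")
        if not txs:
            return
        yield from txs
        if len(txs) < limit:
            return  # a short page means there is nothing left to fetch
        page += 1
```

A caller that only needs block-level fields calls `get_block` and never pays for the transaction query.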

Make sure we always store all block data

That's the part that brings a bit of noise to this PR, as I added every missing piece of data to all our models to make sure we have everything in the DB. I then created a test that makes sure we can convert the Block coming from the full node into our DB models, then into the explorer-backend API models, and from those back to the full-node models. That's what all the toProtocol and toApi functions are for.
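The round-trip property behind that test can be sketched like this. The three model classes and their fields are illustrative stand-ins, not the real explorer-backend types; only the `toApi` / `toProtocol` naming comes from the PR, rendered here as `to_api` / `to_protocol`.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NodeBlock:
    """Block as received from the full node (a small subset of fields)."""
    hash: str
    nonce: str
    txs_hash: str
    deps: Tuple[str, ...]
    ghost_uncles: Tuple[str, ...]


@dataclass(frozen=True)
class BlockRow:
    """Hypothetical DB model: list fields are stored serialized in the row."""
    hash: str
    nonce: str
    txs_hash: str
    deps_json: str
    ghost_uncles_json: str


@dataclass(frozen=True)
class ApiBlock:
    """Hypothetical API model exposed by the explorer."""
    hash: str
    nonce: str
    txs_hash: str
    deps: Tuple[str, ...]
    ghost_uncles: Tuple[str, ...]


def to_row(block: NodeBlock) -> BlockRow:
    return BlockRow(
        block.hash, block.nonce, block.txs_hash,
        json.dumps(list(block.deps)), json.dumps(list(block.ghost_uncles)),
    )


def to_api(row: BlockRow) -> ApiBlock:
    return ApiBlock(
        row.hash, row.nonce, row.txs_hash,
        tuple(json.loads(row.deps_json)), tuple(json.loads(row.ghost_uncles_json)),
    )


def to_protocol(api: ApiBlock) -> NodeBlock:
    return NodeBlock(api.hash, api.nonce, api.txs_hash, api.deps, api.ghost_uncles)


def round_trips(block: NodeBlock) -> bool:
    """True if no information is lost going node -> DB -> API -> node."""
    return to_protocol(to_api(to_row(block))) == block
```

If any model along the chain were missing a field, the final comparison would fail, which is what makes this kind of test a useful guard against silently dropped data.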

So now we should be able to provide any data requested by a user.

Take your time, I'm re-syncing locally anyway to make sure everything is correct.

polarker commented 4 months ago

@h0ngcha0 @Lbqds Please help me review this with priority!

tdroxler commented 4 months ago

> Are uncle blocks guaranteed to be stored on every full-node? If yes, we could add a mechanism to download them individually with /blockflow/main-chain-block-by-ghost-uncle/{ghost_uncle_hash}

Yes, uncle blocks are part of the chain state.
/blockflow/main-chain-block-by-ghost-uncle/ only returns the mainchain block hash. You could treat uncles just like deps and query them just like missing deps.
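Treating uncles like deps could look roughly like the sketch below: when a block is inserted, every hash it references (deps and ghost uncles alike) that is not yet stored gets fetched and inserted too. The dict-based store, the `fetch_block` callable, and the assumption that `deps` and `ghostUncles` are plain lists of hashes are all simplifications for illustration.

```python
from typing import Any, Callable, Dict, List

Block = Dict[str, Any]


def referenced_hashes(block: Block) -> List[str]:
    """Hashes a block points at: its deps plus its ghost uncles.
    Assumes both fields are plain lists of block hashes."""
    return list(block.get("deps", [])) + list(block.get("ghostUncles", []))


def insert_with_references(
    block: Block,
    store: Dict[str, Block],
    fetch_block: Callable[[str], Block],
) -> List[str]:
    """Insert `block`, then fetch and insert every referenced block that is
    not stored yet. Returns the hashes that had to be fetched."""
    store[block["hash"]] = block
    fetched: List[str] = []
    pending = referenced_hashes(block)
    while pending:
        block_hash = pending.pop()
        if block_hash in store:
            continue  # already known, nothing to download
        missing = fetch_block(block_hash)  # hypothetical full-node lookup by hash
        store[block_hash] = missing
        fetched.append(block_hash)
        # The fetched block may itself reference blocks we have not seen.
        pending.extend(referenced_hashes(missing))
    return fetched
```

The loop stops as soon as it reaches blocks that are already stored, so on a synced database only the genuinely missing uncles or deps are downloaded.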

tdroxler commented 4 months ago

> We would need a complete resync for the existing explorer, right?

Yes.

tdroxler commented 4 months ago

@polarker when we insert a block, we check whether it has a list of uncles, and for each hash in that list we fetch the uncle block and insert it into the DB.
At that moment we could flag all those blocks with the isUncle flag, maybe even with the hash of the block that defined that uncle list? Or is it more complicated than that?

polarker commented 4 months ago

> @polarker when we insert a block, we check whether it has a list of uncles, and for each hash in that list we fetch the uncle block and insert it into the DB. At that moment we could flag all those blocks with the isUncle flag, maybe even with the hash of the block that defined that uncle list? Or is it more complicated than that?

isUncle is as complicated as isMainChain. An uncle block might become a mainchain block after a reorg.

A block can have the following statuses, which can change due to reorgs:

- On the mainchain
- As an uncle of a mainchain block
- Orphan block: neither on the mainchain nor as an uncle

We could handle uncle blocks and make them aware of chain reorgs in two ways:

1. On-Demand Query:
   - Don't handle the status preemptively and query the uncle status on demand.
   - In the front-end, when querying for a block and finding that it's not on the mainchain, we then try to query the mainchain block that contains it as an uncle.
   - If such a mainchain block is found, then it's an uncle block. Otherwise, we treat it as an orphan.
2. Background Service:
   - Have a background service that regularly updates the uncle status for new blocks that are finalized with enough confirmations.
   - This service will keep track of the status changes due to reorgs and ensure the uncle status is up-to-date.

We could take approach 1 for now and switch to approach 2 later on.
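The on-demand approach boils down to a small decision function, sketched below. The two lookups are passed in as callables because their real implementations are not specified here; `find_main_chain_block_by_ghost_uncle` is a hypothetical wrapper that would return the mainchain block hash from `/blockflow/main-chain-block-by-ghost-uncle/{ghost_uncle_hash}`, or `None` when no such block exists.

```python
from enum import Enum
from typing import Callable, Optional


class BlockStatus(Enum):
    MAIN_CHAIN = "main_chain"
    UNCLE = "uncle"
    ORPHAN = "orphan"


def classify_block(
    block_hash: str,
    is_main_chain: Callable[[str], bool],
    find_main_chain_block_by_ghost_uncle: Callable[[str], Optional[str]],
) -> BlockStatus:
    """Resolve a block's status at query time instead of storing a flag.

    Nothing is cached, so a later call after a reorg simply reflects the
    new chain state.
    """
    if is_main_chain(block_hash):
        return BlockStatus.MAIN_CHAIN
    # Not on the mainchain: is there a mainchain block that lists it as an uncle?
    if find_main_chain_block_by_ghost_uncle(block_hash) is not None:
        return BlockStatus.UNCLE
    return BlockStatus.ORPHAN
```

Because the status is computed on each request, no stored `isUncle` flag has to be kept consistent across reorgs; the cost is one extra lookup for every block that is not on the mainchain.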

alephium / explorer-backend

Add missing data #550
