I marked scenarios 1 and 2 as simple because we already have the "upsert" SQL queries that update the rows only if the height column in the database is lower than the height we want to write (note: this version has not been released yet).
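For reference, the guard looks something like this sketch (table and column names are hypothetical, not the actual Tracelistener schema):

```go
package main

import (
	"context"
	"database/sql"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the PostgreSQL wire protocol
)

// Hypothetical table/column names, for illustration only.
// The WHERE clause is the important part: the update only applies when the
// incoming height is greater than the one already stored, so stale or
// replayed traces become no-ops.
const upsertBalance = `
INSERT INTO balances (address, denom, amount, height)
VALUES ($1, $2, $3, $4)
ON CONFLICT (address, denom) DO UPDATE
SET amount = excluded.amount, height = excluded.height
WHERE balances.height < excluded.height`

func main() {
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/tracelistener?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// A lagging node replaying an old trace carries a lower height,
	// so this statement leaves a newer row untouched.
	if _, err := db.ExecContext(context.Background(), upsertBalance,
		"abc", "uosmo", "623", 100); err != nil {
		log.Fatal(err)
	}
}
```

This same guard is what lets several nodes share one database: whichever node is further ahead wins, and replays from lagging nodes are discarded.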
I marked scenario 3 as hard because the bulk import only "sees" the current state of the Cosmos SDK database (`application.db`). Therefore, the bulk import doesn't see any delegations at all at block height 100; we would need a way to match the "extra data" we have in the database and delete/soft-delete those rows.
@Pitasi Thank you, your issue helps me a lot to better understand the problems that are supposed to be solved by the TL.
Some questions:

- TL listens to ABCI using a websocket if I'm not wrong, so how do you extract the bulk import?
- What's a node exactly? I thought it was a TL instance dedicated to a blockchain, but I have a doubt after reading your scenarios, since a node can have a different block height than CockroachDB...
- What does it mean when you say the node is on block XXXX?
- About scenario 3: if I understood correctly, there's no way to detect a delete in the Cosmos SDK db other than comparing the state with the one we have in CockroachDB? It would be so much easier if the Cosmos SDK db exposed events like Apache Kafka :).
We probably are going to need a proper tracelistener lesson and also take the chance to improve our Notion pages 😛
Cosmos SDK writes "traces" to a file; it is supposed to be a debug feature. Each trace represents an operation on the underlying Cosmos SDK database; we discard "read" traces and only keep "write" and "delete" traces.
Keep in mind that Cosmos SDK storage is simply a key-value storage.
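To make that concrete: each trace is a JSON line, and a minimal consumer could filter them like the sketch below. This is not Tracelistener's actual code, and the field names follow my recollection of the SDK's tracekv output, so treat them as an approximation.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// A trace line looks roughly like:
//   {"operation":"write","key":"<base64>","value":"<base64>","metadata":{"blockHeight":55}}
type traceOperation struct {
	Operation string          `json:"operation"` // "read", "write", "delete", ...
	Key       string          `json:"key"`       // base64-encoded store key
	Value     string          `json:"value"`     // base64-encoded value
	Metadata  json.RawMessage `json:"metadata"`
}

func main() {
	// Read traces line by line (e.g. from the file the node writes to).
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		var op traceOperation
		if err := json.Unmarshal(scanner.Bytes(), &op); err != nil {
			continue // skip malformed lines
		}
		// Keep only state mutations; reads don't change the replicated state.
		if op.Operation != "write" && op.Operation != "delete" {
			continue
		}
		fmt.Printf("%s %s\n", op.Operation, op.Key)
	}
}
```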
Our goal is basically to replicate the Cosmos SDK db into our CockroachDB instance.
That said, let me answer your questions, but we can schedule a meet for tomorrow:
> TL listens to ABCI using a websocket if I'm not wrong, so how do you extract the bulk import?
Forget ABCI, we are at a much higher level here :) (ABCI is the interface to Tendermint Core, the consensus engine that executes the state machine, but it doesn't say how you store data or what operations you can perform on that data.)
Traces are operations on the db made by Cosmos SDK modules. Bulk import basically reads the database from its files and internally generates "fake write traces", so we can reuse the same Tracelistener code.
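A rough sketch of that idea, assuming the node stores `application.db` with goleveldb (the path and the surrounding wiring are illustrative):

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"log"

	"github.com/syndtr/goleveldb/leveldb"
)

func main() {
	// Open the stopped node's state database; the path is an assumption,
	// adjust it to the node's home directory.
	db, err := leveldb.OpenFile("application.db", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Walk every key/value pair currently in the store and emit a
	// synthetic "write" trace, so the normal trace pipeline can consume
	// the bulk import exactly like live traces.
	iter := db.NewIterator(nil, nil)
	defer iter.Release()
	for iter.Next() {
		trace := map[string]any{
			"operation": "write",
			"key":       base64.StdEncoding.EncodeToString(iter.Key()),
			"value":     base64.StdEncoding.EncodeToString(iter.Value()),
		}
		line, _ := json.Marshal(trace)
		fmt.Println(string(line))
	}
	if err := iter.Error(); err != nil {
		log.Fatal(err)
	}
}
```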
> What's a node exactly? I thought it was a TL instance dedicated to a blockchain, but I have a doubt after reading your scenarios, since a node can have a different block height than CockroachDB...
For me, a node is an instance of a blockchain node. For instance, in production we have 4 Osmosis nodes, each of them with a Tracelistener instance running. Important note: they all share the same database and concurrently compete to update the same rows.
It happens that some of the nodes crash, or stop syncing for a while (think for example of a temporary network failure on one of our Kubernetes nodes). After such a node comes back, it begins downloading all the data it missed in that time, and emits the same traces its siblings already emitted (hence CockroachDB was already updated).
> What does it mean when you say the node is on block XXXX?
I mean we stopped the blockchain node when it was at that height, so its internal database represents the state at height XXXX.
> About scenario 3: if I understood correctly, there's no way to detect a delete in the Cosmos SDK db other than comparing the state with the one we have in CockroachDB? It would be so much easier if the Cosmos SDK db exposed events like Apache Kafka :).
We can detect "delete" traces, as they are a delete event, but we don't want to have to replay historical events like we could in Kafka (we can, for example, spin up a new blockchain node and let it sync from scratch, which would emit all events again, but it's a painfully slow process since there are millions of blocks and each block emits many events).
What we want is to be able to "jump" to block height XXX without parsing all previous blocks. Hope that makes sense.
Thanks a lot @Pitasi! Yes, I probably need more information; I won't say no to a meeting on that topic :)
@Pitasi let's arrange a team wide call/knowledge sharing session on TL and SDK DB.
I think we discussed the possibilities and arrived at an algorithm that works, so I opened a ticket to track its implementation: https://app.zenhub.com/workspaces/emeris-back-end-619cd009cc778000112d6259/issues/emerishq/demeris-backend/784.
I'll close this issue, thank you!
We are currently running the bulk import on the assumption that the database is clean: e.g. if we bulk import `osmosis`, we don't expect to see any `osmosis` row in any of the tables (`balances`, `auth`, ...).

We want to investigate IF it would be possible to launch the bulk import on a non-clean database, while other nodes are running (i.e. upserting the same rows for that chain).
Scenario 1 (simple)

Let's imagine an address `abc` that had 123 osmo at block height 12, then received 500 osmo at block height 55. We launch a bulk import on a node that was on block height `100`. The expected CockroachDB row looks something like this (column names illustrative):

| address | amount | block_height |
| --- | --- | --- |
| abc | 623osmo | 100 |

Note how we wrote `100` instead of `55` in the `block_height` column. There's no way to simply retrieve that information during bulk import, so for simplicity we just use the height of the bulk import. This is OK, as that balance is valid for block 100 too.

Scenario 2 (simple)
Same as scenario 1: an address `abc` that had 123 osmo at block height 12, then received 500 osmo at block height 55. We launch a bulk import on a node that was on block height `100`. For some reason our db was more up to date than the bulk import, so we expect this row to stay the same.

Scenario 3 (hard)
Address `abc` delegated `123uosmo` to a validator at block height `12`. That delegation is deleted at block height `55` (assume there's no unbonding time, for simplicity). We launch a bulk import on a node that was on block height `100`.

Expected CockroachDB state (any of these would work):
- `delete_height` is the correct block height
- `delete_height` is the bulk import height
- the row is fully deleted (pruned)
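One conceivable shape for the second option, following the "match the extra data and soft-delete it" idea from the discussion above: collect the keys seen during the bulk import, then soft-delete whatever was not seen. Table and column names are hypothetical.

```go
package main

import (
	"context"
	"database/sql"
	"log"

	"github.com/lib/pq" // registers the driver and provides pq.Array
)

// sweepStale soft-deletes every delegation row for the given chain whose
// key was not seen during the bulk import. Since the real deletion height
// is unknown, we record the bulk import height in delete_height.
func sweepStale(ctx context.Context, db *sql.DB, chain string, importHeight int64, seenKeys []string) error {
	_, err := db.ExecContext(ctx, `
UPDATE delegations
SET delete_height = $1
WHERE chain_name = $2
  AND delete_height IS NULL
  AND NOT (delegation_key = ANY($3))`,
		importHeight, chain, pq.Array(seenKeys))
	return err
}

func main() {
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/tracelistener?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// seenKeys would be accumulated while processing the fake write traces.
	if err := sweepStale(context.Background(), db, "osmosis", 100, []string{"abc/validator1"}); err != nil {
		log.Fatal(err)
	}
}
```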