Closed antonlerner closed 1 year ago
still reading through, though one question I have is about syncing reference blocks. you said a block is not valid if its reference block can't be found, so to validate the reference block we must find its own reference block. this could lead to a recursive search for blocks you don't know about and that might not exist. what security measures should we take to prevent this?
a note about the example you gave: you might need to wait for both atxIDs and txIDs to finish before calling the validateBlock callback using both of them as input, so keep this in mind. I believe it can also be solved with the implementation you presented.
overall looks good to me.
lgtm, missing the "price" for half/full sync refac
still reading through, though one question I have is about syncing reference blocks. you said a block is not valid if its reference block can't be found, so to validate the reference block we must find its own reference block. this could lead to a recursive search for blocks you don't know about and that might not exist. what security measures should we take to prevent this?
since there should be one reference block per epoch this shouldn't happen... I don't know if this should be part of the solution... but if there are syntactically valid blocks they can be referenced by other blocks. the chain will break when we get to a non-valid block
Could you add a "background" and/or "motivation" section? I feel like there's some important context that's missing here.
Current sync state
today we have the ability to sync a list of ATXs, a list of blocks, and a list of TXs
We do not have the ability to sync single blocks and validate them
If we can sync a list, why not just request a list of length one?
needed features
we need the ability to sync multiple ATXs
Should this be "sync a single ATX"?
Some more stuff that isn't clear...
if missingIDs(blk.ATXIDs)
    handleMultipleIds(blk.ATXIDs, validateBlock, func() { log.Error("cant fetch IDS") })
if missingIDs(blk.TXIDs)
    handleMultipleIds(blk.TXIDs, validateBlock, func() { log.Error("cant fetch IDS") })
Shouldn't handleMultipleIds be called FetchHash per your proposal? What is validateBlock here? It's some sort of callback? What is its function signature? How does it know the context here (i.e., the current block)? Is it something that exists today, or is this something proposed?
If we can sync a list, why not just request a list of length one?
We can do that. Actually, the more important thing here is to create a pool of requests that is also aware of blocks we receive from gossip, to avoid syncing blocks we would receive from gossip anyway.
Should this be "sync a single ATX"?
No, we want to sync all ATXs in the active set list
Some more stuff that isn't clear...
if missingIDs(blk.ATXIDs)
    handleMultipleIds(blk.ATXIDs, validateBlock, func() { log.Error("cant fetch IDS") })
if missingIDs(blk.TXIDs)
    handleMultipleIds(blk.TXIDs, validateBlock, func() { log.Error("cant fetch IDS") })
Shouldn't handleMultipleIds be called FetchHash per your proposal? What is validateBlock here? It's some sort of callback? What is its function signature? How does it know the context here (i.e., the current block)? Is it something that exists today, or is this something proposed?
this is what I propose this sync will look like:
requests for the same layer can be made to multiple nodes - this can potentially speed up sync, since in many cases upload speed for clients is much slower than download speed
Agree that, as discussed on the call today, this is probably a good idea as it takes better advantage of limited upload bandwidth relative to download bandwidth, makes it harder for one node to block another from syncing, etc., but - how do we do this in a fashion that isn't highly redundant? It seems very wasteful to request the same exact block data from multiple peers. Presumably we'd want to do something like what bittorrent does here, and request different "chunks" of data (e.g., different sets of blocks) from different peers, right? Maybe with some redundancy but not total redundancy.
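To make the "chunks" idea above concrete, here is a minimal Go sketch of splitting a layer's hashes round-robin across peers, so each hash is requested from exactly one peer rather than from everyone. The function name `splitAcrossPeers` and the string-based hash/peer types are illustrative assumptions, not existing Spacemesh code; a real version would also add some redundancy and re-request from another peer on failure.

```go
package main

import "fmt"

// splitAcrossPeers assigns each hash to one peer in round-robin order,
// so a layer's block requests are spread across peers instead of
// fetching the full layer from every peer. Hypothetical helper.
func splitAcrossPeers(hashes []string, peers []string) map[string][]string {
	out := make(map[string][]string)
	for i, h := range hashes {
		peer := peers[i%len(peers)]
		out[peer] = append(out[peer], h)
	}
	return out
}

func main() {
	assign := splitAcrossPeers(
		[]string{"b1", "b2", "b3", "b4", "b5"},
		[]string{"peerA", "peerB"},
	)
	fmt.Println(assign["peerA"]) // [b1 b3 b5]
	fmt.Println(assign["peerB"]) // [b2 b4]
}
```

This keeps total download roughly constant while dividing each peer's upload burden, at the cost of needing retry logic when a peer fails to serve its chunk.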
General
As part of the new ATX flow, we now include the active set's ATXs in blocks. We also include a reference block if a previously created block in the same epoch has already declared an active set size.
Motivation
Sync refactoring will allow cleaner code, enabling fast changes and additions when needed. The refactoring will also allow us to better batch sync operations and to send different hashes of the same layer to different nodes, thus spreading the load between nodes and allowing better failure handling.
Sync requirements
In order to validate a block we must sync and validate the new fields, which are:
Current sync state
today we have the ability to sync a list of ATXs, a list of blocks, and a list of TXs
We do not have the ability to sync single blocks and validate them
We also sync PoET proofs from ATXs
needed features
we need the ability to sync multiple ATXs
we need the ability to sync a single block
proposed methods
Add to current architecture:
fetch_queue
implementation to support single blocks. a quick implementation can look like so:

return <-ch }
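The snippet above is truncated, but the `return <-ch` ending suggests a request that blocks on a channel until the fetched data arrives. A minimal Go sketch of that shape, with entirely hypothetical names (`fetchQueue`, `FetchBlock`, `Hash`) rather than the real Spacemesh API, could look like this:

```go
package main

import "fmt"

// Hash is a stand-in for a real content hash type.
type Hash [4]byte

// fetchRequest carries one hash plus a channel on which the
// fetched bytes are delivered back to the caller.
type fetchRequest struct {
	hash Hash
	done chan []byte
}

type fetchQueue struct {
	requests chan fetchRequest
}

// newFetchQueue starts a worker that serves queued requests.
// A real implementation would batch hashes into network calls.
func newFetchQueue(fetch func(Hash) []byte) *fetchQueue {
	q := &fetchQueue{requests: make(chan fetchRequest, 16)}
	go func() {
		for req := range q.requests {
			req.done <- fetch(req.hash)
		}
	}()
	return q
}

// FetchBlock enqueues a single-hash request and blocks until the
// data arrives - the "return <-ch" pattern from the proposal.
func (q *fetchQueue) FetchBlock(h Hash) []byte {
	ch := make(chan []byte, 1)
	q.requests <- fetchRequest{hash: h, done: ch}
	return <-ch
}

func main() {
	// Fake fetcher that just echoes the hash bytes.
	q := newFetchQueue(func(h Hash) []byte { return h[:] })
	fmt.Println(q.FetchBlock(Hash{1, 2, 3, 4})) // [1 2 3 4]
}
```

The point of the channel-per-request design is that single-block fetches get a synchronous-looking API while the queue behind them remains free to batch and prioritise.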
Use new HashPool generic sync method

Current performance optimisations
HashPool
HashPool is a context-agnostic sync pool that will allow us to batch, prioritise and fetch sync items. The main concept is keeping a map of hashes to fetch (and the DB IDs from which to fetch them), along with a priority and a callback. Each item in spacemesh has a unique ID and a database from which it can be fetched. Current implementations take into account the context and type of the item to fetch; this allows us to batch blocks by layer and call specific flows. This can be done in another way, by specifying a sync context.
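The description above (a map of hashes to fetch, each with a DB hint, a priority and a callback) could be sketched in Go roughly as follows. All names here (`HashPool`, `poolItem`, `NextBatch`) are assumptions for illustration, not the real Spacemesh types:

```go
package main

import (
	"fmt"
	"sort"
)

type Hash string

// poolItem records where an item can be fetched from, how urgent
// it is, and what to do with the bytes once they arrive.
type poolItem struct {
	hash     Hash
	db       string // hint: which store holds the item ("blocks", "atxs", ...)
	priority int    // lower value = fetched sooner
	callback func(data []byte)
}

// HashPool is a context-agnostic pool of pending fetches.
type HashPool struct {
	items map[Hash]poolItem
}

func NewHashPool() *HashPool { return &HashPool{items: map[Hash]poolItem{}} }

// Add registers a hash to fetch; duplicate hashes are coalesced,
// which is what lets gossip and sync share one pool.
func (p *HashPool) Add(h Hash, db string, prio int, cb func([]byte)) {
	if _, ok := p.items[h]; !ok {
		p.items[h] = poolItem{hash: h, db: db, priority: prio, callback: cb}
	}
}

// NextBatch drains up to n items in priority order so they can be
// requested in one batched network call.
func (p *HashPool) NextBatch(n int) []poolItem {
	batch := make([]poolItem, 0, len(p.items))
	for _, it := range p.items {
		batch = append(batch, it)
	}
	sort.Slice(batch, func(i, j int) bool { return batch[i].priority < batch[j].priority })
	if len(batch) > n {
		batch = batch[:n]
	}
	for _, it := range batch {
		delete(p.items, it.hash)
	}
	return batch
}

func main() {
	p := NewHashPool()
	p.Add("blk1", "blocks", 2, nil)
	p.Add("atx1", "atxs", 1, nil)
	p.Add("blk1", "blocks", 5, nil) // coalesced with the first blk1
	for _, it := range p.NextBatch(10) {
		fmt.Println(it.hash, it.db, it.priority)
	}
}
```

Because the pool keys on the hash alone, a block arriving via gossip can simply cancel or satisfy a pending sync request for the same hash.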
The proposed solution is for the sync to have a simple API:
Gossip message handling
All received messages will be directed to the same pool, with their respective hashCallback. This makes handling of gossip messages the same as handling sync messages: we could queue both gossip and sync messages in the same queue and address them in a specific order. For example, the function dataAvailability for a block could be a hashCallback, i.e. it could be written like so:
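The original code for this example is not shown, so here is a hedged Go sketch of what dataAvailability might look like as a hashCallback: a function invoked with the hash and raw bytes once they arrive, whether from sync or from gossip. The `hashCallback` signature, `parseBlock` helper and `block` type are assumptions for illustration:

```go
package main

import "fmt"

// hashCallback is the assumed shape of a callback invoked when the
// bytes for a requested hash become available.
type hashCallback func(hash string, data []byte) error

type block struct{ ID string }

// parseBlock stands in for real block deserialisation.
func parseBlock(data []byte) (*block, error) {
	if len(data) == 0 {
		return nil, fmt.Errorf("empty block data")
	}
	return &block{ID: string(data)}, nil
}

// dataAvailability as a hashCallback: deserialise the bytes, then
// validate. The same code path runs for sync and gossip arrivals.
var dataAvailability hashCallback = func(hash string, data []byte) error {
	blk, err := parseBlock(data)
	if err != nil {
		return fmt.Errorf("block %s: %w", hash, err)
	}
	// here a real version would fetch any missing ATXs/TXs
	// (via the same pool) and then validate the block
	fmt.Println("validated block", blk.ID, "for hash", hash)
	return nil
}

func main() {
	_ = dataAvailability("h1", []byte("blk-42"))
}
```

Registering this callback per hash is what makes the flow event-driven: validation fires whenever the data lands, regardless of which transport delivered it.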
This is a more event-driven approach. Callbacks can be very generic and even hash-specific, and we can deserialise data and validate it when the bytes come in. This makes the entire handling of synced layers async and event-driven. The advantage of this method is a very robust system; the disadvantage is that in our system, where synced data needs to follow a specific order of layers, the asynchronous method could delay the handling of layers in the mesh and complicate it.