Closed Wondertan closed 1 month ago
This is relatively urgent for us and if there is no capacity from your side to implement this, we could submit a patch as well
@Wondertan Preliminary investigation does not indicate that there is any specific reason that GetSize
was used over Has
so I will see if it looks reasonable to get this done now.
Triage notes:
Blockstore.Has
sounds very sensible (presence check often cheaper/billed differently than reading size)maxBlockSizeReplaceHasWithBlock
configurable, @gammazero will look into this I would also take a look at how cached blockstores(like bloom) would behave with this change.
It is true that Want_Have
requests rely on GetSize
to check for presence of the requested block and the size gets discarded. So, using Blockstore.Has
for Want_Have
requests does offer an slightly faster lookup, however...
it also risks having to do both the Has
and GetSize
lookup for the same CID since it is possible for there to be both Want_Have
and Want_Block
requests for the same CID. This is why all the CID's from the wantlist are put into a unique set and then the sizes are looked up for al of them. This way there is only one query per CID, just they are all size lookups.
I think there are other improvements that can probably have a more significant performance improvement. Specifically the use of worker pools and communication between these goroutines can be streamlined. Additionally, high frequency metrics updates may also contribute non-trivial overhead. Improvements will be supplied as more profiling is done.
it also risks having to do both the Has and GetSize lookup for the same CID since it is possible for there to be both Want_Have and Want_Block requests for the same CID.
@gammazero, it possible for both requests to come for the same CID from a single peer? Or do you mean they are deduplicated across multiple peers?
Gentle ping @gammazero.
We would like to understand the reasoning behind the risk a bit more.
Last ping @gammazero
@Wondertan This is only across different peers, AFAIK. Any single peer should only issue a WantHave or a WantBlock for a CID.
Some things to consider...
A bitswap message may have a number of CID entries. Some are WantHave
and some are WantBlock
. Some WantHave
s may be converted to WantBlock
if the blocks are small enough
Using blockstore.Has
for WantHave
requests is only doable when automatically converting smallWantHave
requests to WantBlock
is disabled (when e.maxBlockSizeReplaceHasWithBlock
is zero). Getting block sizes is necessary to see if a block is small enough to do the conversion.
Getting block sizes is done as a concurrent batch, which is generally faster for more than a very small number of blocks, although it may also depend on backend store, cached values, etc. This means that we will want to separate the WantHave
s from the WantBlock
s in a bitswap message, so that those can both be done in concurrent batches.
At a later time the peer client may decide to send a message to retrieve the blocks for which WantHave
responses were received. Now the bitswap server needs to get the block sizes for all those blocks, because it did not get the sizes previously. This is where the "extra" query happens.
I coded a prototype corresponding to the above description, and did not see a significant performance improvement. However, I may not have been testing at sufficient scale and/or was using a similar storage backend to what you are using.
Given specific storage backends, it may make a very significant difference in performance and/or cost to avoid reading block sizes. I propose two changes:
maxBlockSizeReplaceHasWithBlock
configurable. Setting to zero would disable automatic conversion of WantHave
to WantBlock
for small blocks.@Wondertan Here are the prototype and kubo configuration:
@gammazero, that's a great solution and fixes our concern! Thank you ❤️
Should we start testing the prototype now or after your PR gets ready for review?
@wondertan You can start testing it now, and any feedback will be helpful. We would like to review and test in our cluster soon, but have not yet so cannot make any guarantees about how it works.
Currently,
WANT_HAVE
requests rely onGetSize
to check for presence of the requested block and the learnt size gets discarded.Basically, the Blockstore has to lookup data size that's unused. Depending on
Blockstore
implementation, this might have different consequences. Intuitively, looking up data size should be more expensive compared to checking data existence. In practice, this is what happens in our case.Hence the proposal is to add new method to blockstore manager that performs only the Has check and refactor
ReceivedMessage
to decouple processing ofWANT_HAVE
andWANT_BLOCK
messages.