Open MichaelMure opened 1 year ago
On an intuition level, this has a similar vibe as having the caller side of bitswap adding blocks in storage (https://github.com/ipfs/go-bitswap/pull/571), which helped splitting client and server side, among other things.
cc @Jorropo for visibility, I remember you were suggesting refactoring the way we do fetch during pinning
The pinner keeps track of long term state in it's internal pin database, it also keep track of state in the blockstore, it would be VERY weird if it kept track of some state but not all of it (if we still had the internal DB but not the blockstore part of it).
Currently it does not even store the blocks directly, it relies on side effect of the blockservice (blockservice stores all block downloaded).
Also unlike https://github.com/ipfs/go-bitswap/pull/571 where there was a bug where blocks were local puts were duplicated (once in blockservice and once in bitswap), I am not aware of any bug here, it's just very side effect heavy code that hard to use without the monolith architecture Kubo was built on.
Maybe there is something better elsewhere but this seems like a complete refactor of the pinner which I don't want to take on right now. I am not convainced that the two usecases here (pinning per requests and pinning on a monolith) are similar enough to warant sharing a package. Except than a traversal logic (which can be abstracted in a new or already existing third package) I don't see what they would have in commun.
The pinner keeps track of long term state in it's internal pin database, it also keep track of state in the blockstore, it would be VERY weird if it kept track of some state but not all of it (if we still had the internal DB but not the blockstore part of it).
Yet that's what is actually happening. Consider those two examples (among other):
block/put
: kubo insert the block into the blockService, then call PinWithMode
, which only record the pin. No check is made, Pinner blindly accept the pin and guarantee nothing.pin/add
: kubo fetch the root block, then call Pin
, which will insert that root node, fetch the rest of the DAG then record the pin. Here, the Pinner does guarantee that the data is there.Also unlike https://github.com/ipfs/go-bitswap/pull/571 where there was a bug where blocks were local puts were duplicated (once in blockservice and once in bitswap), I am not aware of any bug here, it's just very side effect heavy code that hard to use without the monolith architecture Kubo was built on.
If you look at my pin/add
example, I'm pretty sure there is a duplicated Put there. DAGService get the block from the network (side effect: put), then pass it to Pinner which will write it again.
Maybe there is something better elsewhere but this seems like a complete refactor of the pinner which I don't want to take on right now.
I'm only suggesting to eliminate Pin
in favor of PinWithMode
(although Pin is a better name). This function is only used 4 times in kubo (not counting tests), so that should be quite manageable.
We get back to a uniform API, eliminates double write, remove the odd side effect that makes it hard to compose and reuse ...
As for bitswap, I'd like to ask you to do the exercise in reverse: why would Pinner need to write blocks? Why does it needs to fetch a DAG? I don't see better reason than "it was written that way in ancient ages because it was convenient".
Additionally, Add()
is currently racy with the GC, as it doesn't create a PinLock
. This is because the pinner doesn't have access to the blockstore, while the calling side (kubo) does.
Having worked on the Pinner in various way (including alternate implementation), it occurred to me that it really shouldn't be tasked to:
Pin()
: https://github.com/ipfs/go-ipfs-pinner/blob/72f5e02e73db5e05ef0a140a7938cbc89dfc38b0/dspinner/pin.go#L177Pin()
: https://github.com/ipfs/go-ipfs-pinner/blob/72f5e02e73db5e05ef0a140a7938cbc89dfc38b0/dspinner/pin.go#L202This is for different (not obvious, I admit) reasons:
Pin
andPinWithMode
and simplify the APIAdmittedly, things have worked that way for a long time, but fixing that might help in the long run. Looking at where
Pin
is called in kubo, it also looks like a non-invasive change.