hsanjuan / ipfs-lite

IPFS-Lite is an embeddable, lightweight IPFS-network peer for IPLD applications
Apache License 2.0
339 stars 58 forks

DAGs stored without the "/blocks" prefix #321

Closed vkost closed 10 months ago

vkost commented 10 months ago

Hi, I ran into a weird problem. I am using ipfs-lite in a distributed database project. The nodes in the ipfs network are using ipfs-lite to store and fetch json data structures. Each cluster is isolated in a private libp2p network using a PSK. A MongoDB collection is used as key-value storage.

Things are generally working very smoothly, but recently I noticed that a newly spun-up node was not picking up data from the ipfs network (it is supposed to replicate whatever the network has). After a week of digging, I found out that some dags are stored without the /blocks prefix: instead of /blocks/CIQDACE5RYSBN36AFODMQM6JQ7EITZ3GXYC32H5CVR2IM562R3WVTYY, I am finding /CIQDACE5RYSBN36AFODMQM6JQ7EITZ3GXYC32H5CVR2IM562R3WVTYY. Otherwise, the dag data is ok.
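For anyone auditing a store for the same symptom, here is a minimal sketch of how such keys could be told apart. It assumes the key name is the unpadded, upper-case base32 encoding of a sha2-256 multihash (which is consistent with the CIQ... keys quoted above); `classifyKey` and `looksLikeMultihash` are hypothetical helpers, not part of ipfs-lite or any IPFS library.

```go
package main

import (
	"encoding/base32"
	"fmt"
	"strings"
)

// b32 is the unpadded, upper-case base32 alphabet that the block keys
// in this thread appear to use (an assumption, see the lead-in).
var b32 = base32.StdEncoding.WithPadding(base32.NoPadding)

// looksLikeMultihash reports whether s decodes to a sha2-256 multihash:
// 0x12 (sha2-256), 0x20 (32-byte digest length), then 32 digest bytes.
func looksLikeMultihash(s string) bool {
	raw, err := b32.DecodeString(s)
	if err != nil {
		return false
	}
	return len(raw) == 34 && raw[0] == 0x12 && raw[1] == 0x20
}

// classifyKey is a hypothetical helper that labels a datastore key as
// "prefixed" (/blocks/<hash>), "bare" (/<hash>), or "other".
func classifyKey(key string) string {
	switch {
	case strings.HasPrefix(key, "/blocks/") &&
		looksLikeMultihash(strings.TrimPrefix(key, "/blocks/")):
		return "prefixed"
	case strings.HasPrefix(key, "/") &&
		looksLikeMultihash(strings.TrimPrefix(key, "/")):
		return "bare"
	default:
		return "other"
	}
}

func main() {
	const h = "CIQDACE5RYSBN36AFODMQM6JQ7EITZ3GXYC32H5CVR2IM562R3WVTYY"
	for _, k := range []string{"/blocks/" + h, "/" + h, "/pins/index"} {
		fmt.Println(k, "->", classifyKey(k))
	}
}
```

Running a check like this over all keys in the backing collection would show how many entries sit outside the /blocks namespace on each node.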

The whole problem seems to originate from a while back (anywhere between 2 and 12 months ago). I only ran into it because many nodes in one of my networks died by accident, and this was the only node left standing. So this node became an accidental "source of truth" for all content. The content was there, but ipfs-lite was not able to load it because of the malformed keys, and it could not fetch it via bitswap either, because no other nodes with the content existed at the time. Working in a private libp2p network seems to be quite dangerous in this regard.

To confirm the bug, I checked out nodes on the other networks I have running, and indeed all the older nodes have a number of these malformed keys. Some have a set of correct keys, then a set of malformed keys, then again a set of normal keys. Since I was upgrading and fixing the executable over time, my theory is that at some point my executables were writing dags with faulty keys, and the behavior stopped after an update. But I have no way of knowing exactly when any of this happened.

My question is: would you happen to have any idea why the keys were malformed? I see that ipfs-lite does not actually implement the dag.Get() and dag.Put() methods, and instead uses the default implementation. So whatever happened must have been in a related package, not in ipfs-lite as such. But you're my best starting point :)

I am fully aware the problem is complex and not easy to explain, so if you need any further info, I'll be more than happy to provide it. Thank you in advance.

hsanjuan commented 10 months ago

Hello,

Normally, blockstore keys are "wrapped" (namespaced) into the /blocks/ prefix by default: https://github.com/ipfs/boxo/blob/main/blockstore/blockstore.go#L156. I believe this has always happened, and ipfs-lite's blockstore was always set up to read (and write) to /blocks/ because go-ipfs-blockstore (which now lives in boxo) always did that by default.

I am not sure how you wrote blocks without the prefix. Either you wrote directly to the underlying datastore, or you wrote through a blockstore initialized with the "noPrefix" option.
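To illustrate why blocks written without the prefix become invisible, here is a toy model of key namespacing using only the standard library. `memStore` and `namespaced` are hypothetical stand-ins for the underlying datastore and the blockstore's key wrapping; they are not the boxo API, and the key name is a placeholder.

```go
package main

import "fmt"

// memStore stands in for the underlying key-value datastore.
type memStore map[string][]byte

// namespaced is a hypothetical wrapper that prepends a fixed prefix to
// every key before touching the inner store.
type namespaced struct {
	prefix string
	inner  memStore
}

// Put stores val under prefix+key.
func (n namespaced) Put(key string, val []byte) { n.inner[n.prefix+key] = val }

// Get looks up prefix+key, so it only sees entries inside its namespace.
func (n namespaced) Get(key string) ([]byte, bool) {
	v, ok := n.inner[n.prefix+key]
	return v, ok
}

func main() {
	raw := memStore{}
	withPrefix := namespaced{prefix: "/blocks", inner: raw}
	noPrefix := namespaced{prefix: "", inner: raw} // models a no-prefix setup

	// A write through the un-prefixed wrapper lands at the root namespace.
	noPrefix.Put("/CIQEXAMPLE", []byte("dag node"))

	// The default, prefixed wrapper cannot see that entry.
	_, found := withPrefix.Get("/CIQEXAMPLE")
	fmt.Println("visible through /blocks wrapper:", found)
	for k := range raw {
		fmt.Println("stored key:", k)
	}
}
```

In this model the data is intact in the store, yet a reader that always looks under /blocks reports it as missing, which matches the symptom described in the issue.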

vkost commented 10 months ago

Hi, thank you for the response. One last question before I close this thread: is there any concept of pinning in ipfs-lite, and does it ever remove the blocks from the store under any circumstances?

TL;DR: I did see the wrapper producing the prefix, and indeed checking the repo shows that it has not changed for a long while. Your assumption that something else may be pushing the pieces with the wrong prefix is completely logical. However, since the entire data interchange happens over ipfs-lite, one wonders why I don't see "duplicate" entries: with the prefix, stored by ipfs-lite, and without the prefix, stored by some rogue process. In the actual dataset I used for debugging, I have entries outside of ipfs pointing to specific CIDs, so I know that those CIDs existed on the network at some point. But now the system can't find them because of the prefix problem. As soon as I "fixed" the keys appropriately, all CIDs were fetched by ipfs-lite, no problem at all.

As for the noPrefix option, I was investigating that as well, but my initializers do not use that option, nor did they ever. In fact, my initializers leave almost everything at the defaults. I was hoping that maybe at some point there was a default preference for noPrefix, and that this caused a temporary change of behavior, but I was not able to find any trace of it in the repos. However, the whole ipfs repo structure is very complex, so I can't be sure I checked every nook and cranny. I should point out that my project still uses the old-style repos, because I never had the time to switch to /boxo/.

Still, I very much appreciate your insight. I will keep digging for the final answer. But I thought it was worth asking. Vic

hsanjuan commented 10 months ago

> is there any concept of pinning in ipfs-lite, and does it ever remove the blocks from the store under any circumstances?

No. But it is easy to wrap ipfs-lite (DAGService) with the pinner: https://pkg.go.dev/github.com/ipfs/boxo@v0.16.0/pinning/pinner/dspinner

Removing blocks, or GC-ing, would need its own implementation too. I think Kubo's GC code is still in Kubo and not very re-usable.

I cannot think of anything related to ipfs-lite that would put blocks in / instead of /blocks. AFAIK nothing in the ipfs-stack writes directly in the base namespace "/", everything puts some sort of prefix in front.

vkost commented 10 months ago

Thank you for confirming that there's no GC. This means that data will remain in the store unless I remove it for whatever reason. I do not need GC at this time, and I have a pretty good idea of how to implement it if I ever do. Once again, I really appreciate your help. Stay well, Vic