ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Support for higher latency storage (NFS/SMB mounts) #9885

Open IngwiePhoenix opened 1 year ago

IngwiePhoenix commented 1 year ago


Description

I would like to suggest an addition to the storage backend. The reason I am suggesting this is the unfortunate implosion of my node.

I use NFSv4 to mount additional storage on my home server, since it is quite limited in storage from the get-go, and I put my IPFS repo on it as well. After a while, I noticed it was running into an issue where it occasionally couldn't access the disk usage cache in a JSON file - and when I added a particularly large file, I probably finished it off. The last error mentioned inconsistencies in the LevelDB storage.

This has had me thinking and has occupied my mind for quite a while. To me, IPFS is a great way to share files with friends. Unlike NextCloud, I can just drop a file in there, share a link with the CID - and remove it when done. Simple! But if my repo keeps getting corrupted because of the FS layer (and not coming back - it still hasn't; I have let it run the entire time since that post), I will have to think of another solution...

This is pretty much what I would like to propose:

Another side effect of a storage backend with those features would be the ability to build a small SAN and connect several IPFS instances to it, allowing some form of load balancing in a more low-cost situation. A (pretty unrealistic and dumb) setup could be a couple of Raspberry Pis connected to a NAS holding the actual repo, thus splitting tasks between the Pis.

Hope this sparks some ideas and is helpful! :)

Kind regards, Ingwie.

PS: I have temporarily moved over to my NextCloud instance and have since found that it has issues generating share links. Fun times...

Jorropo commented 1 year ago

@IngwiePhoenix I have seen many people running datastores on services with latency as high as S3 in another region (performance isn't great, but that is another issue).

Have you tried only remotely mounting the .ipfs/blocks folder? LevelDB contains some internal bookkeeping metadata for your node and no IPLD data; it shouldn't grow past a few gigs, so I don't think you need to mount it over NFS.

IngwiePhoenix commented 1 year ago

Would symlinking work?

i.e.: /mnt/diskstation/bunker/Services/ipfs/blocks -> new_ipfs_repo/blocks?

And as for my node - I think LevelDB is why it's dead. Very unfortunate, since I had set it up with a lot of data... oh well.

Jorropo commented 1 year ago

Would symlinking work?

Try it; I would guess so, but I have never checked.
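A minimal sketch of what trying that could look like, with hypothetical paths and the node stopped first:

```sh
# stop the daemon before touching the repo
ipfs shutdown

# move the blocks folder onto the NFS mount and symlink it back into the repo
# (both paths here are hypothetical examples)
mv ~/.ipfs/blocks /mnt/nfs/ipfs-blocks
ln -s /mnt/nfs/ipfs-blocks ~/.ipfs/blocks

# restart the daemon, then check that the node still sees its data
ipfs daemon    # in another terminal: ipfs repo stat
```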

And as for my node - I think LevelDB is why it's dead. Very unfortunate, since I had set it up with a lot of data... oh well.

LevelDB doesn't contain any block data; the most valuable things it contains are your pin list (and your MFS root if you use MFS).

You can (while the node is stopped) save your ipfs/blocks folder and your ipfs/config, remove your ipfs folder, make a new one with ipfs init, and move your config and ipfs/blocks back in. Then you can start the node (without garbage collection enabled) and ipfs pin add everything you had again. It will re-read everything from ipfs/blocks and only download from the internet if some blocks have somehow gone missing.
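Spelled out as a rough sketch (assuming the repo lives at ~/.ipfs; <cid> stands in for whatever was pinned before):

```sh
# with the daemon stopped
ipfs shutdown

# keep the blocks and the config, drop the rest of the repo
mv ~/.ipfs/blocks /tmp/blocks.bak
cp ~/.ipfs/config /tmp/config.bak
rm -rf ~/.ipfs

# re-initialise a fresh repo, then restore the old config and blocks
ipfs init
cp /tmp/config.bak ~/.ipfs/config
rm -rf ~/.ipfs/blocks
mv /tmp/blocks.bak ~/.ipfs/blocks

# start the daemon (no GC) in another terminal, then re-pin everything;
# blocks are re-read locally, anything missing is fetched from the network
ipfs pin add <cid>    # repeat for everything that was pinned before
```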

IngwiePhoenix commented 1 year ago

Oh. Well, then I am pretty much done for; I used MFS because I liked the idea of working with a pseudo-FS - much cleaner than working with raw CIDs, especially when trying to re-find older things. I'll just make a backup of the whole thing and see if your idea works out!

But yeah, this situation is why I'd love to see some enhancements to the storage backend, i.e. allowing a multi-path setup (store blocks here and LevelDB there) or possibly allowing stricter FS syncing. :)
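For context on the multi-path idea: Kubo's Datastore.Spec (see the config docs linked in the next comment) already describes the blocks store and LevelDB as separate mounts, roughly like the default spec sketched below. Whether the flatfs "path" can safely point at a location such as an NFS mount outside the repo is an assumption to verify, not something the docs promise.

```json
{
  "mounts": [
    {
      "child": {
        "path": "blocks",
        "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
        "sync": true,
        "type": "flatfs"
      },
      "mountpoint": "/blocks",
      "prefix": "flatfs.datastore",
      "type": "measure"
    },
    {
      "child": {
        "compression": "none",
        "path": "datastore",
        "type": "levelds"
      },
      "mountpoint": "/",
      "prefix": "leveldb.datastore",
      "type": "measure"
    }
  ],
  "type": "mount"
}
```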

IngwiePhoenix commented 1 year ago

I spotted a documentation mismatch:

This can be changed manually, however, if you make any changes that require a different on-disk structure, you will need to run the ipfs-ds-convert tool to migrate data into the new structures.

found here: https://github.com/ipfs/kubo/blob/master/docs/config.md#datastorespec

That tool seems to be outdated, so the instruction is inaccurate. I am trying to move to badgerds to see if that helps at all - although that's hardly guaranteed...

I also found out that there is logging control via an environment variable; however, I can't tell which of the various components correspond to the storage backend, so I can't turn on debug logs for just that part to see if it shows any complaints.
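A sketch of what that could look like, assuming the go-log GOLOG_LOG_LEVEL variable Kubo uses; the subsystem names below are guesses, and ipfs log ls prints the actual list:

```sh
# list the available logging subsystems
ipfs log ls

# run the daemon with a quiet default level and selected subsystems at debug
# ("flatfs" and "badger" are guesses at the datastore-related names)
GOLOG_LOG_LEVEL="error,flatfs=debug,badger=debug" ipfs daemon
```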

I am also looking into Kubo plugins... While I am nowhere near an experienced Go developer, I have an idea for addressing the storage situation; more specifically, wrapping a regular datastore (flatfs, levelds, ...) in RClone. Since it, too, is written in Go, it would make a good target to implement a go-ds-rclone plugin on top of. But I haven't found the full specification for the datastore interface. Could you link it, perhaps?

Thanks!
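For reference, the interface a go-ds-rclone datastore would need to satisfy is the one defined in github.com/ipfs/go-datastore. Below is a rough skeleton sketch against recent, context-aware versions of that package; the rcloneDatastore type and its no-op method bodies are hypothetical placeholders, and exact signatures should be double-checked against the package docs.

```go
package rcloneds

import (
	"context"

	ds "github.com/ipfs/go-datastore"
	"github.com/ipfs/go-datastore/query"
)

// rcloneDatastore is a hypothetical placeholder; a real implementation would
// hold an rclone backend handle here and translate datastore keys to remote paths.
type rcloneDatastore struct{}

// Compile-time check that the skeleton satisfies the datastore interface.
var _ ds.Datastore = (*rcloneDatastore)(nil)

func (d *rcloneDatastore) Get(ctx context.Context, k ds.Key) ([]byte, error) {
	return nil, ds.ErrNotFound // stub: would read the value from the remote
}

func (d *rcloneDatastore) Has(ctx context.Context, k ds.Key) (bool, error) {
	return false, nil // stub
}

func (d *rcloneDatastore) GetSize(ctx context.Context, k ds.Key) (int, error) {
	return -1, ds.ErrNotFound // stub
}

func (d *rcloneDatastore) Query(ctx context.Context, q query.Query) (query.Results, error) {
	return query.ResultsWithEntries(q, nil), nil // stub: empty result set
}

func (d *rcloneDatastore) Put(ctx context.Context, k ds.Key, value []byte) error {
	return nil // stub: would write the value to the remote
}

func (d *rcloneDatastore) Delete(ctx context.Context, k ds.Key) error {
	return nil // stub
}

func (d *rcloneDatastore) Sync(ctx context.Context, prefix ds.Key) error {
	return nil // stub: would flush pending writes under the given prefix
}

func (d *rcloneDatastore) Close() error {
	return nil // stub
}
```

Exposing this as a Kubo plugin involves an additional plugin-registration interface on the Kubo side, but the storage logic itself boils down to the interface sketched above.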

IngwiePhoenix commented 1 year ago

@Jorropo

You can (while the node is stopped) save your ipfs/blocks folder and your ipfs/config, remove your ipfs folder, make a new one with ipfs init, and move your config and ipfs/blocks back in.

I tried that, but it keeps telling me that my storage config is off and that the spec in my config does not match the one on disk.

How do I make it aware of my changed (levelds to badgerds) config?