ipfs / notes

IPFS Collaborative Notebook for Research

Filesystem backed ipfs-watch #434

Open Stebalien opened 4 years ago

Stebalien commented 4 years ago

Proposal: a tool for watching a folder and making it available over IPFS (a) without having to re-add everything any time a file changes and (b) without having to duplicate the file data on disk.

Users currently use the "filestore" feature to add files to go-ipfs without storing the data on disk twice. Unfortunately, the filestore doesn't integrate very well with go-ipfs as-is because go-ipfs expects pinned data to remain available, while files on disk can change. Furthermore, re-syncing a large directory into go-ipfs can be quite expensive.

IMO, the best solution would be to not use the go-ipfs daemon and instead implement an ipfs-watch tool. It would:

  1. Monitor a directory for changes.
  2. When a file is added, it would chunk, hash, and index the file into a database (e.g., sqlite). Then it would use MFS (go-mfs) to add the file to an IPFS directory.
  3. When a file is removed/changed, it would remove references to the file, and remove the file from the IPFS directory structure using MFS.
  4. Finally, whenever the IPFS directory structure changes, the resulting root hash would be (a) printed on standard out and (b) published to IPNS.
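
Steps 2 and 3 could be sketched roughly as follows, in Python for brevity. This is only an illustration under assumptions: the fixed 256 KiB chunk size, the `chunks` table layout, and the names `open_index`, `index_file`, and `forget_file` are all hypothetical, and the SHA-256 digests merely stand in for real IPFS block CIDs. The MFS update and the IPNS publish are left out entirely.

```python
import hashlib
import sqlite3

CHUNK_SIZE = 256 * 1024  # assumed fixed-size chunker; the proposal does not specify one


def open_index(db_path=":memory:"):
    """Open the index database and create the (hypothetical) chunks table."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " path TEXT NOT NULL,"
        " start_byte INTEGER NOT NULL,"
        " size INTEGER NOT NULL,"
        " digest TEXT NOT NULL,"
        " PRIMARY KEY (path, start_byte))"
    )
    return db


def index_file(db, path, chunk_size=CHUNK_SIZE):
    """Chunk and hash a file, recording where each chunk lives on disk.

    Only (path, start_byte, size, digest) is stored, so the file data
    itself is never copied. Returns the number of chunks indexed.
    """
    rows = []
    start = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            # SHA-256 hex digest is a stand-in for a real block CID.
            rows.append((path, start, len(block), hashlib.sha256(block).hexdigest()))
            start += len(block)
    with db:  # one transaction: drop stale entries, then insert fresh ones
        db.execute("DELETE FROM chunks WHERE path = ?", (path,))
        db.executemany("INSERT INTO chunks VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def forget_file(db, path):
    """Remove every reference to a file that was deleted or changed."""
    with db:
        db.execute("DELETE FROM chunks WHERE path = ?", (path,))
```

Re-indexing a changed file is treated here as "forget, then index again", which keeps the table consistent at the cost of rehashing the whole file.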

The database schema would be:

Events:

Prior art and related:

markg85 commented 3 years ago

I like this idea! It makes it possible to have a folder "synced" on IPFS.

But.. I don't really understand the part where it makes it available on IPFS. You say "use MFS (go-mfs) to add the file to an IPFS directory". How does that magically work? Where's the glue that makes it available on IPFS?

Also, why is there a need for a database in this logic? IPFS internally stores data, can't that be (ab)used to store this too?

Edit: What you propose is, on Linux at least, conceptually not that difficult. I use this very same logic with inotify: I watch one folder for changes and index those files in a SQLite database. It allows me to say things like "hey google, ask to play " :)
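
The change-detection half of that pattern can be approximated without inotify by comparing directory snapshots. The sketch below is a portable polling stand-in, not inotify itself and not the code referred to above; `snapshot` and `diff_snapshots` are hypothetical names, and comparing (mtime, size) pairs can miss an edit that happens to preserve both.

```python
import os


def snapshot(root):
    """Map each file's path (relative to root) to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return sorted lists of (added, removed, changed) relative paths."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return added, removed, changed
```

A watcher loop would take a snapshot every few seconds, index the added and changed paths, and drop the removed ones. An inotify-based watcher avoids the repeated directory walk but is Linux-only.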

TheDiscordian commented 3 years ago

> But.. I don't really understand the part where it makes it available on IPFS. You say "use MFS (go-mfs) to add the file to an IPFS directory". How does that magically work? Where's the glue that makes it available on IPFS?

I believe the idea is something similar to what ipfs-sync does, just without using the HTTP API. If you add a directory like /home/user/Documents/MyIPFSWebsite, it's mirrored into MFS as /ipfs-sync/MyIPFSWebsite. As for MFS, it's not magic: MFS works sorta like a pin, so the node makes the data available like it would any other data.
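
The path mapping in that example can be written down as a small pure function. This is only an illustration of the mapping as described here, not ipfs-sync's actual code; `mfs_path_for` and its parameters are hypothetical, and `/ipfs-sync` is simply the prefix from the example above.

```python
import posixpath
from pathlib import PurePosixPath


def mfs_path_for(local_file, watched_dir, mfs_prefix="/ipfs-sync"):
    """Map a file under watched_dir to its mirrored location in MFS.

    The watched directory's own name becomes the first component under
    mfs_prefix, followed by the file's path relative to that directory.
    """
    watched = PurePosixPath(watched_dir)
    rel = PurePosixPath(local_file).relative_to(watched)  # ValueError if outside
    return posixpath.join(mfs_prefix, watched.name, *rel.parts)


print(mfs_path_for(
    "/home/user/Documents/MyIPFSWebsite/css/site.css",
    "/home/user/Documents/MyIPFSWebsite",
))
# prints /ipfs-sync/MyIPFSWebsite/css/site.css
```

Keeping the mapping separate from any I/O makes it easy to check before wiring it to whatever actually writes into MFS.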