Closed (djdv closed this 4 years ago)
@djdv something that would be useful to think about: are there any parts of unixfsv2 that would be required for this work, or that would be super helpful to have?
Based on the conversation we had in standup yesterday, I'm thinking not. But there was discussion in the IPLD team about a unixfsv1.5 that could potentially ship much sooner if there were a real short-term need for it, and it'd be good to get those needs discovered sooner rather than later.
@andrew One scenario that comes to mind is metadata.
The UFSv1 metadata facilities are limiting (type and size only?), so the approach I have in mind is to sidestep them entirely and instead store metadata for paths in node-local storage, in a way that is particular to one of the APIs that will come out of this. This effectively separates the local metadata from the network-global data and uses the two in tandem during operations that rely on them (most likely in some overlay-like fashion).
As a contrived example, imagine an atime update for the path /ipfs/QmWhatever. The metadata associated with that path would be constructed or modified and flushed to the node's datastore (or elsewhere). This state could be restored later, giving you effective metadata persistence, even on paths that reference immutable data objects.
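The local-metadata idea above can be sketched roughly as follows. This is only an illustration under assumptions: the names `PathMeta`, `Overlay`, `Touch`, `Flush`, and `Restore` are hypothetical and do not come from go-ipfs, and JSON bytes stand in for whatever the node's datastore would actually hold.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PathMeta is a hypothetical record of node-local metadata for one path.
// Times are Unix seconds to keep the sketch small.
type PathMeta struct {
	AtimeUnix int64 `json:"atime"`
	MtimeUnix int64 `json:"mtime"`
}

// Overlay keeps local metadata apart from the immutable, network-global data.
// Keys are paths such as /ipfs/QmWhatever.
type Overlay struct {
	entries map[string]PathMeta
}

// NewOverlay returns an empty overlay.
func NewOverlay() *Overlay {
	return &Overlay{entries: make(map[string]PathMeta)}
}

// Touch constructs or modifies the metadata for path, recording an access time.
func (o *Overlay) Touch(path string, atimeUnix int64) {
	m := o.entries[path] // zero value if the path has no metadata yet
	m.AtimeUnix = atimeUnix
	o.entries[path] = m
}

// Lookup returns the local metadata for path, if any has been recorded.
func (o *Overlay) Lookup(path string) (PathMeta, bool) {
	m, ok := o.entries[path]
	return m, ok
}

// Flush serializes the overlay. A real node would write these bytes to its
// datastore (assumption: the storage backend is out of scope here).
func (o *Overlay) Flush() ([]byte, error) {
	return json.Marshal(o.entries)
}

// Restore rebuilds an overlay from previously flushed bytes.
func Restore(data []byte) (*Overlay, error) {
	o := NewOverlay()
	if err := json.Unmarshal(data, &o.entries); err != nil {
		return nil, err
	}
	return o, nil
}

func main() {
	o := NewOverlay()
	o.Touch("/ipfs/QmWhatever", 1561982400)

	saved, err := o.Flush()
	if err != nil {
		panic(err)
	}

	// Later (for example after a restart) the state is restored.
	restored, err := Restore(saved)
	if err != nil {
		panic(err)
	}
	m, ok := restored.Lookup("/ipfs/QmWhatever")
	fmt.Println(ok, m.AtimeUnix)
}
```

The data behind /ipfs/QmWhatever is never touched; only the overlay entry changes, which is what makes the metadata persistable for immutable objects.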
Since this separation exists, the metadata would have to be synchronized out of band in some manner. One option we talk about often is rsync, but you could imagine handling this internally as well through one of our dynamic channels (pubsub, ipns, just another dag, etc.).
I've also heard people talking about storing dag creation arguments within objects somewhere: things like which chunking arguments were used, the tree style, etc. That would be useful to know. With that information, decision making during writes could be more dynamic, using whatever makes sense given the extra data rather than whatever is deemed standard. For example, don't take a trickle dag and turn it into a balanced/hybrid one when appending.
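A rough sketch of how recorded creation arguments could drive that decision is below. Everything here is hypothetical: `BuildParams`, `Layout`, and `AppendParams` are illustrative names, and where such a record would actually live inside an object is exactly the open question.

```go
package main

import "fmt"

// Layout names the DAG tree style. The values are illustrative labels.
type Layout string

const (
	Balanced Layout = "balanced"
	Trickle  Layout = "trickle"
)

// BuildParams is a hypothetical record of how a file DAG was created.
type BuildParams struct {
	Chunker string // example value: "size-262144"
	Layout  Layout
}

// AppendParams picks the parameters to use when appending to an existing file.
// Recorded values win over the defaults, so a trickle DAG stays a trickle DAG.
// A nil record means nothing is known and the defaults are used unchanged.
func AppendParams(recorded *BuildParams, defaults BuildParams) BuildParams {
	if recorded == nil {
		return defaults
	}
	out := defaults
	if recorded.Chunker != "" {
		out.Chunker = recorded.Chunker
	}
	if recorded.Layout != "" {
		out.Layout = recorded.Layout
	}
	return out
}

func main() {
	defaults := BuildParams{Chunker: "size-262144", Layout: Balanced}

	// The existing file was built as a trickle DAG; no chunker was recorded.
	existing := &BuildParams{Layout: Trickle}

	fmt.Printf("%+v\n", AppendParams(existing, defaults))
	fmt.Printf("%+v\n", AppendParams(nil, defaults))
}
```

Falling back field by field means partially recorded parameters still help, while objects with no record behave exactly as they do today.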
I'm not sure what the state of metadata is in UFSv2, but it would be useful to know what standards exist around metadata, even if I plan to keep mount-specific metadata separate in the short term. It would be nice if mount-specific metadata and UFSv2 metadata structures had some kind of compatibility.
Reading through various repos, I found some of these remarks interesting.
We should probably coordinate around it. Figure out what our hard requirements are, and if they are deeply tied to UFSv2. It may be that the v2 design influences the v1 workarounds, or vice versa.
Randomly cc'ing @warpfork ^These words might make some sense out of context. If so, any input?
I don't have a ton of brain cycles for Unixfsv2 at present, but in case it's useful, here are some links to a particular exploratory romp on the subject:
So, if anyone would like to run with that train of thought by using some schema DSLs to make concrete proposals about metadata -- and several, viably cohabitant schemas that we could pattern-match on in the wild -- that'd be interesting, and I'd try to make a point to be around to chat about it.
If we wish to support file system based package managers, we'll need to accommodate a range of existing expectations, and hopefully exceed them. In general, there should be an easy (and ideally transparent/drop-in) way to both add package repositories into IPFS and get packages out, from the perspective of both maintainers and users.
Getting to that point will require work in multiple areas to cover multiple use cases. Sections of the work done here: https://github.com/ipfs/package-managers/issues/67 will likely become relevant, as we find out what specific issues are present today, and how we may alleviate them.
For example, using data on IPFS with existing tools will certainly involve mounting IPFS and interfacing with APIs like FUSE. Creating new tools, such as a repo syncing tool, IPFS package manager PoC, etc. on top of existing APIs, may prove challenging for certain workloads, or at the very least, contain a lot of overlap. So we should find ways to improve, or supersede these APIs. In the cases where existing things are fit for certain tasks, it may still be hard to make them work together. So we should find better ways to interoperate.
Short term, we can collect and discuss known problem points that are relevant to package management. What exists today, and why is it (or is it not) viable? e.g.
- ipfs files ls, go get: obviated by ipfs ls (at an API level as well)
- ipfs mount: for interacting with IPFS using existing utilities
- ...
Long term, we can build towards a virtual file system interface (VFS), that supersedes MFS and better integrates with other APIs that may be relevant for filesystem based operations (such as UnixfsV2).
In between, I plan to work towards creating and maintaining an experimental VFS API that provides a means of interacting with IPFS (using new methods, and our newer APIs). I also plan to use this API to provide an experimental version of ipfs mount that should aid us in testing and development. In particular, this should help nail down a common set of file system expectations that IPFS implementations provide to developers and users.
This draft is a good starting reference: https://github.com/ipfs/go-ipfs/pull/6036. It recaps work that's already been done on this effort. This was discussed recently at IPFS camp; notes are here: https://github.com/ipfs/camp/blob/master/DEEP_DIVES/31-mounting-an-ipfs-filesystem.md
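To make the VFS idea a little more concrete, here is a minimal sketch of a root that dispatches on the first path component. It is not the proposed API: `FileSystem`, `Root`, `Mount`, and `staticFS` are hypothetical names chosen for illustration, and a real interface would need far more than `List` (open, stat, write, and so on).

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// ErrNotExist is returned when a path cannot be resolved.
var ErrNotExist = errors.New("path does not exist")

// FileSystem is a hypothetical minimal interface shared by every namespace.
type FileSystem interface {
	List(path string) ([]string, error)
}

// Root routes requests to a sub-filesystem based on the first path
// component, e.g. "ipfs" for /ipfs/... paths.
type Root struct {
	mounts map[string]FileSystem
}

// NewRoot returns a root with no namespaces attached.
func NewRoot() *Root {
	return &Root{mounts: make(map[string]FileSystem)}
}

// Mount attaches fs under the given namespace name (without slashes).
func (r *Root) Mount(namespace string, fs FileSystem) {
	r.mounts[namespace] = fs
}

// List resolves path against the mounted namespaces.
// Listing "/" returns the namespace names in sorted order.
func (r *Root) List(path string) ([]string, error) {
	trimmed := strings.Trim(path, "/")
	if trimmed == "" {
		names := make([]string, 0, len(r.mounts))
		for ns := range r.mounts {
			names = append(names, ns)
		}
		sort.Strings(names)
		return names, nil
	}
	parts := strings.SplitN(trimmed, "/", 2)
	fs, ok := r.mounts[parts[0]]
	if !ok {
		return nil, ErrNotExist
	}
	rest := "/"
	if len(parts) == 2 {
		rest += parts[1]
	}
	return fs.List(rest)
}

// staticFS is a stand-in backend mapping directory paths to entry names.
type staticFS map[string][]string

// List returns the entries recorded for path.
func (s staticFS) List(path string) ([]string, error) {
	entries, ok := s[path]
	if !ok {
		return nil, ErrNotExist
	}
	return entries, nil
}

func main() {
	root := NewRoot()
	root.Mount("ipfs", staticFS{"/QmWhatever": {"README.md", "src"}})

	entries, err := root.List("/ipfs/QmWhatever")
	fmt.Println(entries, err)
}
```

Because every namespace satisfies the same interface, a mount implementation (FUSE or otherwise) would only need to talk to the root, which is one way a shared set of file system expectations could be tested in a single place.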
As we proceed, these goals will have to be better divided up and defined. Expect smaller issues to pop up around this in this repo, linked back here as work is done on them (in a similar fashion to: https://github.com/ipfs/go-ipfs/issues/5003).