ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

MFS: how are directory structures formatted inside nodes? #5081

Open schomatis opened 6 years ago

schomatis commented 6 years ago

This issue is a complement to https://github.com/ipfs/go-ipfs/issues/5059.

Within the Adder.addFile function, in the case of a regular file, two functions (with highly confusing names) dominate the code path:

https://github.com/ipfs/go-ipfs/blob/7853e53860805e08a212d78c4baa5d59bff99ba8/core/coreunix/add.go#L475-L481

The first one, add(), is in charge of taking the file, splitting it into chunks and organizing those chunks in a DAG (with a trickle or balanced layout), returning the root node (dagnode). After that addNode() inserts that node (and hence the file) into the MFS file system (the one the user can query through the ipfs files command set). The MFS (Mutable File System) has an empty specification so I have to rely purely on the code to try to understand more of it.
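To make the first of those two steps concrete for myself, here is a minimal sketch of "split into chunks, then organize the chunks in a DAG". It is a toy model and not the go-ipfs code: the `node` type, `chunk`, `buildBalanced` and `countLeaves` are all hypothetical names, and the bottom-up grouping only approximates what a balanced layout looks like.

```go
package main

// node is a toy stand-in for a DAG node (hypothetical, not a go-ipfs type):
// leaves carry data, interior nodes carry only links.
type node struct {
	data  []byte
	links []*node
}

// chunk splits data into consecutive pieces of at most size bytes.
// A non-positive size is treated as "one single chunk".
func chunk(data []byte, size int) [][]byte {
	if size <= 0 {
		size = len(data)
	}
	var out [][]byte
	for len(data) > 0 {
		n := size
		if len(data) < n {
			n = len(data)
		}
		out = append(out, data[:n])
		data = data[n:]
	}
	return out
}

// buildBalanced wraps every chunk in a leaf and then groups nodes under
// parents with at most maxLinks children, level by level, until a single
// root remains. That root is what would later be handed to the MFS step.
func buildBalanced(chunks [][]byte, maxLinks int) *node {
	if maxLinks < 2 {
		maxLinks = 2 // with fewer than 2 links per parent a level never shrinks
	}
	if len(chunks) == 0 {
		return &node{} // empty file: a single empty node
	}
	level := make([]*node, len(chunks))
	for i, c := range chunks {
		level[i] = &node{data: c}
	}
	for len(level) > 1 {
		var next []*node
		for i := 0; i < len(level); i += maxLinks {
			end := i + maxLinks
			if end > len(level) {
				end = len(level)
			}
			next = append(next, &node{links: level[i:end]})
		}
		level = next
	}
	return level[0]
}

// countLeaves returns the number of leaf nodes reachable from n.
func countLeaves(n *node) int {
	if len(n.links) == 0 {
		return 1
	}
	total := 0
	for _, l := range n.links {
		total += countLeaves(l)
	}
	return total
}
```

Calling `buildBalanced(chunk(fileBytes, 4), 2)` returns only the root node; everything below it is reachable through links, which matches the idea that the second step only ever has to deal with that one root.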

The addNode function obtains the MFS Root (whose role is not yet very clear to me, see https://github.com/ipfs/go-ipfs/issues/5066) and creates the necessary directory structure for the path,

https://github.com/ipfs/go-ipfs/blob/7853e53860805e08a212d78c4baa5d59bff99ba8/core/coreunix/add.go#L381-L397

Assuming the directory structure exists (we'll come back later to this point), PutNode will insert the (UnixFS) file into it.

https://github.com/ipfs/go-ipfs/blob/7853e53860805e08a212d78c4baa5d59bff99ba8/mfs/ops.go#L86-L99

It will first look (lookupDir()) for the Directory that will contain the file and add it as a child with AddChild(),

https://github.com/ipfs/go-ipfs/blob/7853e53860805e08a212d78c4baa5d59bff99ba8/mfs/dir.go#L351-L373

This function will (after verifying that the file doesn't already exist) add the root DAG node (which represents the file) to the blockstore (with dserv.Add()) and then insert the node into the DAG that represents the filesystem by calling dirbuilder.AddChild(). Note that this last step is not handled by the mfs.Directory (that we've been studying so far) but by its dirbuilder member, which is a unixfs.io.Directory (the difference between those structures with the same name and the same function name should be discussed in another issue).

https://github.com/ipfs/go-ipfs/blob/7853e53860805e08a212d78c4baa5d59bff99ba8/unixfs/io/dirbuilder.go#L101-L107

In the base case (with HAMT disabled) this just means adding a link to the ProtoNode that stores the UnixFS directory pointing to the root node of the DAG that represents the file (that stores a UnixFS file) with the link name corresponding to that of the file.
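As I understand it, the sequence described above (check that the name is free, store the file's root node, add a named link to the directory node) can be sketched roughly as follows. This is a toy model under my own assumptions, not the real mfs or unixfs/io code: `dagNode`, `link`, `blockstore`, `directory` and `addChild` are hypothetical stand-ins, and the "blockstore" here just remembers which nodes were added.

```go
package main

import "errors"

// link is a named edge from a directory node to a child node.
type link struct {
	name   string
	target *dagNode
}

// dagNode is a toy stand-in for a ProtoNode holding UnixFS data
// (hypothetical type, not the go-ipfs one).
type dagNode struct {
	isDir bool
	links []link
}

// errExists is returned when the directory already has an entry by that name.
var errExists = errors.New("directory already has an entry by that name")

// blockstore is a hypothetical stand-in for the DAG service;
// it only records which nodes were added to it.
type blockstore struct {
	nodes []*dagNode
}

func (b *blockstore) add(n *dagNode) {
	b.nodes = append(b.nodes, n)
}

// directory pairs the node that stores the directory with the store
// that its children are persisted to.
type directory struct {
	node  *dagNode
	store *blockstore
}

// addChild mirrors the three steps described in the text:
// 1. refuse if a link with that name already exists,
// 2. add the child's root node to the store,
// 3. append a link, named after the file, to the directory node.
func (d *directory) addChild(name string, child *dagNode) error {
	for _, l := range d.node.links {
		if l.name == name {
			return errExists
		}
	}
	d.store.add(child)
	d.node.links = append(d.node.links, link{name: name, target: child})
	return nil
}
```

The point of the sketch is that the directory never copies the file: it only gains one link whose name is the file name and whose target is the root of the file's DAG.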

So, the MFS can be seen as a big DAG: nodes holding UnixFS directories may have children (links) pointing to other UnixFS directories or to UnixFS files. The subgraph of all UnixFS directories makes up the filesystem structure. A UnixFS file pointed to by a directory is actually just the root of a DAG of many UnixFS file nodes that together make up one stored file, which in turn can be seen as another subgraph of the MFS (and there is one of these subgraphs for every file). I think the MFS itself can be seen as the subgraph with all UnixFS directories plus the root UnixFS files linked by those directories (deeper UnixFS file nodes don't seem to enter into the MFS logic, just the UnixFS one).

To understand more of the directory structure, let's go back to the lookupDir function (called in PutNode()). It calls the apparently more general Lookup function, which would seem to look for any file, not just directories, and which in turn calls DirLookup, which is extremely confusing (I should raise a separate issue for that),

https://github.com/ipfs/go-ipfs/blob/7853e53860805e08a212d78c4baa5d59bff99ba8/mfs/ops.go#L185-L210

This function seems to iteratively traverse the UnixFS directory subgraph of the DAG, following the given path one component at a time and matching each component against the link names of the current DAG node (which it expects to represent a UnixFS directory).
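A rough sketch of that kind of traversal, to check my reading of it. Again this is a toy model and not the mfs code: `entry` and this `lookupDir` are hypothetical, and a plain map of names stands in for the links of a directory node.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a toy stand-in for a DAG node in the MFS (hypothetical type):
// directories map link names to children, files have no children.
type entry struct {
	isDir    bool
	children map[string]*entry
}

// lookupDir walks the path one component at a time, starting at root.
// Each component is matched against the link names of the current
// directory, and every node along the way must itself be a directory.
func lookupDir(root *entry, path string) (*entry, error) {
	cur := root
	for _, name := range strings.Split(strings.Trim(path, "/"), "/") {
		if name == "" {
			continue // empty path or repeated slashes: stay where we are
		}
		next, ok := cur.children[name]
		if !ok {
			return nil, fmt.Errorf("%q: no such entry", name)
		}
		if !next.isDir {
			return nil, fmt.Errorf("%q: not a directory", name)
		}
		cur = next
	}
	return cur, nil
}
```

With a root that links `docs` to a directory containing `f.txt`, `lookupDir(root, "/docs")` returns the `docs` entry, while `lookupDir(root, "/docs/f.txt")` fails because the last component is a file, which is the behavior I would expect from a lookup that is specific to directories.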

thecipherBlock commented 5 years ago

Hi @schomatis, is there a way to update a folder 'A' [that was created using ipfs.files.mkdir()] without using ipfs.files.write()?

I mean using ipfs.add( { path:<folder_path>, content:<Buffer> } )