Open schomatis opened 6 years ago
@overbool you up for some real challenge? :grin: I may throw some tasks at you to help me understand what's going on there.
@schomatis 👌, I am willing to accept the challenge, come on.
Is this experimental? I see UseHAMTSharding = false and sourcegraph can't find any references to the variable outside the project...
@zot I don't know how good sourcegraph is at finding things, but here are a few references https://grep.app/search?q=UseHAMTSharding
HAMT Sharding is experimental within go-ipfs https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#directory-sharding--hamt
HAMT description
Brief description of what we have at the moment.
The Hash Array Mapped Trie (HAMT, superficially explained in https://en.wikipedia.org/wiki/Hash_array_mapped_trie) is at its heart a tree of DAG nodes that together represent a directory. Instead of storing all of the entries of a directory as DAG links in a single node (the `BasicDirectory` implementation), the `HAMTDirectory` distributes the entries across the nodes of the tree.
The entries are indexed by the hash of their names (to allow for an even distribution). This hash shouldn't be confused with the hash inside the CID that addresses the contents of the entry (node) we're pointing to; the `HAMTDirectory` cares only about the names of the links. Two entries could point to the same content (file) with different names and they would be saved in different positions in this tree.
This tree is actually a prefix tree (trie), so each node holds only a part of the hash of the entry name. All of its children share the prefix (sub-hash) of the parent node, and each child has a different hash continuation to separate different entries.
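To make the "each node holds only a part of the hash" idea concrete, here is a rough sketch of how a name's hash can be cut into one child index per trie level. Everything in it is an assumption for illustration: `childIndexes` is a made-up helper, and `sha256` is only a stand-in so the example stays within the standard library (the real implementation uses its own hash function and bit handling).

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// childIndexes is a hypothetical helper: it hashes the entry name and reads
// bitsPerLevel bits of the hash for each trie level, most significant first.
// Each returned value is the child slot to follow at that depth.
// It returns nil if the hash does not have enough bits for the request.
func childIndexes(name string, bitsPerLevel, levels int) []int {
	sum := sha256.Sum256([]byte(name)) // stand-in hash, not the real one
	if bitsPerLevel*levels > len(sum)*8 {
		return nil
	}
	idx := make([]int, 0, levels)
	bitPos := 0
	for l := 0; l < levels; l++ {
		v := 0
		for b := 0; b < bitsPerLevel; b++ {
			bit := (sum[bitPos/8] >> (7 - uint(bitPos%8))) & 1
			v = v<<1 | int(bit)
			bitPos++
		}
		idx = append(idx, v)
	}
	return idx
}

func main() {
	// Same content under two names would still land in two different places,
	// because only the name is hashed.
	for _, name := range []string{"a.txt", "b.txt"} {
		fmt.Println(name, childIndexes(name, 8, 3))
	}
}
```

With 8 bits per level each node has up to 256 slots, and two names only share a path for as long as their hash prefixes agree.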
From the DAG layer perspective, a big graph representing a directory and the files it contains can be divided into a sub-graph (starting at the root) that represents the directory and, hanging from it, other sub-graphs representing the files it points to.
Implementation
Notes about the implementation (focusing on the more obscure points that need to be refactored).
In a normal trie we could have values at any node in the tree (besides the root). In this trie only leaf nodes have values: since we're using a hash function (of fixed length) to index them, an internal node with a value would mean an entry with a shorter hash (the node's prefix), which is not possible. In this particular implementation the leaf nodes (`shardValue`) are encoded inside the parent to avoid making an extra request at the DAG level (improving performance). The result is that a normal node (`Shard`) can have many values (entries in the directory) since it's encoding many leaf nodes internally. This is all hidden behind the `child` interface, which obscures the code considerably: as a result we have a `Label` method only implemented for the `shardValue` (the `Shard` returns an empty string) and the `Link` method being used for very different purposes in the `Shard` and `shardValue` (one is the link to another node in the `HAMTDirectory` and the other is actually the value stored in the leaf node, which extends beyond the directory).
Another ramification of this design decision is that, to distinguish at the DAG level which link points to another `HAMTDirectory` node and which points to a value (an entry in the directory), the names of the links are overloaded with a hexadecimal string prefix (corresponding to the new part of the hash prefix that is added in the new edge of the trie). The `maxpadlen` attribute encodes the length of that string prefix in the tree: names longer than that are links pointing to entries (where the actual name of the entry is the sub-string after the hash prefix, e.g., `A3file` with a `maxpadlen` of 2) and links with names of length equal to `maxpadlen` just point to another internal node in the directory (that will have a prefix that includes this added sub-hash). Example: https://github.com/ipfs/go-unixfs/blob/master/hamt/hamt.go#L278. The result is the `Shard` holding an attribute (`prefixPadStr`) which just encodes a format string used for a `printf` call, which is not easy to interpret correctly (https://github.com/ipfs/go-unixfs/blob/master/hamt/hamt.go#L87).
Since this is already encoded in existing sharded directories already deployed, I don't think it can be modified (extracting this information from the DAG into the UnixFS layer), but it should be clearly documented and encapsulated in a separate function.
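As a starting point for the "separate function" idea, something along these lines could isolate the link-name convention. This is only a sketch of the encoding as described above, not the code in `hamt.go`: the names `prefixLen`, `linkName` and `parseLinkName` are invented here, and the handling of malformed names is an assumption.

```go
package main

import "fmt"

// prefixLen returns how many hex digits are needed to write the largest
// child index of a node with the given fanout (the role of maxpadlen).
func prefixLen(fanout int) int {
	return len(fmt.Sprintf("%X", fanout-1))
}

// linkName builds a DAG link name for the child slot idx. An empty entryName
// yields a bare prefix, which by convention points to another internal node.
func linkName(idx, padLen int, entryName string) string {
	return fmt.Sprintf("%0*X", padLen, idx) + entryName
}

// parseLinkName splits a DAG link name into its hash prefix and entry name.
// isShard is true when the name is exactly the prefix, i.e. the link points
// to another internal node rather than to a directory entry.
func parseLinkName(name string, padLen int) (prefix, entry string, isShard bool, err error) {
	if len(name) < padLen {
		return "", "", false, fmt.Errorf("link name %q is shorter than the prefix length %d", name, padLen)
	}
	return name[:padLen], name[padLen:], len(name) == padLen, nil
}

func main() {
	pad := prefixLen(256) // 2 hex digits for a fanout of 256

	fmt.Println(linkName(0xA3, pad, "file")) // A3file
	fmt.Println(linkName(0x0F, pad, ""))     // 0F

	prefix, entry, isShard, _ := parseLinkName("A3file", pad)
	fmt.Println(prefix, entry, isShard) // A3 file false
}
```

Keeping both directions (build and parse) next to each other would also give the convention a single place to be documented.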
There are two layers interacting here, DAG and UnixFS. When a directory is being manipulated (adding/removing entries) the information is encoded in the UnixFS layer (through the `children` attribute) and only flushed to the DAG layer (the node containing all this information) when `Node()` is called. At times there is a decoupling, which is to be expected but is not clearly documented, and there is no clear API for how to interact with the `Shard` in that state, e.g., how to enumerate its entries: should the links of the DAG node be checked or the `children` slice? At times only one of those has the accurate information, e.g., when loading a `Shard` from a DAG node `children` is empty (the DAG links should be checked) and when modifying the `Shard` the information is encoded in `children` but not passed to the DAG node. As an example of what would be needed see the `FSNodeOverDag`, which defines a specific API to interact with a mutable object between the two layers.
The internal `ProtoNode` of the `Shard` (`nd`) isn't really used except as a link slice, so if the interaction previously described is clearly defined (and correctly decoupled) there is no need to keep this cache node (which isn't even updated when `Node` is called). Only the links need to be kept, treating them as just a performance improvement: instead of loading all the child nodes immediately we keep the links and request them on an as-needed basis (but they shouldn't be considered to hold the actual state of the node). This may not be possible since the new hash part is encoded in the link name (instead of inside the `Shard` object of the UnixFS layer where it belongs).
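To illustrate the kind of API I have in mind, here is a deliberately simplified toy model. None of these types exist in go-unixfs: `slot`, `shard`, `loadShard`, `set`, `entries` and `flush` are all hypothetical, and link names stand in for real DAG links. The point is only that callers ask one method for the entries and never have to know whether the links or the in-memory children are current.

```go
package main

import "fmt"

// slot pairs what was last persisted in the DAG node with what is currently
// held in memory for one position of the shard.
type slot struct {
	link   string // name as stored in the DAG node
	loaded bool   // true once the in-memory value is the authoritative one
	value  string // in-memory state (only meaningful if loaded)
}

type shard struct {
	slots []slot
}

// loadShard builds a shard from the link names of a DAG node. Nothing is
// loaded yet, so the links are the only source of information.
func loadShard(linkNames []string) *shard {
	s := &shard{slots: make([]slot, len(linkNames))}
	for i, n := range linkNames {
		s.slots[i].link = n
	}
	return s
}

// set modifies a slot in memory only; the persisted link is left untouched
// until flush is called.
func (s *shard) set(i int, name string) {
	s.slots[i].value = name
	s.slots[i].loaded = true
}

// entries is the single enumeration point: for every slot it returns the
// in-memory value if there is one and falls back to the link otherwise.
func (s *shard) entries() []string {
	out := make([]string, 0, len(s.slots))
	for _, sl := range s.slots {
		if sl.loaded {
			out = append(out, sl.value)
		} else {
			out = append(out, sl.link)
		}
	}
	return out
}

// flush plays the role of Node(): it writes the in-memory state back to the
// links and returns the names that would be persisted.
func (s *shard) flush() []string {
	names := make([]string, len(s.slots))
	for i := range s.slots {
		if s.slots[i].loaded {
			s.slots[i].link = s.slots[i].value
		}
		names[i] = s.slots[i].link
	}
	return names
}

func main() {
	s := loadShard([]string{"A3file", "0F"})
	s.set(1, "0Fother")
	fmt.Println(s.entries()) // [A3file 0Fother]
	fmt.Println(s.flush())   // [A3file 0Fother]
}
```

In this model the links are just a lazily consulted cache, which is the role described above; whether the real `Shard` can be shaped this way depends on the link-name encoding issue.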
".===
Sub issues: