ipfs / go-unixfs

Implementation of a unix-like filesystem on top of an ipld merkledag
MIT License
106 stars 53 forks source link

Remove the internal `ProtoNode` from the `Shard` structure #35

Closed schomatis closed 5 years ago

schomatis commented 6 years ago

Background: https://github.com/ipfs/boxo/issues/387.

The Shard object (UnixFS layer) stores the proto-node (DAG layer) from where it was created (NewHamtFromDag). It uses it as a kind of cache (mostly for its links I'm guessing) but its state is not clear at all (to the point that the Node method doesn't even update it) so internal functions can easily misuse it.

I'm not sure what are the ramifications of all this so the first step would be to remove it and replace it with a link slice copied from the original node (we were already copying the entire node in NewHamtFromDag so this won't take a performance hit). This slice would just be a lazy-loading mechanism of the Shards but once the Shard is loaded we would no longer have use for the link which could be niled (we need to check this). I really want to avoid the kind of double manipulation of the shard state like in insertChild where we update both children and nd.Links. From now on children would be the single place of truth to query the shard state. After this change we can evaluate actually having a cache in the form of a proto-node which actually would be updated in Node() to avoid re-marshaling everything every time the underlying DAG proto-node is needed (that is, if we modified a single entry in the HAMT directory only update the respective nodes but not the entire trie).

schomatis commented 6 years ago

/cc @overbool

overbool commented 6 years ago

@schomatis 👌

overbool commented 6 years ago

I really want to avoid the kind of double manipulation of the shard state like in insertChild where we update both children and nd.Links.

@schomatis could u explain it in more detail?

IMO, we must update both children and nd.Links at the same time, otherwise, we can't get the correct node from nd.Links(because the index had changed).

schomatis commented 6 years ago

@schomatis could u explain it in more detail?

My original idea was to have the children slice, and nothing else, to reflect the state of the node, to avoid confusions. Ideally (if performance wasn't a concern) when loading a shard from a DAG node I would load all the links, populate the entire children slice and discard the entire original node (with its links), I would then work on that slice and only reconstruct the node and its links when explicitly requested through the Node()method. I think that would be the most clear approach on how to handle the shard state.

Now back on reality, to avoid impacting performance, instead of loading all the children preemptively I should keep the links and load them one by one only as needed; when I load the shard the link is pointing to I set it to nil to explicitly state: "hey, this link over here is no longer valid, the content it pointed to is now being manipulated in children so the link now points to (potentially) outdated content, don't use it anymore".

(because the index had changed).

If by that you mean that some children may be added/removed and the actual order might not correspond with the link order anymore I'm just now realizing why were the node links manipulated with nil values like in insertChild: to preserve that order. We can keep doing that but I'd rather have a separate bitfield or slice to keep a record of the original positions those links pointed to, that is, you'd have two copies of the original bitfield stored in the node, one would be assigned to the links (it would only decrease in size, that is, we'll only turn 1's to 0's) and the other one would be the normal bitfield used in tandem with the children.

Those two (links and this new bitfield) could be encapsulated in a structure with a single method like getLink(childIndex) called inside loadChild that would access the link with that index and it would then set the bitfield to zero and the link to nil (to make sure we don't access it again, we should have it stored in children from now on).

This would increase (slightly) the consumed memory of a shard but it is still heavily outweighed by the memory reclaimed by not storing an entire node anymore.

overbool commented 6 years ago

Those two (links and this new bitfield) could be encapsulated in a structure with a single method like getLink(childIndex) called inside loadChild that would access the link with that index and it would then set the bitfield to zero and the link to nil (to make sure we don't access it again, we should have it stored in children from now on).

@schomatis IMO, this is good advice. However, if we refactor the hamt code like this way, maybe we need change some code. Because we need name-hash (used by original bitfield) to get original link but some functions like loadChild, getChild, walkTrie will take in a slice index (idx := ds.indexForBitPos(name-hash)) instead of name-hash(we wanted). In this way, we can't get the name-hash inner those functions mentioned above, so we can't get the correct link from original links.

Here is an example:

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L471-L475

getChild maybe will get link from original link, but getChild only takes in the slice index.

Do you have any good ideas for this problem?

schomatis commented 6 years ago

Sorry, I'm having some trouble understanding the problem,

What is name-hash? I mean, what variable/structure are we referencing? Is it the content of the Name of the internal links (links that point to other shards)?

some functions like loadChild, getChild, walkTrie will take in a slice index (idx := ds.indexForBitPos(name-hash)) instead of name-hash(we wanted).

indexForBitPos references the (original) bit-field associated with the children slice which is what we want to use as a reference throughout the code.

In this way, we can't get the name-hash inner those functions mentioned above, so we can't get the correct link from original links.

Why?

The idea is to have two bit-fields, one for children and one for the original links because those two will start having different orders as the shard is being manipulated (and some child nodes are added/removed).

getChild maybe will get link from original link, but getChild only takes in the slice index.

Yes, from now on the children slice index is the source of truth, so that index is actually the child index (should be renamed accordingly), that index will work for both the bit-field associated with the children slice and the bit-field of the original links.

overbool commented 6 years ago

@schomatis Sorry for my vague explanation.

What is name-hash? I mean, what variable/structure are we referencing? Is it the content of the Name of the internal links (links that point to other shards)?

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L378-L385

When I say name-hash, I mean idx (in the code above). That is to say, in cindex := ds.indexForBitPos(idx), idx is what I say name-hash, cindex is what I say slice-index.

If we wanna get original links, we should use name-hash (used by original bitfield) instead of slice-index.

My problem is getChild takes in slice-index instead of name-hash, but maybe we need use name-hash to get origin links when ds.children[slice-index] is nil.

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L281-L296

As shown above, if ds.children[i] is nil, we will call loadChild. However, loadChild takes in slice-index instead of name-hash.

The solution I can think of at the moment is that getChild and loadChild takes in name-hash instead of slice-index. However, I don't know if this is the best way.

schomatis commented 6 years ago

First of all, this back and forth is great feedback on some terminology will need to clarify in the code :)

The missing link you're referring to is the second bit-field discussed above. When you are looking for a nonexistent (nil) child i you'll need to load it from the links slice. That index i can't just be applied to the link slice, you'll need to use a function like indexForBitPos but for this second bit-field that will transform that i into an index for the link slice, that is, instead of calling getChild with cindex

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L379-L383

you should call it with the bit index idx and transform that to the index of the links slice (and not the children slice).

schomatis commented 6 years ago

A note on terminology so we can be on the same page, each shard (backed by a DAG node) can have a maximum of, say, 256 (DefaultShardWidth) children, when storing a new link (either pointing to another shard or an entry in the directory) the hash of its name will tell us in what position to store it. That position is the child index. As a performance improvement, instead of having a slice of 256 positions mostly empty with a few children scattered around (because of the dispersion of the hash function the children won't take position 1,2,3 but rather 56, 143, 210) we represent that sparse array as two objects: a slice with the actual children (no matter what position they actually have) and a bit array that tell us what position each children belongs to (it's cheaper to have 256 bit-field than a 256-size slice). So the first child in that slice (children as named in the code) has a an actual child index corresponding to the position of the first one in the bit-field, e.g., if the first one in the bit-field is in position 45, then the first child in the slice actually has that position (not a child position of 1); and so on.

So the child index corresponds to the bit index when talking about bit-fields but it's stored in a different slice index when actually accessing it. The conversion between child index (also bit index) and the slice index is performed through the bit-field function OnesBefore, e.g.,

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L572-L577

schomatis commented 6 years ago

So, let's group the bit-field and the slice into a structure named sparse array (pretty sure there's a more appropriate name for it), now we'll have two of those sparse arrays, the original child sparse array (children + bitfield in the code) and a new link sparse array containing the array of links and another bit-field to signal what child index those links are referring to. Originally the bit-field (child indexes) of both sparse arrays will be the same because the children are loaded from the links, but as we add/remove children in the children sparse array its bit-field will differ from the one in the link sparse array and we can't use the same index for both. The code should largely only talk about child indexes, how is that child index represented inside a sparse array should be internal to that structure, i.e., there shouldn't be cindex (indexes of the internal slices) lying around.

schomatis commented 6 years ago

Sorry, forgot to clarify, then the getChild problem you're correctly pointing out is that the index it's receiving (not clarified at all) is a slice index that only works for the children sparse array, you can't use it in the link sparse array, that method (and some more I imagine) need to always receive child indexes (aka bit-field indexes) that can be used in both sparse arrays, in the children (if the child already exists) or in the link sparse array (if the child doesn't exist and we need to load it from the links).

overbool commented 6 years ago

@schomatis Thank you for your detailed answer, I see what you mean.

But there's another problem.

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L471-L480

When we wanna traverse all children shard, we will use cindex to get children[cindex]

My solution is to traverse each bit in bitfield, just like Node(), but this may be an inefficient solution.

https://github.com/ipfs/go-unixfs/blob/e8af7a6b5b5588835fb856e35024d9163361c7ab/hamt/hamt.go#L142-L150

schomatis commented 6 years ago

Good point.

My solution is to traverse each bit in bitfield, just like Node():

Ideally I would agree but that may have a performance impact (you'd be iterating 256 positions for every shard on the trie).

In that case we have to load all of the links into children. In a separate function walk the link slice and load it into the children sparse array, effectively emptying the link sparse array and moving it into the children sparse array, we need to have all the shards loaded to walk the trie and apply the cb function. Then you can just walk the children slice as its currently doing.

overbool commented 6 years ago

@schomatis It's a good idea to use a two bitfield(child bitfield and link bitfield) to reduce operations. Otherwise, we must manipulate the child slice and link slice in the meantime when we add child into shard in our original implementation.

However, when I finished implementing this logic, I found that using two bitfield adds a number of bitfield operations. Here is the example. There are 4 basic operations about shard: addChild, rmChild, setChild(update) and traversalAllchildren(Node() or walk() function).

  1. addChild
original implementation:
    1. add children into children slice
    2. add nil into links slice(in order to keep index order with children slice)
    3. set bit in children bitfield

2 bitfield implementation:
     1. add children into children slice
     2. set bit in children bitfield

we can conclude from above explanation that the way of using 2 bitfield doesn't need manipulate links slice.

  1. rmChild
original implementation:
    1. rm children from children slice
    2. rm link from links slice(in order to keep index order with children slice)
    3. unset bit in children bitfield

2 bitfield implementation:
    1. rm children from children slice
    2. rm link from links slice(in order to keep index order with children slice)
    3. unset bit in children bitfield
    4. unset bit in link bitfield
  1. setChild(update)
original implementation:
    1. just set children into children slice(update)

2 bitfield implementation:
     1. set children into children slice(update)
     2. rm this children from link slice
     3. unset bit in link bitfield
  1. traversalAllchildren
original implementation:
    1. get each child through the child slice, if the child is nil we will get link from link slice

2 bitfield implementation:
     1. we have to load all of the link slice into children slice
     2. Then get each child through the child slice

We can conclude from above explanation that the way of using 2 bitfield in setChild and rmChild will need more bitfield operations and traversal all children operation become more complicated.

Therefore, IMO, I think maybe original implementation is cleaner. What we need is to group the bitfield , children slice and link slice into a structure(like the follow code) and provide a uniform operation of functions.

type something struct {
  bitfield bitfield.Bitfield
  children []*Shard
  links []*ipld.Link
}

Here is the example about uniform operation functions.

  1. rm children

    func (s *something) rmChildren(idx int) {
        i := s.bitfield.OnesBefore(idx)
    
       // rm children from children slice
    copy(s.children[i:], s.children[i+1:])
    s.children = s.children[:len(s.children)-1]
    
        // rm link from links slice
        copy(s.links[i:], s.links[i+1:])
    s.links = s.links[:len(s.links)-1]
    
    s.bitfield.UnsetBit(idx)
    } 
  2. add children

    func (s *something) addChildren(idx int) {
        i := s.bitfield.OnesBefore(idx)
    
        // add children into children slice
        s.children = append(s.children[:i], append([]*Shard{sv}, s.children[i:]...)...)
        // add nil into links slice
        s.links = append(s.links[:i], append(nil, s.links[i:]...)...)
    
    s.bitfield.SetBit(idx)
    } 

    WDYT?

schomatis commented 6 years ago

Thanks for the detailed analysis, I need to read this closely, but my answer would be something like:

  1. You don't need to keep the link slice updated because now we're making the child slice the only source of truth, the link slice is only used for loading children the first time, e.g., rmChild: you only remove the child in children because that child position already doesn't exist in the link slice since it was niled when loading the child you're trying to remove in the first place (kind of a lazy loading pattern). Similarly in other functions, the fact that you're manipulating a child in the children slice with, say, index 4 means that the link slice doesn't contain anything there, because index 4 was already consumed during the initial loading of the child that now resides in the index 4 of the child slice (I'm always talking about child indexes here, aka bit-field indexes, not slice indexes).

  2. More importantly (but more opinionated so putting this second) even if we indeed needed those operations I don't count code complexity like this:

2 bitfield implementation:
    1. rm children from children slice
    2. rm link from links slice(in order to keep index order with children slice)
    3. unset bit in children bitfield
    4. unset bit in link bitfield

If we correctly encapsulated the bitfield and the slice in something like the sparseSlice structure (described above) there are only two operations: sparseSlice.remove() for both the sparseSlice variable children and links; so conceptually I only have one kind of operation, a deletion (happening in two different variables) which is much simpler to process for a human reader (because if you already understood how sparseSlice.remove() works seeing it twice doesn't make the code much more complicated). But again, we don't need to keep the link slice updated.

Again, even if I don't agree with your analysis I really really appreciate the time, effort, dedication and attention to detail invested in it :)

schomatis commented 6 years ago

Hey, on second thought, I'm not so sure about point 1, there may be cases where the consumer could add and remove without checking if there was something (getChild) in the first place, and I'm over-complicating things with sparseSlice, especially since Go doesn't have templates and those slices would need to contain different things (links and shards).

Anyway, let's go with your proposed solution, it seems like a good first approach (we would be encapsulating the code that handles the slices and bit-fields) and we can iterate from there, just don't name the structure something :laughing:

overbool commented 6 years ago

@schomatis In 2 bitfield implementation, the reason I keep the link slice updated when we manipulate children slice(rmChild and setChild(update)) is because we need traversal all children item correctly.

When we update or rm children, I think we should rm the corresponding link in links slice. Otherwise, when we load all of the link slice into children slice, we will load repeated link into children slice.(Because we have no idea what link we should load into children).

Anyway, thank you very much for your answer and explanation.

overbool commented 6 years ago

The main reason I think maybe 2 bitfield implementation is not so good, is because traversaling all children gets more complicated.

schomatis commented 6 years ago

Agreed! :+1: