rvagg / js-ipld-hashmap

An associative array Map-type data structure for very large, distributed data sets built on IPLD

Could `.load()` and `.save()` work with `Block`s instead of raw bytes? #32

Open inverted-capital opened 2 years ago

inverted-capital commented 2 years ago

In the load function and in the save function, bytes are converted to and from multiformats/block objects.

When working with this library, we find ourselves doing the same conversion in order to perform further operations on the data structure. Could the interface be changed to load and save Block objects directly, allowing more external usage options?

Admittedly, our goal here is to lightly load the HAMT and interact with it, and to calculate the diff between two large HAMTs without fully loading either. Should we instead direct our efforts to implementing these features inside your library, or are they out of scope?

By light loading, we mean ensuring that certain keys are already loaded in the HAMT so we aren't stuck waiting for IPFS to fetch them. By diffing, we mean setting a checkpoint, doing a bunch of interactions with the HAMT, then writing them all out at once while keeping track of what changed so we can report the diff. This is the primary usage that would benefit from batching in our use case.
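To make the checkpoint idea concrete, here is a rough sketch of what we currently do outside the library. It assumes the loader handed to the library has the shape `get(cid)` returning bytes and `put(cid, bytes)`; `checkpointLoader`, `preload` and `flush` are our own hypothetical names, not part of ipld-hashmap. Note that this tracks changes at the block level, not per key, and that it warms blocks by CID (warming by key would mean reading those keys through this loader first).

```javascript
// Hypothetical wrapper around a bytes-based loader (assumed shape:
// get(cid) -> bytes, put(cid, bytes)). Reads are cached in memory and
// writes are buffered until flush(), which reports what was written.
function checkpointLoader (backing) {
  const cache = new Map() // cid string -> bytes already fetched or flushed
  const pending = new Map() // cid string -> { cid, bytes } written since the checkpoint

  async function get (cid) {
    const key = cid.toString()
    if (pending.has(key)) return pending.get(key).bytes
    if (cache.has(key)) return cache.get(key)
    const bytes = await backing.get(cid)
    cache.set(key, bytes)
    return bytes
  }

  // Buffer the write; nothing reaches the backing store until flush()
  async function put (cid, bytes) {
    pending.set(cid.toString(), { cid, bytes })
  }

  // Warm the cache so later reads of these blocks do not wait on the network
  async function preload (cids) {
    await Promise.all(cids.map((cid) => get(cid)))
  }

  // Write everything buffered since the checkpoint and return the CIDs written
  async function flush () {
    const written = [...pending.entries()]
    for (const [key, { cid, bytes }] of written) {
      await backing.put(cid, bytes)
      cache.set(key, bytes)
    }
    pending.clear()
    return written.map(([, entry]) => entry.cid)
  }

  return { get, put, preload, flush }
}
```

With Blocks instead of bytes at this boundary, the wrapper could inspect the decoded node values as well, which is what a key-level diff would need.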

inverted-capital commented 2 years ago

So the interface we're proposing is:

get( cid ) -> Block
put( Block ) 

The reason is that ipld-hashmap is already opinionated about using IPLD primitives, and providing Blocks opens up more uses than raw bytes do.
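For illustration, a Block-based store like the one proposed above could still be adapted to a bytes-based loader, so supporting both might not be much extra surface. This is only a sketch: it assumes the current loader shape is `get(cid)` returning bytes and `put(cid, bytes)`, that a Block exposes `.cid` and `.bytes`, and that the caller supplies a `toBlock(cid, bytes)` function (for example a thin wrapper around multiformats' block creation). `bytesLoaderFromBlockStore` and `memoryBlockStore` are hypothetical names.

```javascript
// Hypothetical adapter: presents a bytes-based loader on top of a store
// that speaks the proposed interface, get(cid) -> Block and put(Block).
// `toBlock(cid, bytes)` is supplied by the caller and must return an
// object exposing at least `.cid` and `.bytes`.
function bytesLoaderFromBlockStore (blockStore, toBlock) {
  return {
    async get (cid) {
      const block = await blockStore.get(cid)
      return block.bytes
    },
    async put (cid, bytes) {
      const block = await toBlock(cid, bytes)
      await blockStore.put(block)
    }
  }
}

// Minimal in-memory store implementing the proposed Block interface,
// keyed by the string form of each Block's CID.
function memoryBlockStore () {
  const blocks = new Map()
  return {
    async get (cid) {
      const block = blocks.get(cid.toString())
      if (!block) throw new Error(`Block not found: ${cid}`)
      return block
    },
    async put (block) {
      blocks.set(block.cid.toString(), block)
    }
  }
}
```

The reverse direction (Blocks from a bytes-only loader) is the conversion we keep repeating today, which is why we would prefer Blocks at the boundary.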