lidel commented 2 years ago

@aschmahmann @petar I remember we discussed this a while ago as low-hanging fruit for bigger data providers like Pinata, but I was unable to find an issue, so I created this one.

Improving provider strategies was previously discussed in: https://github.com/ipfs/go-ipfs/issues/6221, https://github.com/ipfs/go-ipfs/issues/5774, https://github.com/ipfs-inactive/package-managers/issues/84. In this issue I want to propose a well-scoped improvement: a codec-aware strategy that could be shipped without refactoring the entire system.

TLDR

Add a new (opt-in) strategy: when announcing a big UnixFS directory tree, only announce root blocks of directories and files, and skip all internal file data blocks.
Leverage full content path for finding providers of root blocks.

Problem statement

Right now, we support three values in Reprovider.Strategy, which tells the reprovider what should be announced. Valid strategies are:

"all" - announce all stored data (this is also the implicit default)
"pinned" - only announce pinned data
"roots" - only announce directly pinned keys and root keys of recursive pins

If the repository gets too big, all and pinned become too expensive, and folks are forced to use roots, which is codec-agnostic and only announces the root block of a UnixFS DAG.

This means that for big UnixFS datasets, the user has to write additional orchestration code to go the extra mile and manually pin every file within a bigger DAG, and make sure those sub-pins are removed when the entire DAG is no longer needed.

Proposed solution: codec-aware (UnixFs) strategy

Depending on the codec, different blocks may have different importance. In the case of UnixFS, the important blocks are the manifest (root) blocks of directories and files. Sub-blocks of individual files, which hold the data itself, are not as critical as those manifest blocks. It is the CID of the manifest block that is looked up on the DHT first.

A big data provider may want to opt in to a codec-aware strategy as a "best-effort" way to provide something on the DHT rather than nothing: in the case of UnixFS, only provide these manifest blocks on the DHT, facilitating the initial lookup without the cost of announcing all the sub-blocks.
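The selection rule described above can be sketched roughly as follows. This is only an illustration over a simplified in-memory tree, not kubo code: the `Node` and `Kind` types, the string CIDs, and the `ManifestCIDs` function are all hypothetical names made up for this example.

```go
package main

import "fmt"

// Kind labels a node in a simplified, in-memory model of a UnixFS DAG.
// These types are hypothetical and are not part of kubo or boxo.
type Kind int

const (
	Dir   Kind = iota // directory root block
	File              // file root (manifest) block
	Chunk             // internal file data block
)

// Node stands in for a block; CID is a plain string for illustration only.
type Node struct {
	CID      string
	Kind     Kind
	Children []*Node
}

// ManifestCIDs returns the CIDs of directory and file root blocks in
// depth-first order. It never descends into a file, so internal data
// blocks are neither visited nor returned.
func ManifestCIDs(n *Node) []string {
	switch n.Kind {
	case Chunk:
		return nil
	case File:
		return []string{n.CID}
	}
	out := []string{n.CID}
	for _, c := range n.Children {
		out = append(out, ManifestCIDs(c)...)
	}
	return out
}

func main() {
	tree := &Node{CID: "dir-root", Kind: Dir, Children: []*Node{
		{CID: "file-a", Kind: File, Children: []*Node{
			{CID: "a-chunk-0", Kind: Chunk},
			{CID: "a-chunk-1", Kind: Chunk},
		}},
		{CID: "sub-dir", Kind: Dir, Children: []*Node{
			{CID: "file-b", Kind: File},
		}},
	}}
	for _, c := range ManifestCIDs(tree) {
		fmt.Println("announce", c)
	}
}
```

In this toy tree the two chunk blocks of file-a are skipped, so the number of announcements scales with the number of files and directories rather than with the amount of data.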

Open questions

Is announcing of those UnixFS root blocks enough?
- Depends. After the manifest block of a big file is fetched, the user is already connected to a peer that most likely has the rest of the blocks, and the transfer can continue over Bitswap. But if the transfer gets interrupted and the connection is lost, it is not possible to resume: the root block is already in the local store, so we only look up the missing sub-blocks, which were never announced on the DHT.
- A potential fix would be to do the DHT lookup not only for a specific sub-block in a file, but also for the first UnixFS root block above it (either the root of a file, or a parent directory). The rationale is that if someone has the root of a file, they most likely have the rest.
  - We track this in https://github.com/ipfs/kubo/issues/10251
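
The fallback lookup described in the bullets above could look roughly like the sketch below. It assumes the caller already knows the content path, so it can pass the enclosing UnixFS roots ordered nearest first. The `DHT` map and `FindWithFallback` are hypothetical stand-ins for illustration, not the real routing API.

```go
package main

import "fmt"

// DHT is a toy provider index: CID -> peers that announced it.
// It is a hypothetical stand-in for a real routing system.
type DHT map[string][]string

// FindWithFallback looks up providers for the wanted block first, then for
// each enclosing UnixFS root, nearest first. It returns the peers found and
// the CID whose provider record matched, or (nil, "") if nothing matched.
func FindWithFallback(dht DHT, want string, enclosingRoots []string) ([]string, string) {
	candidates := append([]string{want}, enclosingRoots...)
	for _, cid := range candidates {
		if peers := dht[cid]; len(peers) > 0 {
			return peers, cid
		}
	}
	return nil, ""
}

func main() {
	// Only the file root was announced; its chunks were not.
	dht := DHT{"file-root": {"peerA"}}

	peers, via := FindWithFallback(dht, "chunk-7", []string{"file-root", "dir-root"})
	fmt.Printf("providers %v found via %s\n", peers, via)
}
```

A peer found this way is only a likely holder of the missing sub-block, so a real implementation would still have to ask it for the block and handle a miss.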

Jorropo commented 2 years ago

> Potential fix would be to do DHT lookup not only for a specific sub-block in a file, but also for the first UnixFS root block above them (either a root of a file, or a parent directory). Rationale being, if someone has the root of a file, they most likely have the rest.

This seems reasonable. (I was actually writing this before I had fully read your message.)

The biggest issue with this is that we unofficially create a special case that treats UnixFS files as single independent entities, and that makes it harder to create new interesting cross-file features in the future.

Two features that I would like to have would break under such a special case:

Content-based chunking. Let's assume I add a .car archive to IPFS (you might think "that's dumb, just add the blocks", but no, this is meant for extra support: my pinning service doesn't support a fancy DAG format that I want to use). So I make a .car archive of my blocks and chunk it so that the chunk boundaries match the block blobs in the .car exactly, using raw leaves. Then, when I want to download it, I use multihash-addressed requests (v0.12.0 blockstore update). The downloader thinks it is downloading a dag-turbo-3000 object, and the pinning node thinks it serves dag-unixfs -> raw-leaf, but both agree because in the end their hashes match. However, with this proposal the pinning service would announce what it thinks is the true root (the root of the .car), while the downloader would search for the root of the dag-turbo-3000 object (which the pinning service does have, it just thinks it's a boring raw leaf).
Delta adds. We could add a --delta=<CID> option to add (or make it a standalone thing, the details are not important). This would use a chunking strategy that assumes all blocks in --delta are free and tries to reuse them as much as possible. This would make for cheap incremental updates (note that this would not be that good, because we would be limited to blocks; more advanced deltas that can pick variable sizes and arbitrary offsets are far more efficient, but also more expensive to compute and atrocious if you are trying to unthread a very long chain of deltas). Let's assume I download a new version of my app. 90% of the blocks are the same as before, but 10% are new. We can assume that a lot of people already serve the old version, but not many serve the new one. I would have trouble finding nodes serving the old version, even though they hold most of the blocks I need, since they would announce the old root CID and I would search for the new one. (Note: I assume the downloading node doesn't already have the original delta CID.)

What I would like to see.

I would like to see some priority system. Advertising all CIDs is expensive and only useful in certain rare scenarios, or in scenarios that don't even exist yet. I think it would be nice if we could layer strategies. So my node would burn full speed at 1200% CPU until all directories and roots of files are published, which would hopefully take a minute, and then go into a throttled mode at 200% where it publishes all CIDs over the next 3 hours or so.
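
A minimal sketch of that layering, assuming each block is already tagged as a root or not. The `Block` type and the `Layered` function are hypothetical names for illustration, and throttling of the second pass is deliberately left out.

```go
package main

import "fmt"

// Block is a hypothetical record of something the node could announce.
type Block struct {
	CID    string
	IsRoot bool // true for directory and file root blocks (assumed to be known)
}

// Layered splits blocks into two passes, preserving input order within each:
// roots go into the first (full speed) pass, everything else into the
// second (throttled) pass.
func Layered(blocks []Block) (first, second []string) {
	for _, b := range blocks {
		if b.IsRoot {
			first = append(first, b.CID)
		} else {
			second = append(second, b.CID)
		}
	}
	return first, second
}

func main() {
	first, second := Layered([]Block{
		{CID: "dir-root", IsRoot: true},
		{CID: "chunk-0"},
		{CID: "file-a", IsRoot: true},
		{CID: "chunk-1"},
	})
	fmt.Println("full speed pass:", first)
	fmt.Println("throttled pass: ", second)
}
```

A real reprovider would stream CIDs instead of collecting slices and would rate-limit the second pass; the sketch only shows the ordering.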

lidel commented 11 months ago

For the announcement problem:
- IPIP-402 introduced the concept of an "entity". We could reuse it here and have Reprovider.Strategy: entities, which only announces the minimal set of blocks required for enumeration. For a file or a DAG-CBOR document, that will be a single root block. For a HAMT-sharded UnixFS directory, it would be the HAMT blocks.
For the content lookup / resume problem:
- Clarify docs until we improve implementation: https://github.com/ipfs/kubo/pull/10249
- Improve implementation, make every "get block" operation aware of the content path affinity: https://github.com/ipfs/kubo/issues/10251

ipfs / kubo

Improved Reprovider.Strategy for entity DAGs (HAMT/UnixFS dirs, big files) #8676
