ipfs / go-ipld-git

ipld handlers for git objects
MIT License
57 stars · 19 forks

Handle blob objects larger than MessageSizeMax #18

Open ghost opened 6 years ago

ghost commented 6 years ago

go-libp2p-net.MessageSizeMax puts an upper limit of ~4 MiB on the size of messages on a libp2p protocol stream: https://github.com/libp2p/go-libp2p-net/blob/70a8d93f2d8c33b5c1a5f6cc4d2aea21663a264c/interface.go#L20

That means Bitswap will refuse to transfer blocks that are bigger than (1 << 22) - bitswapMsgHeaderLength, while locally these blocks are usable just fine. In unixfs that limit is fine, because we apply chunking. In ipld-git however, we can't apply chunking because we must retain the object's original hash. It's quite common to have files larger than 4 MiB in a Git repository, so we should come with a way forward pretty soon.
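To make the limit concrete, here is a minimal sketch of the size check described above. The header-length constant is a placeholder chosen for illustration; the real framing overhead is internal to Bitswap:

```go
package main

import "fmt"

const (
	// MessageSizeMax mirrors the go-libp2p-net constant: ~4 MiB.
	MessageSizeMax = 1 << 22
	// bitswapMsgHeaderLength is a hypothetical placeholder for the
	// per-message framing overhead; the real value lives in go-bitswap.
	bitswapMsgHeaderLength = 64
)

// fitsInMessage reports whether a block of the given size can be sent
// in a single Bitswap message.
func fitsInMessage(blockSize int) bool {
	return blockSize <= MessageSizeMax-bitswapMsgHeaderLength
}

func main() {
	fmt.Println(fitsInMessage(1 << 20)) // 1 MiB git blob: fits
	fmt.Println(fitsInMessage(6 << 20)) // 6 MiB git blob: refused
}
```

A 6 MiB blob, perfectly usable in the local repo, is simply never transferable under this check.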

Here are four options:

  1. Leave it as is. Very unsatisfactory.
  2. Make MessageSizeMax configurable. Better, but still far from satisfactory.
  3. Make Bitswap capable of message fragmentation. The size limit exists mainly to prevent memory exhaustion from reading big messages that we can't verify and store as we go. We could teach Bitswap how to verify and temporarily store fragmented messages. This would end up overly complex though: these fragments are not IPLD blocks, so we can't reuse the machinery we already have.
  4. Introduce some kind of "virtual" blocks, which look similar to our existing chunking data structures, but whose hash is derived from the concatenated contents of their children. This is of course hacky because we can't verify the virtual block until we have fetched all of its children, but it lets us do option 3 while reusing IPLD and the repo, and we can verify the children as we go.

Related issues: ipfs/go-ipfs#4473 ipfs/go-ipfs#3155 (and slightly less related ipfs/go-ipfs#4280 ipfs/go-ipfs#4378)

Stebalien commented 6 years ago

@lgierth: also https://github.com/ipfs/go-ipld-git/issues/14

Unfortunately, the issue is security. I wrote up the issue and some potential solutions here: https://discuss.ipfs.io/t/git-on-ipfs-links-and-references/730/4

magik6k commented 6 years ago

This is something the ipns helper from git-remote-ipld will attempt to solve by mapping large git objects to ipfs files (so, a form of option 4).

This has 3 disadvantages:

It would be great to have some standard way to feed ipfs a map 'CID <-> CID' saying 'I trust this mapping for fetching large objects'

See https://github.com/magik6k/git-remote-ipld/issues/4

ghost commented 6 years ago

> Unfortunately, the issue is security

I know, security, but what's the threat model? Let's unwrap this a little more before giving it a wontfix stamp. There are two things I can think of:

Now what we want to do here is a pretty bitswap-specific thing: we want to send/receive blocks that are larger than MessageSizeMax. Everything else in IPFS handles blocks larger than that just fine. So ideally a solution would not spill over into other parts of the stack.

Having slept over it, I think Bitswap could also transparently do a mix of option 3 and 4. It could fragment large blocks into smaller children and a parent on the fly, and reassemble the original block once parent and children have been received and validated.

This has the slight disadvantage of roughly double disk space usage, unless we teach the blockstore how to concatenate multiple blocks into one (which feels okay to me). This fragmentation is basically post-hoc chunking, and since the fragments are valid blocks, we can fetch them from multiple nodes in parallel. It's block fragmentation using IPFS and IPLD as building blocks.

The only thing we'd add to the data model is a new IPLD format which has different rules for hashing.

A thing that feels weird but is probably okay: these child blocks can still be bogus data. True, but we still make sure they are valid blocks. Even with non-fragmented blocks you can always receive bogus data, and bitswap rightfully doesn't care as long as it's a valid block. A valid block means we can store and garbage-collect it.

> This is something the ipns helper from git-remote-ipld will attempt to solve by mapping large git objects to ipfs files. (so a form of 4).

Doing some kind of web-of-trust there is a huge can of worms to open. This is git, it should be as simple as possible to build on top for others. Just data structures.

The UX with this solution will suffer badly, since solving this on the application layer also means it doesn't get solved in other applications, let alone other layers. I want to fetch large git blobs ("large" is an exaggeration really) through the gateway/CLI/API, link them into other data structures, transfer them, pin them, etc., all while retaining the original hash.

I think this problem is worthy of a generalized solution - the exact same problem exists for ipld-torrent, and I'm sure we'll also see blockchain transactions bigger than 2 MiB. And we haven't even integrated with that many content-addressed systems yet.

Stebalien commented 6 years ago

Libp2p and go-libp2p-net.MessageSizeMax are really only tangentially related. Libp2p streams are byte streams, not message streams, and can transmit arbitrary bytes. If you try to send more than MessageSizeMax at once, your buffer will likely be split up into multiple messages by the underlying transports and reassembled on the other side. Bitswap happens to use this constant but should probably use a different MaxBitswapBlockSize constant (and MessageSizeMax should be renamed to DefaultMaxMessageSize or something like that).

The issue here is that I could ask some peer for a file that matches hash X and that peer could keep on sending me data claiming that the file will match the hash eventually. That's where all the discussion on the other thread concerning "trusted peers" (or even snarks) comes in. Even if I write the data directly to disk (chunked or not), I still need a way to verify it before I fill up my disk.

Please read at least some of the discussion on the thread I linked.

dvc94ch commented 5 years ago

Can't this issue be solved by including the size of the hashed object in the CID as an additional parameter? This would serve as an upper bound on the amount of data that needs to be downloaded to verify that it's the correct data. A user could then configure a maximum block size, or the CLI could emit a warning if the size is over a certain threshold.

Stebalien commented 5 years ago

It could be. I'm hesitant to do that as we'd need to bump the CID version, again, but that is a decent solution.

magik6k commented 5 years ago

Git objects don't store object sizes with references, so for internal nodes there wouldn't be any good source for that information.

dvc94ch commented 5 years ago

How about keeping the default upper bound and allowing a different upper bound to be passed as an option in the API? Then git-remote-ipld can use some out-of-band mechanism to communicate the size of large git objects. As I understand it, it needs that anyway to resolve branch names to sha1 hashes.

That would be option 2 of the initial suggestions. Or @magik6k's solution of mapping sha1 to CID for large objects is just as good.

Ericson2314 commented 4 years ago

I am interested in this too. I like taking advantage of the streaming nature of SHA-1.

Say we prefix each chunk with the hash state of its predecessor (i.e. the hash state prior to hashing the current chunk)? This is like saying that a large object hashed with a streaming hash function can be viewed as a Merkle tree with the shape of a singly linked list.

What's the security problem? Yes, the previous hash state provided with the block could be bogus, but that issue already exists with regular Merkle intermediate nodes, right? One is always free to create bogus dangling references to get the hash they want, and then solve for that bogus reference, right?

Edit: I guess there is a qualitative difference in solving with a fixed vs. free initial state, even if the attacker has complete freedom over the message.

On the other hand, it is quite useful that sha-1 appends the length at the end of the message stream. That means any attacker needs to commit to a length up front, and cannot waste the target's resources indefinitely.

Edit 2: I elaborated more on this in https://discuss.ipfs.io/t/git-on-ipfs-links-and-references/730/24. Despite the fact that spoofing data and a previous hash state could be much easier, I think it is worth it just for git+sha-1, given this format's ubiquity; we can still ban the general construction for other hash algorithms that are actually secure.

Ericson2314 commented 3 years ago

See https://github.com/protocol/beyond-bitswap/pull/29 and https://github.com/protocol/beyond-bitswap/pull/30 for some progress designing fixes to this.