libp2p / rust-libp2p

The Rust Implementation of the libp2p networking stack.
https://libp2p.io
MIT License

protocols/bitswap: Add BitSwap implementation #2632

Open SionoiS opened 2 years ago

SionoiS commented 2 years ago

Hi,

Just wanted to get the ball rolling on this, as discussed on Discord.

Some links from a quick search:

https://github.com/rs-ipfs/rust-ipfs/tree/master/bitswap
https://github.com/ipfs-rust/libp2p-bitswap
https://github.com/ChainSafe/libp2p-bitswap/
https://docs.substrate.io/rustdocs/latest/src/sc_network/bitswap.rs.html
https://github.com/n0-computer/iroh/tree/main/iroh-bitswap/src

Drop your own below!

My experience with Bitswap is limited. I had to slightly modify rust-ipfs impl. of it for my Testground uses.

Here are some points I find important:

  1. Allow any block providing strategy.
  2. Provide all possible events.
  3. Stats tracking.

The rust-ipfs impl. did allow for different strategies but was missing a "block sent complete" event I needed.
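To make the three points above concrete, here is a rough sketch of how they could fit together. Everything in it is hypothetical (`BlockProvider`, `Event`, `Stats`, `Engine`, and the `PeerId`/`BlockId` aliases are made up for illustration and are not taken from rust-ipfs or rust-libp2p), and nothing is actually sent over a network.

```rust
use std::collections::HashMap;

/// Hypothetical block identifier; a real implementation would use a CID type.
pub type BlockId = Vec<u8>;
/// Hypothetical peer identifier standing in for libp2p's `PeerId`.
pub type PeerId = u64;

/// Point 1: the application decides whether and how a block is served.
pub trait BlockProvider {
    fn provide(&mut self, peer: PeerId, id: &BlockId) -> Option<Vec<u8>>;
}

/// Point 2: every step is surfaced as an event, including "block sent".
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    WantReceived { peer: PeerId, id: BlockId },
    BlockSent { peer: PeerId, id: BlockId, bytes: usize },
    BlockNotProvided { peer: PeerId, id: BlockId },
}

/// Point 3: simple counters kept by the engine.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct Stats {
    pub wants_received: u64,
    pub blocks_sent: u64,
    pub bytes_sent: u64,
}

pub struct Engine<P> {
    provider: P,
    stats: Stats,
}

impl<P: BlockProvider> Engine<P> {
    pub fn new(provider: P) -> Self {
        Engine { provider, stats: Stats::default() }
    }

    pub fn stats(&self) -> &Stats {
        &self.stats
    }

    /// Handles one incoming want and returns the events it produced.
    /// Nothing is transmitted in this sketch; a real engine would emit
    /// `BlockSent` only once the write to the peer has completed.
    pub fn on_want(&mut self, peer: PeerId, id: BlockId) -> Vec<Event> {
        self.stats.wants_received += 1;
        let mut events = vec![Event::WantReceived { peer, id: id.clone() }];
        match self.provider.provide(peer, &id) {
            Some(data) => {
                self.stats.blocks_sent += 1;
                self.stats.bytes_sent += data.len() as u64;
                events.push(Event::BlockSent { peer, id, bytes: data.len() });
            }
            None => events.push(Event::BlockNotProvided { peer, id }),
        }
        events
    }
}

/// Example strategy: serve any block found in an in-memory map.
pub struct ServeAll(pub HashMap<BlockId, Vec<u8>>);

impl BlockProvider for ServeAll {
    fn provide(&mut self, _peer: PeerId, id: &BlockId) -> Option<Vec<u8>> {
        self.0.get(id).cloned()
    }
}
```

A different strategy (rate limiting, per-peer allow lists, and so on) would only need another `BlockProvider` impl; the events and stats stay the same.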

Is there an official spec written somewhere? Do we start from scratch or modify one of the existing implementations? Could we not implement GraphSync instead?

Thanks.

mxinden commented 2 years ago

Is there an official spec written somewhere?

Spec: https://github.com/ipfs/specs/blob/master/BITSWAP.md https://github.com/ipfs/specs/pull/269 and https://github.com/ipfs/specs/pull/270

Do we start from scratch or modify one of the existing implementations?

I would have to review the existing implementations before making an informed decision here.

Could we not implement GraphSync instead?

If I am not mistaken, that would force IPLD as a dependency onto libp2p. Not a blocker, though something to keep in mind. Also, if I am not mistaken, Bitswap is significantly simpler and thus a good first protocol for exchanging bytes in libp2p.

//CC @aschmahmann as the Bitswap expert.

aschmahmann commented 2 years ago

Is there an official spec written somewhere?

269 is the best to use at the moment (270 is for a new version of the Bitswap protocol). Sorry it's not merged into master yet 😬.

Could we not implement GraphSync instead?

Basically I agree with Max here. Bitswap is going to be a simpler protocol to build out, and while many Bitswap consumers end up using IPLD on top of it to create and fetch larger groups of content, it's not required as it is with GraphSync. No reason both couldn't be included down the line though.

Some motivating reasons to implement Bitswap first:


Given the 5 Rust implementations of Bitswap, you may want to decide what it is that rust-libp2p wants out of a Bitswap implementation. Is it to unify the existing implementation efforts, to have a different opinion on extensibility, or on how the client and server implementations prioritize who to fetch data from and serve it to, etc.?

As I mentioned on Discord, I suspect it'd be quite useful to have a single codebase that handles the basic mechanics of the protocol for you (e.g. processing the protobuf messages), and then people may have different thoughts on how to orchestrate things like sessions or prioritization. However, perhaps the right abstraction would let people share more code here.

Within the go-libp2p-kad-dht code I did a refactor extracting out the protocol message pieces, which has been helpful in letting people play around with new client implementations (https://github.com/libp2p/go-libp2p-kad-dht/blob/f0569715b6e50119f425f708ab6673b536725139/pb/protocol_messenger.go#L34).
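As a rough illustration of that split (a shared layer that only knows how to form and send messages, with policy living on top), here is a minimal Rust sketch. It is not a port of the Go `protocol_messenger.go`; `Message`, `Sender`, `Messenger`, `broadcast_want` and `Recorder` are all hypothetical names, and the message enum is a simplified stand-in for the real protobuf messages.

```rust
/// Simplified, hypothetical wire messages; real Bitswap encodes these as protobuf.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Message {
    Want { id: Vec<u8>, priority: i32 },
    Cancel { id: Vec<u8> },
}

/// Transport abstraction: delivers one message to one peer.
pub trait Sender {
    fn send(&mut self, peer: u64, msg: Message);
}

/// Shared layer: knows how to build and send messages, holds no policy.
pub struct Messenger<S> {
    sender: S,
}

impl<S: Sender> Messenger<S> {
    pub fn new(sender: S) -> Self {
        Messenger { sender }
    }

    pub fn want(&mut self, peer: u64, id: &[u8], priority: i32) {
        self.sender.send(peer, Message::Want { id: id.to_vec(), priority });
    }

    pub fn cancel(&mut self, peer: u64, id: &[u8]) {
        self.sender.send(peer, Message::Cancel { id: id.to_vec() });
    }

    /// Gives the underlying sender back, e.g. to inspect it in tests.
    pub fn into_inner(self) -> S {
        self.sender
    }
}

/// One possible client policy built on top: ask every known peer
/// for the block with the same priority.
pub fn broadcast_want<S: Sender>(messenger: &mut Messenger<S>, peers: &[u64], id: &[u8]) {
    for &peer in peers {
        messenger.want(peer, id, 1);
    }
}

/// Test double that records what would have been sent.
#[derive(Default)]
pub struct Recorder(pub Vec<(u64, Message)>);

impl Sender for Recorder {
    fn send(&mut self, peer: u64, msg: Message) {
        self.0.push((peer, msg));
    }
}
```

Someone with different ideas about sessions or prioritization would replace `broadcast_want` and keep `Messenger` as is.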

SionoiS commented 2 years ago

I read the specs.

We are looking at supporting all versions of bitswap, not just v1.3.0, right?

Also, is priority explained anywhere?

Of the 5, it seems that the rs-ipfs/rust-ipfs impl. is the most generic, although I don't think having a list of wants and blocks is appropriate. Tracking who wants what and what to send where should be left to the user IMO.

carsonfarmer commented 2 years ago

Oh cool to see this discussion kicked off from the Discord chat! I think from my perspective, "a single codebase for handling the basic mechanics of the protocol for you" is exactly what we would want out of having this protocol. An abstraction that allows a custom client to easily decide what to send to whom, in a way similar to how the kad protocol exposes this for responding to provider queries, would be quite nice. I think the iroh implementation from @dignifiedquire is already leaning in this direction.

mxinden commented 2 years ago

@SionoiS you raised in the last community call that you would be interested in working on this, supported through a grant :tada:. What would be the timeline on your end? When would you have capacity to work on this?

SionoiS commented 2 years ago

@mxinden As soon as my current grant work is done. I'm already ahead of schedule but I'm not sure how long building the website is going to take. I will start at the end of this month at the latest, and I planned for 4 months. So I guess ~ November.

OR if it's ok with PL I could work on both at the same time but it's going to take longer.

In the meantime, it would be cool to gather what people want to see and don't want for Bitswap. If anyone has thoughts please share!

mxinden commented 2 years ago

OR if it's ok with PL I could work on both at the same time but it's going to take longer.

I would prefer doing the two in sequence instead of doing them in parallel. Also gives me more time to figure out the administrative side of things.

SionoiS commented 1 year ago

Ready to start working on this! Looking at the diff between the iroh and rust-ipfs implementations of bitswap, where do we want the libp2p implementation on a rust-ipfs <-> iroh spectrum?

Seems like people don't like bitswap and are building new things. If things are changing, it might be good to keep the scope of this project small.

That's the plan for now, let me know what you think.

thomaseizinger commented 1 year ago

My 2c are:

We definitely want a custom ConnectionHandler. Those run on a separate task which means we don't as easily block the main event loop of network behaviours.
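To illustrate only the general idea (per-connection work running apart from the main loop and talking to it via messages), here is a small sketch using plain OS threads and `std::sync::mpsc` as a stand-in for async tasks. It does not use the rust-libp2p `ConnectionHandler` API at all; `Command` and `spawn_handler` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical work items sent from the main loop to one connection's worker.
pub enum Command {
    Decode(Vec<u8>),
    Shutdown,
}

/// Spawns a worker standing in for a per-connection handler.
/// Returns the command sender, the result receiver and the join handle.
pub fn spawn_handler() -> (mpsc::Sender<Command>, mpsc::Receiver<usize>, thread::JoinHandle<()>) {
    let (cmd_tx, cmd_rx) = mpsc::channel();
    let (out_tx, out_rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // The loop ends on `Shutdown` or when the command sender is dropped.
        for cmd in cmd_rx {
            match cmd {
                Command::Decode(bytes) => {
                    // Stand-in for expensive per-connection work (decoding, hashing, ...).
                    let checksum: usize = bytes.iter().map(|b| *b as usize).sum();
                    if out_tx.send(checksum).is_err() {
                        break; // The main loop went away.
                    }
                }
                Command::Shutdown => break,
            }
        }
    });
    (cmd_tx, out_rx, handle)
}
```

The main loop only pays for a channel send and a later receive; the heavy part happens on the worker.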

SionoiS commented 1 year ago

We definitely want a custom ConnectionHandler. Those run on a separate task which means we don't as easily block the main event loop of network behaviours.

So to get decent perf. I need to impl one, OK, makes sense!

Is perf the only reason, or are there things you can't do without a ConnectionHandler impl?

SionoiS commented 1 year ago

OK, I'll recap the meeting; let me know if I got anything wrong.

Bitswap is a flawed protocol, no amount of engineering can fix it. A working group has been assembled to build a new protocol.

A test plan could be made for go-bitswap <-> rust-bitswap but what would that accomplish?

A naive spec-compliant rust-bitswap would not interop well with go-bitswap because of its idiosyncrasies.

A rust-libp2p bitswap implementation that is compatible with go-bitswap would require high maintenance.

None of the options seem adequate...

@mxinden @b5

notes: https://www.notion.so/Rust-Bitswap-Implementation-s-60e4114cdac243ba9a78875a65511134

b5 commented 1 year ago

Bitswap is a flawed protocol, no amount of engineering can fix it. A working group has been assembled to build a new protocol.

dang, when you spell it out like that, sounds rough 😅. I would add the context that this is one team's view after having written an implementation, but I'm on that team, and do hold this view.

A test plan could be made for go-bitswap <-> rust-bitswap but what would that accomplish?

it would accomplish at least one thing: It would test our team's assumptions & give a repeatable-ish context for analyzing failures.

A naive spec-compliant rust-bitswap would not interop well with go-bitswap because of its idiosyncrasies.

fully agreed.

A rust-libp2p bitswap implementation that is compatible with go-bitswap would require high maintenance.

fully agreed.

I think the best thing for our community to do is move on, write a new data transfer protocol that has multiple implementations & a proper spec from the start. If folks need go-bitswap interoperability, I'd encourage them to use & contribute to iroh's implementation. We need to maintain this in some capacity for the foreseeable future.

dvc94ch commented 1 year ago

dang, when you spell it out like that, sounds rough 😅. I would add the context that this is one team's view after having written an implementation, but I'm on that team, and do hold this view.

everyone including PL themselves hold the view. back in 2019 I pointed out that the "game theory" that supposedly makes bitswap work is complete garbage [0]. PL like always just ignored/covered up what I expect they already knew, as their efforts to prove anything about bitswap strategies died.

the later version "fixed" by netflix isn't too bad. it turns bitswap into a proper request response protocol instead of shooting packets into the void and hoping for the best.
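a rough sketch of what "proper request response" means here: every request gets exactly one explicit answer, including a negative one, so the requester never has to guess whether a message was lost. The names (`Request`, `Response`, `respond`) are illustrative assumptions and are not copied from the spec or from any implementation.

```rust
use std::collections::HashMap;

/// Hypothetical request kinds: ask whether a peer has a block, or ask for the block itself.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Request {
    WantHave(Vec<u8>),
    WantBlock(Vec<u8>),
}

/// Hypothetical responses; note the explicit negative answer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Response {
    Have(Vec<u8>),
    DontHave(Vec<u8>),
    Block { id: Vec<u8>, data: Vec<u8> },
}

/// Answers a single request against an in-memory store.
/// Every request maps to exactly one response.
pub fn respond(store: &HashMap<Vec<u8>, Vec<u8>>, request: Request) -> Response {
    match request {
        Request::WantHave(id) => {
            if store.contains_key(&id) {
                Response::Have(id)
            } else {
                Response::DontHave(id)
            }
        }
        Request::WantBlock(id) => match store.get(&id) {
            Some(data) => Response::Block { data: data.clone(), id },
            None => Response::DontHave(id),
        },
    }
}
```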

libp2p-bitswap is pretty decent. does anyone have any specific complaints about it? other than maybe its maintenance status. after having authored two ipfs implementations I have come to believe that the entire ipfs thing is a misguided idea. I did some experiments with blake3 verified streaming about a year ago [1]. and now I'm working at a company that wants to use something ipfs like, so I finally started writing a usable ipfs alternative (needs to be a feature complete offchain storage solution in 2w) [2].

dvc94ch commented 1 year ago

There are two core things that cause ipfs to become very complicated and cause a lot of needless complexity:

The basic idea in blake-tree is to use a merkle tree to avoid both of these and everything that derives from these decisions. I believe the design is similar to hypercore or bittorrent. It's essentially a p2p blob store with support for storing and authenticating partial blobs. If you need to store a directory, just tar it and if you try to extract a file from the tar it will only download the tar chunks it needs to do so. We get this for free, no need to design a unixfs or similar crap.
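For readers unfamiliar with the general technique, here is a generic sketch of authenticating a single chunk of a blob against a Merkle root. It is not the blake-tree code or format. It also uses std's `DefaultHasher` purely as a stand-in for BLAKE3, which is NOT cryptographically secure, and all function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// NOT cryptographic: `DefaultHasher` stands in for BLAKE3 in this sketch.
/// The tag byte keeps leaf hashes and inner-node hashes in separate domains.
fn hash_bytes(tag: u8, data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write_u8(tag);
    hasher.write(data);
    hasher.finish()
}

fn hash_leaf(chunk: &[u8]) -> u64 {
    hash_bytes(0, chunk)
}

fn hash_pair(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_le_bytes());
    buf[8..].copy_from_slice(&right.to_le_bytes());
    hash_bytes(1, &buf)
}

/// Builds all tree levels for a non-empty list of chunks.
/// Level 0 holds the leaf hashes, the last level holds only the root.
/// A node without a sibling is promoted to the next level unchanged.
pub fn build_levels(chunks: &[&[u8]]) -> Vec<Vec<u64>> {
    let mut levels = vec![chunks.iter().map(|c| hash_leaf(c)).collect::<Vec<u64>>()];
    while levels.last().unwrap().len() > 1 {
        let next: Vec<u64> = levels
            .last()
            .unwrap()
            .chunks(2)
            .map(|pair| if pair.len() == 2 { hash_pair(pair[0], pair[1]) } else { pair[0] })
            .collect();
        levels.push(next);
    }
    levels
}

/// Collects the sibling hashes from leaf `index` up to the root.
/// The flag is true when the sibling sits to the right of the current node.
pub fn prove(levels: &[Vec<u64>], mut index: usize) -> Vec<(u64, bool)> {
    let mut proof = Vec::new();
    for level in &levels[..levels.len() - 1] {
        let sibling = index ^ 1;
        if sibling < level.len() {
            proof.push((level[sibling], sibling > index));
        }
        index /= 2;
    }
    proof
}

/// Recomputes the root from one chunk plus its proof and compares it.
pub fn verify(root: u64, chunk: &[u8], proof: &[(u64, bool)]) -> bool {
    let mut acc = hash_leaf(chunk);
    for &(sibling, sibling_is_right) in proof {
        acc = if sibling_is_right { hash_pair(acc, sibling) } else { hash_pair(sibling, acc) };
    }
    acc == root
}
```

With this, a downloader that knows only the root can check any single chunk as it arrives, which is what makes partial downloads (like pulling a few chunks out of a tar) verifiable.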

dvc94ch commented 1 year ago

so my project has been cancelled for the time being. in a week I built the core data structures and algorithms, an http interface supporting range queries for video streaming via vlc and a fuse file system to support extracting files from tar archives without having to download the entire archive. still missing is an efficient update system based on zstd for delivering deltas between artifacts [0] and porting the p2p networking from blake-streams/ipfs-embed. If anyone is interested in tackling offchain storage for blockchain, or filesharing like bittorrent, or has just had enough of ipfs, I'd recommend at least taking a look at it. While I wrote it in a week, I previously wasted over two years of my life on ipfs. [1]