libp2p / js-libp2p

The JavaScript implementation of the libp2p networking stack.
https://libp2p.io

Transfer times doubled between js-libp2p@0.36.2 and js-libp2p@0.37.3 #1342

Closed · alxroyer closed this issue 1 year ago

alxroyer commented 2 years ago

Severity:

Medium (performance issue)

Description:

According to my experiments, transfer times roughly doubled between js-libp2p@0.36.2 and js-libp2p@0.37.3.

If my analysis is correct:

Steps to reproduce the error:

See the following test scripts (one for each version of js-libp2p):

Those scripts make it possible to launch 2 nodes:

Timestamps are printed out when the sender starts sending the data and when the receiver has finished receiving all of it (the approach is sketched below).
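For reference, a minimal sketch of that measurement approach (not the actual test scripts; the protocol id and chunking are made up for illustration, and the dial API differs slightly between 0.36.x and 0.37.x+):

```js
// Rough sketch, assuming two already-started libp2p nodes.
// '/perf-test/1.0.0' is a hypothetical protocol id.
import { pipe } from 'it-pipe'

const PROTOCOL = '/perf-test/1.0.0'

// Receiver: drain the stream and log a timestamp once everything has arrived.
await receiver.handle(PROTOCOL, async ({ stream }) => {
  let received = 0
  for await (const chunk of stream.source) {
    received += chunk.length
  }
  console.log(`received ${received} bytes at ${Date.now()}`)
})

// Sender: log a timestamp, then push `total` bytes in `chunkSize` pieces.
async function send (sender, receiverAddr, total, chunkSize) {
  const stream = await sender.dialProtocol(receiverAddr, PROTOCOL) // 0.37.x+ return shape
  console.log(`sending started at ${Date.now()}`)
  await pipe(
    (function * () {
      for (let sent = 0; sent < total; sent += chunkSize) {
        yield new Uint8Array(chunkSize)
      }
    })(),
    stream
  )
}
```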

With a bit of scripting around it, I made up the following graphs:

[Graph: p2p901-2022-08-08a]

Note: Chunks of 100 bytes are not representative inasmuch as they cause a lot of overhead around the useful data actually transferred (see #1343).

[Graph: p2p901-2022-08-08b]

[Graph: p2p901-2022-08-08c]

As the graphs show, js-libp2p@0.37.3 transfer times are about twice those of js-libp2p@0.36.2 for chunk sizes of 1 kB and above.

alxroyer commented 2 years ago

Possibly related to #1303, but not sure #1303 explains all the performance problems.

wemeetagain commented 2 years ago

@Alexis-ROYER I haven't looked at your case specifically, so I can't say with 100% certainty that the problem is #1303, but very likely most of it is. The other notable regression is in the protobuf encoders/decoders.

We're actively working on resolving the performance issues, and most of the work has landed in our nightly version. This is our current focus (along with stabilizing the types) before pushing out a v0.38.0.

mpetrunic commented 2 years ago

@Alexis-ROYER Could you try with the latest 0.38 version?

achingbrain commented 2 years ago

Closing as the issue should be resolved - please re-open if you continue to observe it.

alxroyer commented 2 years ago

I've updated my measurements with js-libp2p@0.38.0 and js-libp2p@0.39.0:

[Graphs: p2p901-2022-09-09b, p2p901-2022-09-09c]

Transfer times are still about the same as with js-libp2p@0.37.3, i.e. higher than with js-libp2p@0.36.2.

The following test script works for both js-libp2p@0.38.0 and js-libp2p@0.39.0: libp2p-node.0.38.x.mjs

alxroyer commented 2 years ago

@achingbrain Could you please re-open the issue? I can't on my side.

achingbrain commented 2 years ago

I've put a test bed together here: https://github.com/ipfs-shipyard/js-libp2p-transfer-performance

It takes a slightly different approach from the two files in the OP - notably it doesn't configure a DHT, since it's not needed, and it uses the plaintext connection encrypter instead of noise, as it's simpler.

It transfers 100MiB of data between the two nodes using increasing chunk sizes, recording how long it takes.

Since the thing doing all the work here is the stream multiplexer, it tests 0.36.x, 0.37.x, 0.38.x and 0.39.x with mplex and also 0.39.x with the brand-new yamux multiplexer. 0.38.x and 0.39.x use the same version of @libp2p/mplex so you'd expect their speeds to be comparable.
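For illustration, a node configured along those lines might look something like the following. This uses the modern factory-style configuration; import paths and option names differed somewhat in the 0.38.x/0.39.x releases, so treat it as a sketch rather than the test bed's actual code:

```js
// Sketch of a node with TCP, plaintext encryption and mplex, and no DHT,
// mirroring the test bed's choices (package names/options are assumptions
// based on the current module layout).
import { createLibp2p } from 'libp2p'
import { tcp } from '@libp2p/tcp'
import { mplex } from '@libp2p/mplex'
import { plaintext } from '@libp2p/plaintext'

const node = await createLibp2p({
  addresses: { listen: ['/ip4/127.0.0.1/tcp/0'] },
  transports: [tcp()],
  connectionEncryption: [plaintext()],
  streamMuxers: [mplex()]
})
```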

What I've found largely tallies with @Alexis-ROYER's investigation:

[Graph: transfer times per release and multiplexer across message sizes]

In the graph above I've ignored message sizes below 256b as I don't think many protocols would send messages so small but the values follow the general pattern.

0.36.x is definitely faster with small message sizes but the difference gets less pronounced as messages increase in size - performance starts to converge at message sizes around 32kb.

We should look at the performance overhead for messages smaller than this to try to bring the gap down between the blue and green lines in the graph above.

Other takeaways are that if you're using 0.37.x you should upgrade to 0.38.x or later and yamux can probably be optimised somewhat (not a criticism, it's still very new).

mpetrunic commented 2 years ago

I will run some profiling tools today to see what the hot paths are.

BigLep commented 2 years ago

2022-09-13 conversation: @mpetrunic has done some profiling, but the libp2p dependencies aren't showing up in clinic.

@Alexis-ROYER: thanks for reporting this, as clearly there is an issue. We're curious how you noticed this and whether it is having an impact on any applications/uses you have with js-libp2p. This will help us gauge where to prioritize this work. And certainly any help digging into the issue is welcome.

alxroyer commented 2 years ago

@BigLep My pleasure.

I was actually investigating performance issues in an app I'm working on. Among other things, I'm tunnelling other protocols inside a libp2p connection between two peers. Since it's a tunnelling process, the chunks of data are those of the embedded protocols, which are about 5-10 kB.

With some big data to transfer, things would get stuck after a while with libp2p@0.36.2. When I upgraded to libp2p@0.37.3, things seemed to get more stable, but still slow. Taking a divide-and-conquer approach, I started by qualifying the transfer rate capabilities of libp2p outside of my own code, which led me to report this issue.

github-actions[bot] commented 2 years ago

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.

BigLep commented 2 years ago

I removed the relevant labels so this doesn't get auto closed.

achingbrain commented 2 years ago

What I have found is that between 0.36.x and 0.37.x we have started sending more chunks of data than before, and the chunks are a lot smaller.

In mplex we yield every encoded Uint8Array - in the current release that's one for the header and one (or more) for the data. The theory was that if you pump data into the socket opened by @libp2p/tcp as quickly as possible, the OS will figure out the best way to send it. It turns out it's not quite that clever.

I have refactored this code locally to only ever yield one Uint8Array and the time taken to send 105MB of data in 256b chunks decreases from about 5.5s to about 3s, which is good but 0.36.x is just under 2s so there's still a way to go.
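A simplified illustration of the difference (not the actual mplex encoder code):

```js
// Illustrative only: yielding the header and payload separately results in
// two small writes to the socket per message, while concatenating them first
// produces a single buffer per message.
import { concat } from 'uint8arrays/concat'

// What 0.37.x-era mplex effectively does: one yield for the header,
// one (or more) for the data
function * encodeSeparately (header, data) {
  yield header
  yield data
}

// The local refactor described above: a single Uint8Array per message
function * encodeConcatenated (header, data) {
  yield concat([header, data], header.byteLength + data.byteLength)
}
```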

Digging further, 0.36.x sends chunks of data that are ~5xx bytes in one go - this is because the array of messages that are encoded always has 2x members so they are encoded and transmitted at once.

0.37.x+ only ever has 1x member in the same array at the same point - I have a local refactor that concatenates all the buffers together in the same way as 0.36.x but it doesn't make much difference because there's still only ever one message.

The messages are added to this pushableV but only ever one at a time.

I've yet to ascertain why the older version yields 2x messages in one go - in theory the pushable will buffer input if it's not being consumed quickly enough. Indeed, adding an artificial delay to the loop in the newer code can cause multiple messages to be yielded, which does increase performance a bit more, but it's not much of a solution.
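A small standalone sketch of that batching behaviour of pushableV (from it-pushable), outside of mplex:

```js
// pushableV yields *arrays* of values: anything pushed while the consumer is
// busy is buffered and handed over as one batch on the next read.
import { pushableV } from 'it-pushable'

const source = pushableV()

// Producer: push a message every millisecond, then end the source
let pushed = 0
const timer = setInterval(() => {
  source.push(new Uint8Array(256))
  if (++pushed === 10) {
    clearInterval(timer)
    source.end()
  }
}, 1)

// Consumer: the artificial delay lets several pushes accumulate, so batches
// grow beyond a single message - remove it and batch.length stays at 1
for await (const batch of source) {
  await new Promise(resolve => setTimeout(resolve, 5))
  console.log(`yielded ${batch.length} message(s) in one batch`)
}
```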

Once I figure out why the messages don't stack in recent releases the same way they do in 0.36.x it opens up more possibilities - what if we send more than 2x messages at once, could it be even faster?

I need to stop now but I'll keep digging on Monday.

achingbrain commented 1 year ago

Good news: with #1491 and libp2p/js-libp2p-mplex#233 both applied, streaming performance is fixed. Not only that, it's now faster than libp2p@0.36.x with smaller message sizes for both mplex and yamux. This will all be in the next release.

[Graph: transfer times with the fixes applied]