libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub

question: How to optimize the transmission efficiency of pubsub? #527

Open JacksonRGB opened 1 year ago

JacksonRGB commented 1 year ago

I am currently using pubsub with a gRPC proxy, and the maximum throughput I measured within a single AWS availability zone is 800 TPS. When continuously sending 20 KB messages, the bandwidth is about 16 MB/s; with 40 KB messages, it is about 32 MB/s. CPU usage always stays below 200% on a 4-core CPU.

The configuration I am using is:

pubsub.WithMessageSignaturePolicy(pubsub.StrictNoSign)
pubsub.WithNoAuthor()
pubsub.WithMessageIdFn(msgID)

Found the reason: the custom msgID function was consuming too much time.

This is my function:

func msgID(pmsg *pubsubpb.Message) string {
    h := sha256.Sum256(pmsg.Data)
    return fmt.Sprintf("%x", h[:20])
}

If I use the following function instead, I get a test result of 72 MB/s (3600 TPS):

func msgID(pmsg *pubsubpb.Message) string {
    return fmt.Sprintf("%x", rand.Int63())
}

I have tried several hash functions, such as xxHash. In the benchmark, xxHash was about 30 times faster than SHA-256, but in the actual test TPS remained at 800.

BenchmarkMsgID/msgIDSha256-8            22498             53323 ns/op
BenchmarkMsgID/msgIDRandom-8          9200742               128.6 ns/op
BenchmarkMsgID/msgIDxxHASH-8           727844              1650 ns/op
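
For anyone who wants to reproduce figures like these, here is a minimal sketch of a harness. It assumes a local `message` struct as a stand-in for `pubsubpb.Message` and a 20 KB payload as in the test above; `benchNsPerOp` is a hypothetical helper name. Only the SHA-256 and random variants are included, because xxHash needs a third-party package.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/rand"
	"testing"
)

// message is a hypothetical stand-in for pubsubpb.Message; only Data is used.
type message struct {
	Data []byte
}

// msgIDSha256 mirrors the SHA-256 variant from the post.
func msgIDSha256(m *message) string {
	h := sha256.Sum256(m.Data)
	return fmt.Sprintf("%x", h[:20])
}

// msgIDRandom mirrors the random variant; it never reads the payload.
func msgIDRandom(m *message) string {
	return fmt.Sprintf("%x", rand.Int63())
}

// benchNsPerOp runs fn under the standard benchmark driver and
// returns the measured nanoseconds per call.
func benchNsPerOp(fn func(*message) string, m *message) int64 {
	testing.Init() // registers testing flags; a no-op if already called
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = fn(m)
		}
	})
	return res.NsPerOp()
}

func main() {
	m := &message{Data: make([]byte, 20*1024)} // 20 KB payload, as in the test
	fmt.Println("sha256 ns/op:", benchNsPerOp(msgIDSha256, m))
	fmt.Println("random ns/op:", benchNsPerOp(msgIDRandom, m))
}
```

This measures only the ID function in isolation, so it cannot show effects that appear once the function runs inside the pubsub pipeline.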

Update:

If the msgID function reads the message's Data at all, it becomes very slow (800 TPS):

func msgIDxxHash(pmsg *pubsubpb.Message) string {
        // h := xxhash.Sum64(pmsg.Data[:100])
        return fmt.Sprintf("%x", pmsg.Data[:100])
}
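
One caveat with the function above: `pmsg.Data[:100]` panics when a payload is shorter than 100 bytes. A minimal sketch of a bounds-checked variant follows, assuming a local `message` struct in place of `pubsubpb.Message`; `msgIDPrefix` and `prefixLen` are hypothetical names. A prefix is only usable as an ID if the first 100 bytes differ between distinct messages, which is an assumption about the payload format.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// message is a hypothetical stand-in for pubsubpb.Message; only Data is used.
type message struct {
	Data []byte
}

// prefixLen is the maximum number of payload bytes used for the ID.
const prefixLen = 100

// msgIDPrefix hex-encodes at most the first prefixLen bytes of the payload,
// so short payloads do not cause an out-of-range slice panic.
func msgIDPrefix(m *message) string {
	n := len(m.Data)
	if n > prefixLen {
		n = prefixLen
	}
	return hex.EncodeToString(m.Data[:n])
}

func main() {
	short := &message{Data: []byte("hi")}
	long := &message{Data: make([]byte, 200)}

	fmt.Println(msgIDPrefix(short))     // prints 6869
	fmt.Println(len(msgIDPrefix(long))) // prints 200 (100 bytes, 2 hex chars each)
}
```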

lthibault commented 1 year ago

@minchenzz Can you post pprof/trace data?

My immediate question for you is whether you are using Ed25519 for signature verification. In my experience, that's a quick and easy win for performance.

JacksonRGB commented 1 year ago

@lthibault Thanks for your help! I am not using Ed25519. I want to remove duplicates by using msgID.
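
To illustrate what deduplicating by msgID means, here is a minimal sketch. `seenCache` is a hypothetical, heavily simplified stand-in for a seen-message cache, not go-libp2p-pubsub's actual implementation. A content-derived ID maps equal payloads to the same key, so the second delivery is recognised as a duplicate; a random ID would never repeat, so nothing would be dropped.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// seenCache is a hypothetical, simplified record of message IDs already seen.
type seenCache struct {
	ids map[string]struct{}
}

func newSeenCache() *seenCache {
	return &seenCache{ids: make(map[string]struct{})}
}

// firstTime records id and reports whether it had not been seen before.
func (c *seenCache) firstTime(id string) bool {
	if _, ok := c.ids[id]; ok {
		return false
	}
	c.ids[id] = struct{}{}
	return true
}

// contentID derives the ID from the payload, so equal payloads share an ID.
func contentID(data []byte) string {
	h := sha256.Sum256(data)
	return string(h[:20]) // raw bytes; a map key does not need hex formatting
}

func main() {
	cache := newSeenCache()
	payload := []byte("same payload")

	fmt.Println(cache.firstTime(contentID(payload))) // prints true: first delivery
	fmt.Println(cache.firstTime(contentID(payload))) // prints false: duplicate
}
```

One consequence of this design is that a load test which publishes identical payloads repeatedly would produce identical content-derived IDs.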


sha256ID
pprof/trace: trace-sha256.png
pprof/profile: profile_sha256.png

xxhashID
pprof/trace: xxhash-trace.png
pprof/profile: profile_xxhash.png

randomID
pprof/trace: trace-random.png
pprof/profile: profile_random.png

lthibault commented 1 year ago

I don't see anything obvious there. You might also consider running a CPU and memory allocation profile to see if there's anything there.