libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
323 stars 187 forks source link

Publish call high latency #494

Open aratz-lasa opened 2 years ago

aratz-lasa commented 2 years ago

Hello,

The Publish call, in my application, is taking in average 5ms to complete. Moreover, it makes use of an internal lock. Therefore, if you execute Publish N times in parallel, it takes 5ms+N to complete all of them.

Taking this into account, I have a question: What is the lock used for, apart for accessing the t.closed bool? Is there a way to get rid of it? Right now the Publish call it is a huge bottleneck.

vyzo commented 2 years ago

That's unfortunate.

Are you using the topic publish method or the pubsub level Publish method?

Can you profile this? Also, can you check with 0.6/0.7? Lets see if there is a regression.

aratz-lasa commented 2 years ago

First of all, thank you for your fast response!

I have profiled it and the most time takes it at signing messages. If I disable signing it reduces from 5ms to 0.3ms. So I guess it's not bad. However, I do want to sign the messages. Moreover, there is no really a regression from 0.6 to 0.7.

However, I would like to know, what is the lock needed for?

vyzo commented 2 years ago

Are you using RSA keys for some reason?

Can you try with ed25529 keys? It should be considerably faster.

vyzo commented 2 years ago

The lock is used to prevent data races, maybe we can get rid of it indeed.

aratz-lasa commented 2 years ago

I checked and we do use RSA, because when you create a new Host in libp2p, if you don't specify the Identity option it fallbacks into RSA. So I will change that, thanks!

On the other hand, what is the data race that we prevent using the lock? I checked and I could not find any, apart from the closed boolean, which can be replaced by a uber atomic.bool.

vyzo commented 2 years ago

Or a CAS with sync/atomic; care for a patch?

aratz-lasa commented 2 years ago

Okay, patch done. I used the uber package since internally makes use of uint32 and the sync.atomic operations