libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
309 stars 178 forks source link

panic: putting same entry twice not supported #521

Closed AskAlexSharov closed 1 year ago

AskAlexSharov commented 1 year ago
Screenshot 2023-02-20 at 13 57 54
vyzo commented 1 year ago

Ouch, seems our newly incorporated cache has a bug.

Wondertan commented 1 year ago

Our nodes are crashing because of this, and we need this fixed ASAP. If no one is working on this, we can take it,

vyzo commented 1 year ago

sure, go ahead.

Correct behaviour is to ignore the message.

Wondertan commented 1 year ago

The Has on cache is checked here with correct synchronization. However, false from Has can be returned even if the message already in the cache, because the message is outdated(time since first seen > TTL)

Wondertan commented 1 year ago

We can either:

Wondertan commented 1 year ago

I am fine with both, but my intuition inclines toward the second option. WDYT, @vyzo

vyzo commented 1 year ago

just ignore the message i would say, panic is very bad.

vyzo commented 1 year ago

dont do the sweep in Has, will need xlock.

vyzo commented 1 year ago

Also, we need to implement background sweeping at some point, for both cache impls, as this eager sweep business has implications for locking.

Wondertan commented 1 year ago

There are many places where processLoop is locked unnecessarily :) Network.Connectedness can look it for a while

Wondertan commented 1 year ago

dont do the sweep in Has, will need xlock.

I know, but it seemed fine. Though background sweeping is a better approach so, let's just remove panic for now, as you've said

vyzo commented 1 year ago

yeah, although we try to make it fast.

But point taken, ok, lets fix Has.

But please do remove the panic, it needs to ignore the message (and maybe log something in debug).

vyzo commented 1 year ago

We are racing ;)

To summarize: Lets fix Has while at it. If you feel so inclined, feel free to do bg sweeping.

Other than that, the panic must go.

vyzo commented 1 year ago

More generally, if we are using bg sweeping, we can greatly simplify everything.

We dont need queues, just a map of mid to expiry. Has shouldnt even check time, just map presence. And bg sweeping just clears everything expired.

We can and should do for both implementations.

Wondertan commented 1 year ago

Ok, bg sweeping can be done in a separate PR. Let's remove the panic as a start: https://github.com/libp2p/go-libp2p-pubsub/pull/522.

Wondertan commented 1 year ago

@vyzo, mind releasing a patch?

vyzo commented 1 year ago

We certainly can, but i was thinking of doing bg sweeping first (unless you want to do it).

Wondertan commented 1 year ago

Fair, no need to keep that tech debt around.