libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub

encounter concurrent map writes issue #471

Closed: wanmingchao001 closed this issue 2 years ago

wanmingchao001 commented 2 years ago

In GossipSubRouter there are many maps that are not thread-safe, such as peers, direct, etc.

type GossipSubRouter struct {
    p        *PubSub
    peers    map[peer.ID]protocol.ID          // peer protocols
    direct   map[peer.ID]struct{}             // direct peers
    mesh     map[string]map[peer.ID]struct{}  // topic meshes
    // ...
}

I forked this project (v0.3.5), and it panics randomly with a concurrent map writes error. Please see the details below.

fatal error: concurrent map writes

goroutine 602 [running]:
runtime.throw(0x262c7fd, 0x15)
    /usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc01db9b5c0 sp=0xc01db9b590 pc=0x4ff272
runtime.mapdelete_faststr(0x22355a0, 0xc00f8b1410, 0xc00fb7e5a0, 0x22)
    /usr/local/go/src/runtime/map_faststr.go:377 +0x34c fp=0xc01db9b628 sp=0xc01db9b5c0 pc=0x4dc06c
chainmaker.org/chainmaker/libp2p-pubsub.(*GossipSubRouter).RemovePeer(0xc00f97e000, 0xc00fb7e5a0, 0x22)
    /root/gocode/pkg/mod/chainmaker.org/chainmaker/libp2p-pubsub@v1.0.0/gossipsub.go:483 +0x138 fp=0xc01db9b700 sp=0xc01db9b628 pc=0x16d1e18
chainmaker.org/chainmaker/libp2p-pubsub.(*PubSub).processLoop(0xc0002da1a0, 0x2a19a00, 0xc000044118)
    /root/gocode/pkg/mod/chainmaker.org/chainmaker/libp2p-pubsub@v1.0.0/pubsub.go:615 +0x1759 fp=0xc01db9bfc8 sp=0xc01db9b700 pc=0x16e4639
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc01db9bfd0 sp=0xc01db9bfc8 pc=0x538661
created by chainmaker.org/chainmaker/libp2p-pubsub.NewPubSub
    /root/gocode/pkg/mod/chainmaker.org/chainmaker/libp2p-pubsub@v1.0.0/pubsub.go:296 +0xc09

It seems that while one goroutine is removing a peer from the peers map, another goroutine is accessing the same map.

Does anyone see the same problem?

vyzo commented 2 years ago

Are you invoking methods in the router directly? You should never do that; the maps are only accessed in the event loop thread.
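To illustrate what "only accessed in the event loop thread" means in practice, here is a minimal sketch of the confinement pattern. It is not the library's actual code: the `router` type, its `cmds` channel, and the `do` helper are hypothetical names made up for this example. One goroutine owns the map, and every other goroutine asks it to make changes by sending a closure over a channel instead of touching the map directly.

```go
package main

import (
	"fmt"
	"sync"
)

// router is a hypothetical stand-in for a component whose map is not
// thread-safe. Only the goroutine running loop() may touch peers.
type router struct {
	peers map[string]struct{}
	cmds  chan func()
	done  chan struct{}
}

func newRouter() *router {
	r := &router{
		peers: make(map[string]struct{}),
		cmds:  make(chan func()),
		done:  make(chan struct{}),
	}
	go r.loop()
	return r
}

// loop is the single goroutine that owns r.peers.
func (r *router) loop() {
	for {
		select {
		case f := <-r.cmds:
			f()
		case <-r.done:
			return
		}
	}
}

// do runs f on the event loop and waits until it has finished.
// It must not be called after Close (it would block forever).
func (r *router) do(f func()) {
	finished := make(chan struct{})
	r.cmds <- func() {
		f()
		close(finished)
	}
	<-finished
}

// The exported methods are safe to call from any goroutine because
// they never touch the map themselves.
func (r *router) AddPeer(id string)    { r.do(func() { r.peers[id] = struct{}{} }) }
func (r *router) RemovePeer(id string) { r.do(func() { delete(r.peers, id) }) }

func (r *router) PeerCount() int {
	var n int
	r.do(func() { n = len(r.peers) })
	return n
}

func (r *router) Close() { close(r.done) }

func main() {
	r := newRouter()
	defer r.Close()

	// Many goroutines add and remove peers concurrently; all map
	// operations are serialized through the event loop.
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			id := fmt.Sprintf("peer-%d", i)
			r.AddPeer(id)
			r.RemovePeer(id)
		}(i)
	}
	wg.Wait()
	fmt.Println("peers left:", r.PeerCount()) // prints "peers left: 0"
}
```

Calling a method that mutates the map from outside the owning goroutine bypasses this serialization, which is how a "concurrent map writes" fatal error can arise. Running a program with `go run -race` can help locate such accesses.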

wanmingchao001 commented 2 years ago

@vyzo I got it. I was invoking the RemovePeer() method directly from another goroutine, which caused this issue. Thanks.