cloudwego / kitex

Go RPC framework with high-performance and strong-extensibility for building micro-services.
https://www.cloudwego.io
Apache License 2.0

kitex v0.5.x server mux netpoll panic #913

Open succulentxb opened 1 year ago

succulentxb commented 1 year ago

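With Kitex v0.5.1 and multiplexing enabled, the server occasionally hits the panic below. Has anyone run into a similar issue? Mux transport is enabled on both the client and the server, and the client connection count is configured as connection=1.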

GOPOOL: panic in pool: gopool.DefaultPool: runtime error: invalid memory address or nil pointer dereference:
goroutine 1974019 [running]:
panic({0x3337640, 0x5a0e880})
    /usr/local/go/src/runtime/panic.go:838 +0x207
github.com/cloudwego/netpoll.(*LinkBuffer).WriteBuffer(0xc0016fc1e0?, 0xc026839e28?)
    /builds/bdm/bds/quote/.mod_cache/github.com/cloudwego/netpoll@v0.3.2/nocopy_linkbuffer.go:407 +0x33
github.com/cloudwego/netpoll.(*LinkBuffer).Append(0xc026839e18?, {0x3cbfb88?, 0xc03efb9d10?})
    /builds/bdm/bds/quote/.mod_cache/github.com/cloudwego/netpoll@v0.3.2/nocopy_linkbuffer.go:393 +0x32
github.com/cloudwego/netpoll.(*connection).Append(0x16c0175?, {0x3cbfb88?, 0xc03efb9d10?})
    /builds/bdm/bds/quote/.mod_cache/github.com/cloudwego/netpoll@v0.3.2/connection_impl.go:236 +0x2a
github.com/cloudwego/netpoll/mux.(*ShardQueue).deal(0xc03ee932c0, {0xc040e57600, 0x1, 0xc00019c000?})
    /builds/bdm/bds/quote/.mod_cache/github.com/cloudwego/netpoll@v0.3.2/mux/shard_queue.go:180 +0x92
github.com/cloudwego/netpoll/mux.(*ShardQueue).foreach.func1()
    /builds/bdm/bds/quote/.mod_cache/github.com/cloudwego/netpoll@v0.3.2/mux/shard_queue.go:154 +0x152
github.com/bytedance/gopkg/util/gopool.(*worker).run.func1.1(0x16c0112?, 0x334fa60?)
    /builds/bdm/bds/quote/.mod_cache/github.com/bytedance/gopkg@v0.0.0-20220817015305-b879a72dc90f/util/gopool/worker.go:69 +0x66
github.com/bytedance/gopkg/util/gopool.(*worker).run.func1()
    /builds/bdm/bds/quote/.mod_cache/github.com/bytedance/gopkg@v0.0.0-20220817015305-b879a72dc90f/util/gopool/worker.go:70 +0xe5
created by github.com/bytedance/gopkg/util/gopool.(*worker).run
    /builds/bdm/bds/quote/.mod_cache/github.com/bytedance/gopkg@v0.0.0-20220817015305-b879a72dc90f/util/gopool/worker.go:41 +0x56

YangruiEmma commented 1 year ago

Do you have any more information about the scenarios that trigger this?

succulentxb commented 1 year ago

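> Do you have any more information about the scenarios that trigger this?

@YangruiEmma We don't have many leads at the moment, mainly because there is no reliable way to reproduce it. In our development environment this panic occurs a few times a day. The service handles on the order of a hundred QPS, and it has ample resources available.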

jayantxie commented 1 year ago

This bug is most likely a problem in the shardQueue implementation. I am working on a fix.

YangruiEmma commented 1 year ago

@succulentxb If you are an enterprise user, you can fill out this form: https://wenjuan.feishu.cn/m?t=sH0dv11ZVGDi-5kda . Enterprise users can get dedicated follow-up.

hzxgo commented 1 month ago

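I've run into the same problem. For me it reproduces mainly right after restarting the service: the panic occurs, but afterwards the service appears to run normally (I don't know whether there are latent problems, but at least the business logic is working fine).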
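
The `GOPOOL: panic in pool` prefix in the log above suggests that the goroutine pool recovered the panic inside its worker instead of letting it crash the process, which would be consistent with the service continuing to run afterwards. Below is a minimal, stdlib-only sketch of that recover-in-worker pattern. It is not gopool's or netpoll's actual code, and `buffer`, `runTask`, and `runPool` are hypothetical names made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// buffer is a hypothetical stand-in for a per-connection write buffer.
type buffer struct{ data []byte }

// write dereferences b, so calling it on a nil *buffer panics with a
// nil pointer dereference.
func (b *buffer) write(p []byte) { b.data = append(b.data, p...) }

// runTask executes task and converts a panic into an error, so the
// calling worker goroutine survives instead of crashing the process.
func runTask(task func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic in pool: %v", r)
		}
	}()
	task()
	return nil
}

// runPool runs each task on its own goroutine and returns one error
// slot per task (nil when the task finished without panicking).
func runPool(tasks []func()) []error {
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func()) {
			defer wg.Done()
			errs[i] = runTask(task)
		}(i, task)
	}
	wg.Wait()
	return errs
}

func main() {
	var gone *buffer  // nil, as if the buffer had already been released
	live := &buffer{} // a healthy buffer used by another task

	errs := runPool([]func(){
		func() { gone.write([]byte("a")) }, // panics, but is recovered
		func() { live.write([]byte("b")) }, // unaffected by the other task
	})
	for i, err := range errs {
		fmt.Printf("task %d: %v\n", i, err)
	}
}
```

Under this pattern the process keeps serving, but the task that panicked never completes its work. If the same holds here, writes queued in that batch could have been dropped, so "runs normally" does not by itself rule out lost responses.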


hzxgo commented 1 month ago

image