panjf2000 / gnet

🚀 gnet is a high-performance, lightweight, non-blocking, event-driven networking framework written in pure Go.
https://gnet.host
Apache License 2.0
9.7k stars 1.04k forks source link

[Bug]: conn的系列方法是否应该支持线程安全? #470

Closed NickPak closed 1 year ago

NickPak commented 1 year ago

Actions I've taken before I'm here

What happened?

在Client向Server建立连接的OnOpen回调中,我创建了一个ContextSlice对象,并用它调用了conn.SetContext(cs),为这个conn设置了上下文,然而在用户Goroutine中读取conn.Context()时会发生DATA RACE:

WARNING: DATA RACE
Read at 0x00c0001b4000 by main goroutine:
  github.com/panjf2000/gnet/v2.(*conn).Context()
      go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/connection.go:406 +0x3e

Previous write at 0x00c0001b4000 by goroutine 29:
  github.com/panjf2000/gnet/v2.(*conn).releaseTCP()
      /go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/connection.go:70 +0xde
  github.com/panjf2000/gnet/v2.(*eventloop).closeConn()
      /go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/eventloop.go:229 +0x13bb
  github.com/panjf2000/gnet/v2.(*eventloop).read()
      /go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/eventloop.go:126 +0x2e9

同样的问题也出现在了RemoteAddr()上

Major version of gnet

v2

Specific version of gnet

v2.2.7

Operating system

Linux

Relevant log output

WARNING: DATA RACE
Read at 0x00c0001b4000 by main goroutine:
  github.com/panjf2000/gnet/v2.(*conn).Context()
      go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/connection.go:406 +0x3e

Previous write at 0x00c0001b4000 by goroutine 29:
  github.com/panjf2000/gnet/v2.(*conn).releaseTCP()
      /go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/connection.go:70 +0xde
  github.com/panjf2000/gnet/v2.(*eventloop).closeConn()
      /go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/eventloop.go:229 +0x13bb
  github.com/panjf2000/gnet/v2.(*eventloop).read()
      /go/pkg/mod/github.com/panjf2000/gnet/v2@v2.2.7/eventloop.go:126 +0x2e9

Code snippets (optional)

No response

How to Reproduce

在不同的Goroutine中访问conn.Context()和conn.SetContext(),如果gnet内部调用了releaseTCP(),外部再次访问conn.Context()时会出现DATA RACE

Does this issue reproduce with the latest release?

It can reproduce with the latest release

panjf2000 commented 1 year ago

这些方法因为 gnet 的内部设计的原因,只能是非并发安全的,现在我能做的就是给每一个方法加注释说明其是否是并发安全的。

NickPak commented 1 year ago

是否能考虑将 gnet.Conn 的

type conn struct {
    ctx            interface{}             // user-defined context
       // ...
}
func (c *conn) Context() interface{}       { return c.ctx }
func (c *conn) SetContext(ctx interface{}) { c.ctx = ctx }

更改为

type conn struct {
    ctx            atomic.Value             // user-defined context
       // ...
}
func (c *conn) Context() interface{}       { return c.ctx.Load() }
func (c *conn) SetContext(ctx interface{}) { c.ctx.Store(ctx) }

这样就能解决用户层的问题了,否则好像没有更好的办法来避免用户层使用gnet.Conn

NickPak commented 1 year ago

我现在的做法是,保证conn.ctx成员里面存储的对象本身是线程安全的,但是业务线程在发送消息时会调用conn.Context()方法来获取存储的上下文对象,从而取出Encoder对消息做编码,这个conn.Context()方法就成了唯一不安全的调用了

panjf2000 commented 1 year ago

我现在的做法是,保证conn.ctx成员里面存储的对象本身是线程安全的,但是业务线程在发送消息时会调用conn.Context()方法来获取存储的上下文对象,从而取出Encoder对消息做编码,这个conn.Context()方法就成了唯一不安全的调用了

这样的话其他的如 LocalAddrRemoteAddr 这些都需要处理,而这些方法大多数情况都是在 event-loop 中调用,所以不需要使用任何竞争保护机制,如果都改成 atomic.Value,就是为了少数牺牲多数,得不偿失。

而且,为什么你要在你自己的 goroutine 里调用 c.Context()?你提前取出来然后传递给你的 goroutine 不就行了?

NickPak commented 1 year ago

我目前就是提前取出来然后到处传递它,谢谢

chen2ding commented 10 months ago

多核运行还是有线程安全问题?