openzipkin / zipkin-go

Zipkin distributed tracing library for Go.
Apache License 2.0

fatal error: concurrent map iteration and map write #217

Closed: shanliu closed this issue 6 months ago

shanliu commented 11 months ago

fatal error: concurrent map iteration and map write

goroutine 55604 [running]:
reflect.mapiternext(0x49fa8f?)
    /usr/lib/golang/src/runtime/map.go:1380 +0x19
reflect.(*MapIter).Next(0xc000066c20?)
    /usr/lib/golang/src/reflect/value.go:1924 +0x7e
encoding/json.mapEncoder.encode({0xc000066ce0?}, 0xc0004cee80, {0xbb4900?, 0xc000266530?, 0x6?}, {0x7?, 0x0?})
    /usr/lib/golang/src/encoding/json/encode.go:797 +0x33e
encoding/json.structEncoder.encode({{{0xc000286900?, 0xc000066e68?, 0x4160d0?}, 0xc00029cd20?}}, 0xc0004cee80, {0xbf7ce0?, 0xc000266460?, 0x4383d6?}, {0x0, 0x1})
    /usr/lib/golang/src/encoding/json/encode.go:759 +0x1f4
encoding/json.ptrEncoder.encode({0x414ba5?}, 0xc0004cee80, {0xb6e780?, 0xc000266460?, 0xb6e780?}, {0xbf?, 0x1?})
    /usr/lib/golang/src/encoding/json/encode.go:943 +0x21c
encoding/json.(*encodeState).reflectValue(0xc000066f98?, {0xb6e780?, 0xc000266460?, 0x7fb0a34a0108?}, {0x20?, 0x57?})
    /usr/lib/golang/src/encoding/json/encode.go:358 +0x78
encoding/json.(*encodeState).marshal(0x2?, {0xb6e780?, 0xc000266460?}, {0x99?, 0xc4?})
    /usr/lib/golang/src/encoding/json/encode.go:330 +0xfa
encoding/json.Marshal({0xb6e780, 0xc000266460})
    /usr/lib/golang/src/encoding/json/encode.go:161 +0xe5
github.com/openzipkin/zipkin-go/model.SpanModel.MarshalJSON({{{0x0, 0xec061f7ddeaaf40}, 0x73ff63cecd0bba2a, 0xc00003f118, 0x0, 0xc000612b5c, {0x0, 0x0}, {0x0, 0x0}}, ...})
    /root/go/pkg/mod/github.com/openzipkin/zipkin-go@v0.4.0/model/span.go:121 +0x2c9
gitlab.xx.com/hsbproject/hsbcommon/hsbservice.(*HsbListFileZipkinReporter).Send(0xc000012100, {{{0x0, 0xec061f7ddeaaf40}, 0x73ff63cecd0bba2a, 0xc00003f118, 0x0, 0xc000612b5c, {0x0, 0x0}, {0x0, ...}}, ...})
    /root/go/pkg/mod/gitlab.xx.com/hsbproject/hsbcommon@v0.1.137/hsbservice/zipkin.go:19 +0x5f
github.com/openzipkin/zipkin-go.(*spanImpl).Finish(0xd95f60?)
    /root/go/pkg/mod/github.com/openzipkin/zipkin-go@v0.4.0/span_implementation.go:83 +0xf7
github.com/openzipkin/zipkin-go/middleware/http.(*transport).RoundTrip(0xc00028ad20, 0xc0005ddc00)
    /root/go/pkg/mod/github.com/openzipkin/zipkin-go@v0.4.0/middleware/http/transport.go:217 +0x10d3
net/http.send(0xc0005ddc00, {0xd8d6a0, 0xc00028ad20}, {0x8?, 0xc54c80?, 0x0?})
    /usr/lib/golang/src/net/http/client.go:252 +0x5f7
net/http.(*Client).send(0xc000618fc0, 0xc0005ddc00, {0x4160d0?, 0xc000067908?, 0x0?})
    /usr/lib/golang/src/net/http/client.go:176 +0x9b
net/http.(*Client).do(0xc000618fc0, 0xc0005ddc00)
    /usr/lib/golang/src/net/http/client.go:716 +0x8fb
net/http.(*Client).Do(...)
    /usr/lib/golang/src/net/http/client.go:582
github.com/openzipkin/zipkin-go/middleware/http.(*Client).DoWithAppSpan(0xc0002b8f00, 0xc0005ddb00, {0xc941a9, 0x15})
    /root/go/pkg/mod/github.com/openzipkin/zipkin-go@v0.4.0/middleware/http/client.go:123 +0x32c
gitlab.xx.com/hsbproject/hsbcommon/hsbmodule.(*HsbRestBuild).executeRequest(0xc000533b80, {0xd925e0, 0xc0004db300?}, 0xc0003146e0, {0xc000360780, 0x39}, 0xc00023e6f0?, {0xd94398, 0x12794b8}, 0xc000620e10, ...)
    /root/go/pkg/mod/gitlab.xx.com/hsbproject/hsbcommon@v0.1.137/hsbmodule/hsb_client.go:187 +0x899
gitlab.xx.com/hsbproject/hsbcommon/hsbmodule.(*HsbRestBuild).BuildRequest(0xc000533b80, {0xd925e0, 0xc0004db300}, 0x9f2c20?, 0xd96f90?, {0xbb4540, 0xc000620db0}, 0x9f26f6?)
    /root/go/pkg/mod/gitlab.xx.com/hsbproject/hsbcommon@v0.1.137/hsbmodule/hsb_client.go:131 +0x527
github.com/hsbteam/rest_client.(*RestClient).Do.func1()
    /root/go/pkg/mod/gitlab.xx.com/hsbproject/rest_client@v0.0.53/rest_client.go:148 +0x83
created by github.com/hsbteam/rest_client.(*RestClient).Do
    /root/go/pkg/mod/gitlab.xx.com/hsbproject/rest_client@v0.0.53/rest_client.go:141 +0x1e5
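For context, a minimal standalone sketch (not the service code above) of the same failure class: one goroutine writes to a map while json.Marshal iterates it on another, just as a concurrent Tag() call can race with the reporter's JSON encoding:

package main

import (
    "encoding/json"
    "strconv"
)

func main() {
    tags := map[string]string{} // stands in for the span's Tags map

    // Writer goroutine: keeps mutating the map, like a late Tag() call.
    go func() {
        for i := 0; ; i++ {
            tags["key"] = strconv.Itoa(i)
        }
    }()

    // Reader: json.Marshal iterates the map via reflection, like the
    // reporter serializing the span. The runtime aborts with
    // "fatal error: concurrent map iteration and map write".
    for {
        _, _ = json.Marshal(tags)
    }
}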

basvanbeek commented 11 months ago

I'd like to understand why tags are still being modified after the span has already been finished. Can this race condition, which originates outside of the library, be fixed at its source? We're being asked here (#218) to guard against external race conditions at the expense of the library's performance.

php-lsys commented 11 months ago

When a timeout and a cancel occur on the context at the same moment, it is hard to find the place where the tag write is triggered. @basvanbeek
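A small standalone illustration (assumed shape, not the service code) of why the trigger is hard to pin down: when a deadline and an explicit cancel are armed for the same instant, either event may be the one that aborts the in-flight request, and each fires on a different goroutine:

package main

import (
    "context"
    "fmt"
    "time"
)

func main() {
    // Deadline and explicit cancel armed for the same instant.
    ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
    go func() {
        time.Sleep(100 * time.Millisecond)
        cancel()
    }()
    <-ctx.Done()
    // Prints context.Canceled on some runs and
    // context.DeadlineExceeded on others.
    fmt.Println(ctx.Err())
}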

php-lsys commented 11 months ago

There is a small probability that this panic is triggered when using the following code:

// Assumed in scope: tracer *zipkin.Tracer, gCtx *gin.Context, req *http.Request
// import "net/http"
// import zipkinhttp "github.com/openzipkin/zipkin-go/middleware/http"
httpClient, _ := zipkinhttp.NewClient(tracer,
    zipkinhttp.WithClient(&http.Client{}),
    zipkinhttp.ClientTrace(true))
// Propagate gin's cancellable request context to the outgoing request.
req = req.WithContext(gCtx.Request.Context())
resp, err := httpClient.DoWithAppSpan(req, "***")

@basvanbeek But I'm not sure where the tags change is being made.
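One general way to locate the writer (standard Go tooling, not specific to this library) is to run the failing path under the race detector; its DATA RACE report prints the stacks of both conflicting goroutines, including the one that writes the tag:

go build -race ./...   # then exercise the timeout/cancel path and watch stderr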

php-lsys commented 11 months ago

> I'd like to understand why tags are still being modified after the span has already been finished. Can this race condition, which originates outside of the library, be fixed at its source? We're being asked here (#218) to guard against external race conditions at the expense of the library's performance.

Alternatively, would it be possible to take the current interface:

type Reporter interface {
    Send(model.SpanModel) // Send Span data to the reporter
    Close() error // Close the reporter
}

and change it to:

type Reporter interface {
    Send(span model.SpanModel, mtx *sync.RWMutex) // Send Span data to the reporter, guarded by the span's lock
    Close() error // Close the reporter
}

My Reporter implementation would then take mtx.RLock() while serializing the span.
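For example, a sketch of what such an implementation might look like (against the hypothetical interface above, not the actual zipkin-go API):

package myreporter

import (
    "sync"

    "github.com/openzipkin/zipkin-go/model"
    "github.com/openzipkin/zipkin-go/reporter"
)

// lockingReporter wraps a real reporter and holds the span's read lock
// for the duration of Send, so concurrent Tag writes cannot interleave
// with serialization of the tags map.
type lockingReporter struct {
    next reporter.Reporter // the underlying real reporter
}

func (r *lockingReporter) Send(span model.SpanModel, mtx *sync.RWMutex) {
    mtx.RLock()
    defer mtx.RUnlock()
    // Only effective if the wrapped reporter serializes synchronously
    // inside Send rather than on a background goroutine.
    r.next.Send(span)
}

func (r *lockingReporter) Close() error { return r.next.Close() }

Note that with the async HTTP reporter the JSON encoding happens on a background goroutine, so the lock would have to be held there as well; that is part of why guarding inside the library was proposed in #218 instead.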

shanliu commented 11 months ago

https://github.com/openzipkin/zipkin-go/blob/e84b2cf6d2d915fe0ee57c2dc4d736ec13a2ef6a/middleware/http/transport.go#L180-L195

The callback functions above may still be triggered even after the following error path has already run: https://github.com/openzipkin/zipkin-go/blob/e84b2cf6d2d915fe0ee57c2dc4d736ec13a2ef6a/middleware/http/transport.go#L217-L218

This eventually leads to: fatal error: concurrent map iteration and map write. @basvanbeek
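Schematically, the race has the shape of this timing-dependent standalone sketch (assumed against the v0.4.0 API; the tagging goroutine stands in for a late httptrace callback, and it may take several runs to trip the runtime check):

package main

import (
    "strconv"

    "github.com/openzipkin/zipkin-go"
    httpreporter "github.com/openzipkin/zipkin-go/reporter/http"
)

func main() {
    rep := httpreporter.NewReporter("http://localhost:9411/api/v2/spans")
    defer rep.Close()
    tracer, _ := zipkin.NewTracer(rep)

    span := tracer.StartSpan("repro")
    done := make(chan struct{})
    go func() {
        // Stands in for a late httptrace callback still writing tags.
        for i := 0; i < 1_000_000; i++ {
            span.Tag("late.callback", strconv.Itoa(i))
        }
        close(done)
    }()
    // Finish hands the span, including its live tags map, to the
    // reporter, which JSON-encodes it on a background goroutine.
    span.Finish()
    <-done
}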

codefromthecrypt commented 6 months ago

Should be fixed in v0.4.3. Please comment if this didn't solve it.