no luck yet. in raintank-docker i'm trying to fill up a bunch of data (up to 3 days back) and then suddenly launch a request workload similar to what we see in prod:
+./fake_metrics_to_nsq -keys-per-org 100 -orgs 10 -statsd-addr statsdaemon:8125 -nsqd-tcp-address nsqd:4150 -offset 3d -speedup 400 -stop-at-now &> /var/log/raintank/fake.log
+./fake_metrics_to_nsq -keys-per-org 100 -orgs 10 -statsd-addr statsdaemon:8125 -nsqd-tcp-address nsqd:4150 &> /var/log/raintank/fake2.log &
followed by:
$ cat repro-prod-issue.sh
#!/bin/sh
./inspect-es -es-index metric -format vegeta-mt -from 3d -es-addr elasticsearch:9200 > mc.txt
./inspect-es -es-index metric -format vegeta-mt -from 9min -es-addr elasticsearch:9200 > m.txt
ls -alh mc.txt m.txt
cat mc.txt | sed 's#18763#6063#' | ./vegeta attack -duration 100s -rate 500 > mc.out &
cat m.txt | sed 's#18763#6063#' | ./vegeta attack -duration 100s -rate 5000 > m.out &
echo "launched!"
sleep 110
but MT handles this just fine and memory only goes up by a gig or so. maybe because i'm working with a smaller amount of metrics here (1k vs 500k in prod). it would take too long to backload a similar volume on demand on my laptop, and since there's no caching going on (at least not in MT), requesting the same data multiple times should work just as well. notably in prod:
maybe i can do a qa run with similar-to-prod metric volume.
now i'm trying something else: i suspect that maybe the index for the new graphite endpoint introduces too many new pointers and keeps the GC busier, so i'm using a basically steady instream like so:
+./fake_metrics_to_nsq -keys-per-org 100 -orgs 100 -statsd-addr statsdaemon:8125 -nsqd-tcp-address nsqd:4150 -offset 3h -speedup 10 -stop-at-now &> /var/log/raintank/fake.log
+./fake_metrics_to_nsq -keys-per-org 100 -orgs 100 -statsd-addr statsdaemon:8125 -nsqd-tcp-address nsqd:4150 &> /var/log/raintank/fake2.log &
and just running metric-tank with +GODEBUG=gctrace=1 ./metric_tank
with a patched GC like so:
diff --git a/src/runtime/mgc.go b/src/runtime/mgc.go
index 94301c6..b0d3528 100644
--- a/src/runtime/mgc.go
+++ b/src/runtime/mgc.go
@@ -1297,6 +1297,7 @@ func gcMarkTermination() {
 		if work.mode != gcBackgroundMode {
 			print(" (forced)")
 		}
+		print(" sw ", gcController.scanWork, " bgscancred ", gcController.bgScanCredit)
 		print("\n")
 		printunlock()
 	}
latest master: sys dash: https://snapshot.raintank.io/dashboard/snapshot/D1A8qzn673G2v01syp32bxlZ516cH8qb MT dash: https://snapshot.raintank.io/dashboard/snapshot/j5hPeum00o8Itpa8234TYC8cV2bfKgkz
dd69883 (pre graphite endpoint): MT dash: https://snapshot.raintank.io/dashboard/snapshot/o8J1vnIUnjSNV11wmo2hw2uc10XHh7zA sys dash: https://snapshot.raintank.io/dashboard/snapshot/ta3trxmlqTPL1I0l1vaY4Z7LE82dOQP3
unfortunately all cpu/mem looks identical. i also compared the GC data using gcvis (patched for the 2 new params) and everything looks identical, also the timings :(
a85f3af (previous stable in prod, aka pkg-0.1.0-1457484036ubuntu1) MT: https://snapshot.raintank.io/dashboard/snapshot/KU1EmofIUNT6EusRvPrbNkvwvIIcwF5v sys: https://snapshot.raintank.io/dashboard/snapshot/1jDols6042Vx2CLPotc5QQqt2NRvpa1T looks the same again. for the record, attached the gc view, which also looks the same
been doing more testing using a new mt4 instance that's not in the pool and that has the "broken" MT installed. then used gor to replicate prod traffic to mt4:
dieter@metric-tank-2-prod:~$ sudo ./gor --input-raw :18763 --output-http metric-tank-4-prod:18763
also ran loops like
while true ;do let i=i+1; echo $i; curl 'http://localhost:18763/debug/pprof/profile?seconds=1' > mt4-cpu-$i; done
while true ;do let i=i+1; echo $i; curl http://localhost:18763/debug/pprof/heap > mt4-heap-$i; done
to get profiles right before crash. unfortunately those don't seem to show anything interesting.
~ ❯❯❯ go tool pprof --inuse_space mt4-bin mt4-heap-2119
Entering interactive mode (type "help" for commands)
(pprof) top50 -cum
2800.22MB of 2823.27MB total (99.18%)
Dropped 195 nodes (cum <= 14.12MB)
flat flat% sum% cum cum%
0 0% 0% 2823.27MB 100% runtime.goexit
0 0% 0% 1844.66MB 65.34% github.com/nsqio/go-nsq.(*Consumer).handlerLoop
0 0% 0% 1844.66MB 65.34% main.(*Handler).HandleMessage
27MB 0.96% 0.96% 1093.60MB 38.74% main.(*AggMetrics).GetOrCreate
760.09MB 26.92% 27.88% 1066.60MB 37.78% main.NewAggMetric
0 0% 27.88% 968.06MB 34.29% main.main
0 0% 27.88% 968.06MB 34.29% runtime.main
104.01MB 3.68% 31.56% 900.58MB 31.90% main.NewAggregator
0 0% 31.56% 709.56MB 25.13% main.(*AggMetric).Add
0 0% 31.56% 586.57MB 20.78% main.(*DefCache).Backfill
0 0% 31.56% 586.57MB 20.78% main.NewDefCache
0 0% 31.56% 530.07MB 18.78% github.com/raintank/raintank-metric/metricdef.(*DefsEs).GetMetrics
85.51MB 3.03% 34.59% 530.07MB 18.78% github.com/raintank/raintank-metric/schema.MetricDefinitionFromJSON
177.51MB 6.29% 40.88% 516.03MB 18.28% main.NewChunk
0 0% 40.88% 476.52MB 16.88% main.(*AggMetric).addAggregators
0 0% 40.88% 476.52MB 16.88% main.(*Aggregator).Add
0 0% 40.88% 476.52MB 16.88% main.(*Aggregator).flush
0 0% 40.88% 444.56MB 15.75% encoding/json.Unmarshal
0 0% 40.88% 439.56MB 15.57% encoding/json.(*decodeState).object
0 0% 40.88% 439.56MB 15.57% encoding/json.(*decodeState).unmarshal
0 0% 40.88% 439.56MB 15.57% encoding/json.(*decodeState).value
381.48MB 13.51% 54.39% 381.48MB 13.51% main.NewCassandraStore
338.52MB 11.99% 66.38% 338.52MB 11.99% github.com/dgryski/go-tsz.New
203.51MB 7.21% 73.59% 203.51MB 7.21% fmt.Sprintf
26MB 0.92% 74.51% 191.53MB 6.78% github.com/dgryski/go-tsz.(*Series).Push
0 0% 74.51% 191.53MB 6.78% main.(*Chunk).Push
0 0% 74.51% 181.55MB 6.43% reflect.Value.SetMapIndex
181.55MB 6.43% 80.94% 181.55MB 6.43% reflect.mapassign
167.53MB 5.93% 86.87% 167.53MB 5.93% github.com/dgryski/go-tsz.(*bstream).writeBits
0 0% 86.87% 158.51MB 5.61% encoding/json.(*decodeState).literal
158.51MB 5.61% 92.49% 158.51MB 5.61% encoding/json.(*decodeState).literalStore
0 0% 92.49% 86MB 3.05% encoding/json.(*decodeState).array
56.50MB 2.00% 94.49% 56.50MB 2.00% main.(*DefCache).Backfill.func1
0 0% 94.49% 48.50MB 1.72% reflect.MakeSlice
48.50MB 1.72% 96.21% 48.50MB 1.72% reflect.unsafe_NewArray
0 0% 96.21% 35.50MB 1.26% reflect.MakeMap
35.50MB 1.26% 97.47% 35.50MB 1.26% reflect.makemap
0 0% 97.47% 33MB 1.17% github.com/raintank/raintank-metric/msg.(*MetricData).DecodeMetricData
1.50MB 0.053% 97.52% 33MB 1.17% github.com/raintank/raintank-metric/schema.(*MetricDataArray).UnmarshalMsg
1.50MB 0.053% 97.57% 31.50MB 1.12% github.com/raintank/raintank-metric/schema.(*MetricData).UnmarshalMsg
30MB 1.06% 98.63% 30MB 1.06% github.com/tinylib/msgp/msgp.ReadStringBytes
0 0% 98.63% 15.50MB 0.55% reflect.Value.Convert
15.50MB 0.55% 99.18% 15.50MB 0.55% reflect.cvtBytesString
(pprof)
~ ❯❯❯ go tool pprof mt4-bin mt4-cpu-60
Entering interactive mode (type "help" for commands)
(pprof) top50 -cum
3.52s of 3.54s total (99.44%)
Dropped 6 nodes (cum <= 0.02s)
flat flat% sum% cum cum%
0 0% 0% 3.54s 100% runtime.goexit
0 0% 0% 2.85s 80.51% main.Get
0 0% 0% 2.85s 80.51% main.get.func1
0.26s 7.34% 7.34% 2.85s 80.51% main.graphiteRaintankJSON
0 0% 7.34% 2.85s 80.51% net/http.(*ServeMux).ServeHTTP
0 0% 7.34% 2.85s 80.51% net/http.(*conn).serve
0 0% 7.34% 2.85s 80.51% net/http.HandlerFunc.ServeHTTP
0 0% 7.34% 2.85s 80.51% net/http.serverHandler.ServeHTTP
0.06s 1.69% 9.04% 1.47s 41.53% strconv.AppendFloat
0.11s 3.11% 12.15% 1.41s 39.83% strconv.genericFtoa
0.19s 5.37% 17.51% 1.30s 36.72% strconv.bigFtoa
0 0% 17.51% 1.29s 36.44% runtime.growslice
1.06s 29.94% 47.46% 1.06s 29.94% runtime.memmove
0 0% 47.46% 0.76s 21.47% runtime.growslice_n
0 0% 47.46% 0.69s 19.49% runtime.gcBgMarkWorker
0 0% 47.46% 0.69s 19.49% runtime.gcDrain
0.41s 11.58% 59.04% 0.68s 19.21% runtime.scanobject
0.08s 2.26% 61.30% 0.68s 19.21% strconv.formatDigits
0.19s 5.37% 66.67% 0.60s 16.95% strconv.fmtF
0.32s 9.04% 75.71% 0.32s 9.04% runtime.duffzero
0.29s 8.19% 83.90% 0.29s 8.19% runtime.memclr
0.07s 1.98% 85.88% 0.21s 5.93% strconv.AppendUint
0.13s 3.67% 89.55% 0.18s 5.08% runtime.greyobject
0.09s 2.54% 92.09% 0.14s 3.95% strconv.formatBits
0.09s 2.54% 94.63% 0.09s 2.54% runtime.heapBitsForObject
0.05s 1.41% 96.05% 0.07s 1.98% strconv.(*decimal).Assign
0 0% 96.05% 0.05s 1.41% runtime.heapBits.setMarked
0.05s 1.41% 97.46% 0.05s 1.41% runtime/internal/atomic.Or8
0.05s 1.41% 98.87% 0.05s 1.41% strconv.(*decimal).Round
0 0% 98.87% 0.03s 0.85% runtime.mallocgc
0 0% 98.87% 0.03s 0.85% runtime.rawmem
0 0% 98.87% 0.03s 0.85% runtime.systemstack
0 0% 98.87% 0.02s 0.56% runtime.heapBits.initSpan
0 0% 98.87% 0.02s 0.56% runtime.largeAlloc
0 0% 98.87% 0.02s 0.56% runtime.mallocgc.func3
0.02s 0.56% 99.44% 0.02s 0.56% strconv.trim
another experiment, bisecting locally compiled MT instances:
sudo service metric_tank stop; sudo cp metric_tank /usr/sbin/metric_tank ; sudo service metric_tank start; ps aux | grep metric_tank
allocate less + execute faster by doing "usable" logic inline -> looking good, not much slowdown https://snapshot.raintank.io/dashboard/snapshot/NXWILw7u6i2ClrjNCfNG5kWbJB31yDE1
fix(): be more clever about how to fill the output slice -> instant crash https://snapshot.raintank.io/dashboard/snapshot/5dOvFzCUd97Kz1Un7r1MO4DQENJte6vD
compute avg inline instead of iterating twice -> solid https://snapshot.raintank.io/dashboard/snapshot/kun7X4w1c36Ath4TzZ3RSWK56jXJdFYg
add benchmark for fix() -> OK https://snapshot.raintank.io/dashboard/snapshot/w2P1v3B8zaNMEc6iAQcr4C2VyCcOiyT2
fix(): be more clever about how to fill the output slice, again -> another confirmed instant crash! https://snapshot.raintank.io/dashboard/snapshot/7OqCkXG3Myc508MaDBelgle29Y8Zas1W
ok so we now know 6e6cb96 is the culprit, or somehow triggers a bug elsewhere.
8d18f34313097fbda66bfd99a321267790a34df5 disable colors in metric_tank's logs
-> instant crash https://snapshot.raintank.io/dashboard/snapshot/i8CzC9yqctFBkHHWpkLN7NvnsbTTeCGf
same, but with 6e6cb96 reverted: works fine again: https://snapshot.raintank.io/dashboard/snapshot/qeM4pfKB8ret4vwVuA7F7b032u2vve4u
I also tried some patches, all visible here https://gist.github.com/Dieterbe/e9414e91564f185ea7c7 specifically:
logging how o changes over time -> this showed that for each request o just went up and then fix() returned. it never stayed the same value (confirming again it's not stuck in a busy loop). the focus on fix() was due to #171 and also because that's what got changed in 6e6cb96
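the actual patches are in the gist above; the o logging was roughly along these lines (a hypothetical, simplified stand-in for fix()'s fill loop, only meant to show the kind of instrumentation; the real loop also copies points from in to out):

package main

import "fmt"

// simplified stand-in for fix()'s fill loop: print o on every iteration,
// so a value that stops changing would reveal a busy loop (it never did).
func main() {
	var start, lastPoint, interval uint32 = 1458967080, 1458967200, 60
	for t, i, o := start, 0, -1; t <= lastPoint; t += interval {
		o++
		fmt.Println("fix: t", t, "i", i, "o", o)
	}
}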
got another mem profile, this time with
+ runtime.MemProfileRate = 1
this shows:
(pprof) top50 -cum
25753346.64kB of 26036249.83kB total (98.91%)
Dropped 633 nodes (cum <= 130181.25kB)
flat flat% sum% cum cum%
0 0% 0% 26035269.52kB 100% runtime.goexit
0 0% 0% 20136622.62kB 77.34% main.getTargets.func1
0 0% 0% 20136621.12kB 77.34% main.getTarget
20134415.44kB 77.33% 77.33% 20134415.44kB 77.33% main.fix
3.72kB 1.4e-05% 77.33% 3419776.31kB 13.13% net/http.(*conn).serve
0 0% 77.33% 3416172.34kB 13.12% net/http.(*ServeMux).ServeHTTP
0 0% 77.33% 3416172.34kB 13.12% net/http.HandlerFunc.ServeHTTP
0 0% 77.33% 3416172.34kB 13.12% net/http.serverHandler.ServeHTTP
48.81kB 0.00019% 77.33% 3416147.31kB 13.12% main.Get
0 0% 77.33% 3416147.31kB 13.12% main.get.func1
88.34kB 0.00034% 77.33% 3411948.09kB 13.10% main.graphiteRaintankJSON
0 0% 77.33% 3411709.25kB 13.10% strconv.AppendFloat
0 0% 77.33% 3411709.25kB 13.10% strconv.bigFtoa
3411709.25kB 13.10% 90.44% 3411709.25kB 13.10% strconv.fmtF
0 0% 90.44% 3411709.25kB 13.10% strconv.formatDigits
0 0% 90.44% 3411709.25kB 13.10% strconv.genericFtoa
0 0% 90.44% 1467987.61kB 5.64% github.com/nsqio/go-nsq.(*Consumer).handlerLoop
0 0% 90.44% 1467982.89kB 5.64% main.(*Handler).HandleMessage
27464.53kB 0.11% 90.54% 1144988.02kB 4.40% main.(*AggMetrics).GetOrCreate
800744.55kB 3.08% 93.62% 1117522.61kB 4.29% main.NewAggMetric
0.33kB 1.3e-06% 93.62% 1004349.55kB 3.86% main.main
0 0% 93.62% 1004349.55kB 3.86% runtime.main
105592.69kB 0.41% 94.02% 950334.19kB 3.65% main.NewAggregator
0.16kB 6e-07% 94.02% 613466.55kB 2.36% main.NewDefCache
0 0% 94.02% 613466.39kB 2.36% main.(*DefCache).Backfill
0 0% 94.02% 556115.67kB 2.14% github.com/raintank/raintank-metric/metricdef.(*DefsEs).GetMetrics
97334.16kB 0.37% 94.40% 556110.27kB 2.14% github.com/raintank/raintank-metric/schema.MetricDefinitionFromJSON
0 0% 94.40% 458785.92kB 1.76% encoding/json.Unmarshal
0 0% 94.40% 454360.55kB 1.75% encoding/json.(*decodeState).object
0 0% 94.40% 454360.55kB 1.75% encoding/json.(*decodeState).unmarshal
0 0% 94.40% 454360.55kB 1.75% encoding/json.(*decodeState).value
390641.62kB 1.50% 95.90% 390749.15kB 1.50% main.NewCassandraStore
0.22kB 8.4e-07% 95.90% 293175.67kB 1.13% main.(*AggMetric).Add
211988.97kB 0.81% 96.71% 212170.38kB 0.81% fmt.Sprintf
70345.72kB 0.27% 96.98% 211037.25kB 0.81% main.NewChunk
0 0% 96.98% 194668.31kB 0.75% reflect.Value.SetMapIndex
194668.31kB 0.75% 97.73% 194668.31kB 0.75% reflect.mapassign
0 0% 97.73% 167608.80kB 0.64% encoding/json.(*decodeState).literal
167608.39kB 0.64% 98.37% 167608.80kB 0.64% encoding/json.(*decodeState).literalStore
0 0% 98.37% 147612.88kB 0.57% main.(*AggMetric).addAggregators
0 0% 98.37% 147612.88kB 0.57% main.(*Aggregator).Add
0 0% 98.37% 147612.88kB 0.57% main.(*Aggregator).flush
140691.44kB 0.54% 98.91% 140691.53kB 0.54% github.com/dgryski/go-tsz.New
(pprof) list main.fix
Total: 24.83GB
ROUTINE ======================== main.fix in /home/dieter/go/src/github.com/raintank/raintank-metric/metric_tank/dataprocessor.go
19.20GB 19.20GB (flat, cum) 77.33% of Total
. . 51: start = from + interval - remain
. . 52: }
. . 53:
. . 54: // last point should be the last value that divides by interval lower than to (because to is always exclusive)
. . 55: lastPoint := (to - 1) - ((to - 1) % interval)
19.20GB 19.20GB 56: out := make([]schema.Point, (lastPoint-start)/interval+1)
. . 57:
. . 58: // t is the ts we're looking to fill.
. . 59: // i iterates in
. . 60: // o iterates out
. . 61: for t, i, o := start, 0, -1; t <= lastPoint; t += interval {
(pprof)
really weird that that line is the culprit since it's very similar to the original line.
i see two possibilities: 1) stack escape analysis treats this slice allocation differently than before
2) somewhere further down the read path the slice leaks and is not reclaimed. but that leak would have to be caused somehow by the perf improvement, which seems a bit unlikely. also i don't see where it could leak: divide() drops references to it, consolidate() passes it on, and ultimately after graphiteJSON/graphiteRaintankJSON we should have no more references to this slice.
$ b -gcflags='-m' 2>&1 | grep dataprocessor.go:56
./dataprocessor.go:56: make([]schema.Point, (lastPoint - start) / interval + 1) escapes to heap
./dataprocessor.go:56: make([]schema.Point, (lastPoint - start) / interval + 1) escapes to heap
with patch reverted:
$ b -gcflags='-m' 2>&1 | grep dataprocessor.go:39
./dataprocessor.go:39: make([]schema.Point, 0, len(in)) escapes to heap
./dataprocessor.go:39: make([]schema.Point, 0, len(in)) escapes to heap
aha! I think i found the bug. fix() is getting incoming requests such as from=1458966244 to=1458966274 (i.e. a 30s span) while interval is 60s, and len becomes 71582788. see http://play.golang.org/p/VSE-srJYOv
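in case that playground link goes away, here's a minimal sketch of the arithmetic (assuming uint32 timestamps, which is what the wrapped-around length suggests; the rounding of start is reconstructed from the pprof listing above): for a span shorter than the interval, start gets rounded up past lastPoint, and the unsigned subtraction wraps around:

package main

import "fmt"

func main() {
	// assumption: timestamps are uint32; the huge len only makes sense as an unsigned wrap-around
	var from, to, interval uint32 = 1458966244, 1458966274, 60

	start := from
	if remain := from % interval; remain != 0 {
		start = from + interval - remain // 1458966300: rounded up past `to`
	}
	lastPoint := (to - 1) - ((to - 1) % interval) // 1458966240: below `start`

	// lastPoint < start, so lastPoint-start wraps around to nearly 2^32
	fmt.Println((lastPoint-start)/interval + 1) // 71582788
}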
the input slices seem to be empty in these cases btw:
+fmt.Println(">fix", "f", from, "t", to, "int", interval, " len", (lastPoint-start)/interval+1)
>fix [{0 1458967095} {0 1458967105} {0 1458967115}] f 1458967095 t 1458967125 int 10 len 3
>fix [] f 1458967095 t 1458967125 int 10 len 3
>fix [] f 1458967095 t 1458967125 int 60 len 71582788
>fix [{0 1458967095} {0 1458967105} {0 1458967115}] f 1458967095 t 1458967125 int 10 len 3
similar issue was #164
it's the first time seeing this in prod, when deploying f040a9e5741ca16c84ebc763314f7b4391c98900 with
agg-settings = 10min:6h:2:38d:true,2h:6h:2:120d:true
, not sure if it's because of these changes or because the load pattern has changed (increased) due to recent extra signups. i've seen this twice:
MT dash: https://grafana-monitor.raintank.io/dashboard/snapshot/IWAPH7sho06f0n0TgBmdWsuQykwZfVmM
sys dash: https://grafana-monitor.raintank.io/dashboard/snapshot/Fv7Rb3G0yR5enyZF7oS2kF2iyMMaBPIW
will try to repro in QA. hypothesis: maybe extensive GC workload is maxing out cpu and blocking progress, but mem usage shouldn't be that high i think. gctrace will help.
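not something from this thread, just a possible complement to gctrace: a small sketch that polls runtime.MemStats.GCCPUFraction, which would show directly whether the GC is eating the cpu (it would have to be wired into MT itself, since MemStats are per-process):

package main

import (
	"fmt"
	"runtime"
	"time"
)

// poll GC stats every 10s; an unusually high GCCPUFraction would support
// the "GC maxes out cpu" hypothesis.
func main() {
	for range time.Tick(10 * time.Second) {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		fmt.Printf("heap=%dMB numgc=%d gc-cpu=%.1f%%\n",
			m.HeapAlloc/1024/1024, m.NumGC, m.GCCPUFraction*100)
	}
}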