grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

aggregator: race condition and fails to send out data in rare cases #193

Closed Civil closed 8 years ago

Civil commented 8 years ago

In some rare cases (I don't have a 100% reliable way to reproduce this), two bugs happen simultaneously:

  1. A race condition in aggregator.c causes invocation->buckets to be freed while it is still being accessed here: https://github.com/grobian/carbon-c-relay/blob/master/aggregator.c#L311, which results in a segfault.

Here's the report from ASAN:

=================================================================
==19051==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030003fc9e0 at pc 0x00000053ed91 bp 0x7f4d8c0c0f70 sp 0x7f4d8c0c0f68
READ of size 8 at 0x6030003fc9e0 thread T2
    #0 0x53ed90 in aggregator_putmetric /usr/src/carbon-c-relay/aggregator.c:311:31
    #1 0x52bb9e in router_route_intern /usr/src/carbon-c-relay/router.c:2899:7
    #2 0x529b5b in router_route /usr/src/carbon-c-relay/router.c:2980:8
    #3 0x503ed9 in dispatch_connection /usr/src/carbon-c-relay/dispatcher.c:404:25
    #4 0x502b97 in dispatch_runner /usr/src/carbon-c-relay/dispatcher.c:585:13
    #5 0x7f4d90ab3493 in start_thread (/lib64/libpthread.so.0+0x7493)
    #6 0x7f4d903d95dc in clone (/lib64/libc.so.6+0xe95dc)

0x6030003fc9e0 is located 16 bytes inside of 32-byte region [0x6030003fc9d0,0x6030003fc9f0)
freed by thread T23 here:
    #0 0x4c1210 in free (/usr/src/carbon-c-relay/relay+0x4c1210)
    #1 0x541b02 in aggregator_expire /usr/src/carbon-c-relay/aggregator.c:555:9
    #2 0x7f4d90ab3493 in start_thread (/lib64/libpthread.so.0+0x7493)

previously allocated by thread T5 here:
    #0 0x4c1518 in __interceptor_malloc (/usr/src/carbon-c-relay/relay+0x4c1518)
    #1 0x53e27f in aggregator_putmetric /usr/src/carbon-c-relay/aggregator.c:257:22
    #2 0x52bb9e in router_route_intern /usr/src/carbon-c-relay/router.c:2899:7
    #3 0x529b5b in router_route /usr/src/carbon-c-relay/router.c:2980:8
    #4 0x503ed9 in dispatch_connection /usr/src/carbon-c-relay/dispatcher.c:404:25
    #5 0x502b97 in dispatch_runner /usr/src/carbon-c-relay/dispatcher.c:585:13
    #6 0x7f4d90ab3493 in start_thread (/lib64/libpthread.so.0+0x7493)

Thread T2 created by T0 here:
    #0 0x42c129 in __interceptor_pthread_create (/usr/src/carbon-c-relay/relay+0x42c129)
    #1 0x5014b9 in dispatch_new /usr/src/carbon-c-relay/dispatcher.c:625:6
    #2 0x501547 in dispatch_new_connection /usr/src/carbon-c-relay/dispatcher.c:654:9
    #3 0x4f5032 in main /usr/src/carbon-c-relay/relay.c:722:21
    #4 0x7f4d9031061f in __libc_start_main (/lib64/libc.so.6+0x2061f)

Thread T23 created by T0 here:
    #0 0x42c129 in __interceptor_pthread_create (/usr/src/carbon-c-relay/relay+0x42c129)
    #1 0x53faff in aggregator_start /usr/src/carbon-c-relay/aggregator.c:655:6
    #2 0x4f52c5 in main /usr/src/carbon-c-relay/relay.c:751:8
    #3 0x7f4d9031061f in __libc_start_main (/lib64/libc.so.6+0x2061f)

Thread T5 created by T0 here:
    #0 0x42c129 in __interceptor_pthread_create (/usr/src/carbon-c-relay/relay+0x42c129)
    #1 0x5014b9 in dispatch_new /usr/src/carbon-c-relay/dispatcher.c:625:6
    #2 0x501547 in dispatch_new_connection /usr/src/carbon-c-relay/dispatcher.c:654:9
    #3 0x4f5032 in main /usr/src/carbon-c-relay/relay.c:722:21
    #4 0x7f4d9031061f in __libc_start_main (/lib64/libc.so.6+0x2061f)

SUMMARY: AddressSanitizer: heap-use-after-free /usr/src/carbon-c-relay/aggregator.c:311:31 in aggregator_putmetric
Shadow bytes around the buggy address:
  0x0c06800778e0: fa fa fa fa fa fa fd fd fd fd fa fa 00 00 00 00
  0x0c06800778f0: fa fa fa fa fa fa fa fa fd fd fd fd fa fa fd fd
  0x0c0680077900: fd fd fa fa 00 00 00 00 fa fa 00 00 00 00 fa fa
  0x0c0680077910: fa fa fa fa fa fa fd fd fd fd fa fa fa fa fa fa
  0x0c0680077920: fa fa fd fd fd fd fa fa fd fd fd fd fa fa 00 00
=>0x0c0680077930: 00 00 fa fa fd fd fd fd fa fa fd fd[fd]fd fa fa
  0x0c0680077940: fd fd fd fd fa fa 00 00 00 00 fa fa 00 00 00 00
  0x0c0680077950: fa fa 00 00 00 00 fa fa 00 00 00 00 fa fa fa fa
  0x0c0680077960: fa fa fa fa fa fa fa fa fa fa 00 00 00 00 fa fa
  0x0c0680077970: fd fd fd fd fa fa 00 00 00 00 fa fa fa fa fa fa
  0x0c0680077980: fa fa 00 00 00 00 fa fa fd fd fd fd fa fa 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==19051==ABORTING

That's because of a race condition that happens here: https://github.com/grobian/carbon-c-relay/blob/master/aggregator.c#L240-L245

Thread 1 unlocks the mutex on line 244; thread 2 (expire) acquires it and frees the invocation that thread 1 is still using, and only after that does thread 1 get the mutex locked again on line 245.
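
To make the window concrete, here is a small self-contained sketch of the pattern (this is not carbon-c-relay code; the struct and thread names are made up for illustration). One thread takes the invocation pointer under the lock, drops the lock and later re-acquires it before touching the pointer; the expire thread frees the object in the gap, which is exactly the heap-use-after-free that ASAN reports above.

    /* Toy reproduction of the unlock/re-lock window described above.  This is
     * a conceptual sketch, not carbon-c-relay code: "invocation" stands in for
     * the aggregator's per-metric state. */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    struct invocation { double *buckets; };

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static struct invocation *inv;         /* shared state, guarded by lock */

    static void *dispatcher(void *arg)     /* plays aggregator_putmetric() */
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        struct invocation *local = inv;    /* pointer grabbed under the lock */
        pthread_mutex_unlock(&lock);       /* lock dropped: the race window */

        usleep(1000);                      /* widen the window for the demo */

        pthread_mutex_lock(&lock);         /* re-acquired, but too late */
        local->buckets[0] += 1.0;          /* use-after-free if expire already ran */
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    static void *expire(void *arg)         /* plays aggregator_expire() */
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        free(inv->buckets);                /* frees what the dispatcher still holds */
        free(inv);
        inv = NULL;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        inv = calloc(1, sizeof(*inv));
        inv->buckets = calloc(60, sizeof(double));

        pthread_t d, e;
        pthread_create(&d, NULL, dispatcher, NULL);
        pthread_create(&e, NULL, expire, NULL);
        pthread_join(d, NULL);
        pthread_join(e, NULL);
        puts("done (build with -fsanitize=address -pthread to see the report)");
        return 0;
    }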

There is also a report from TSAN (it might be useful for additional insight into what's going on): https://gist.github.com/Civil/e4aaffb76be7e216cc4f259e162b70b9

But even if I change the lock here https://github.com/grobian/carbon-c-relay/blob/master/aggregator.c#L240 to a write lock (and remove the unlock/lock from lines 244-245), it still fails to send any data out (but at least it doesn't segfault).
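
In terms of the toy sketch above, that change amounts to not releasing the lock until the invocation pointer is no longer needed, roughly:

    /* Fixed dispatcher from the sketch above: the pointer is only dereferenced
     * while the lock is held, so the expire thread cannot free it underneath
     * us.  The trade-off is a larger critical section. */
    static void *dispatcher_fixed(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        if (inv != NULL)                   /* may already have been expired */
            inv->buckets[0] += 1.0;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

That larger critical section is also the contention trade-off discussed further down in this thread.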

Civil commented 8 years ago

Also, no new rules were added; the only change to the config was extra servers in the upstream section. SIGHUP caused carbon-c-relay (v2.1 + commits to fix SIGHUP) to exit, and after that it started to consume twice as much CPU and to crash constantly with SIGSEGV.

Civil commented 8 years ago

I'll try to fix the TSAN errors and will make an extra PR for them. The current one only fixes the crash itself.

Civil commented 8 years ago

And also a read-after-free during shutdown:

[2016-07-01 12:00:57] shutting down...
[2016-07-01 12:00:57] closed listeners for port 2003
[2016-07-01 12:00:57] stopped collector
[2016-07-01 12:00:58] stopped aggregator
=================================================================
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600000ebb8 at pc 0x00000053dba3 bp 0x7fb5dbae8f70 sp 0x7fb5dbae8f68
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
==20613==AddressSanitizer: while reporting a bug found another one. Ignoring.
READ of size 8 at 0x60600000ebb8 thread T8
    #0 0x53dba2 in aggregator_putmetric /usr/src/carbon-c-relay/aggregator.c:205:13
    #1 0x52bb9e in router_route_intern /usr/src/carbon-c-relay/router.c:2899:7
    #2 0x529b5b in router_route /usr/src/carbon-c-relay/router.c:2980:8
    #3 0x503ed9 in dispatch_connection /usr/src/carbon-c-relay/dispatcher.c:404:25
    #4 0x502b97 in dispatch_runner /usr/src/carbon-c-relay/dispatcher.c:585:13
    #5 0x7fb5e413e493 in start_thread (/lib64/libpthread.so.0+0x7493)
    #6 0x7fb5e3a645dc in clone (/lib64/libc.so.6+0xe95dc)

0x60600000ebb8 is located 24 bytes inside of 64-byte region [0x60600000eba0,0x60600000ebe0)
freed by thread T23 here:
    #0 0x4c1210 in free (/usr/src/carbon-c-relay/relay+0x4c1210)
    #1 0x5422a9 in aggregator_expire /usr/src/carbon-c-relay/aggregator.c:609:3
    #2 0x7fb5e413e493 in start_thread (/lib64/libpthread.so.0+0x7493)

previously allocated by thread T0 here:
    #0 0x4c1518 in __interceptor_malloc (/usr/src/carbon-c-relay/relay+0x4c1518)
    #1 0x53c9f3 in aggregator_new /usr/src/carbon-c-relay/aggregator.c:53:20
    #2 0x5147f8 in router_readconfig /usr/src/carbon-c-relay/router.c:1312:29
    #3 0x4f4525 in main /usr/src/carbon-c-relay/relay.c:630:13
    #4 0x7fb5e399b61f in __libc_start_main (/lib64/libc.so.6+0x2061f)

Thread T8 created by T0 here:
    #0 0x42c129 in __interceptor_pthread_create (/usr/src/carbon-c-relay/relay+0x42c129)
    #1 0x5014b9 in dispatch_new /usr/src/carbon-c-relay/dispatcher.c:625:6
    #2 0x501547 in dispatch_new_connection /usr/src/carbon-c-relay/dispatcher.c:654:9
    #3 0x4f5032 in main /usr/src/carbon-c-relay/relay.c:722:21
    #4 0x7fb5e399b61f in __libc_start_main (/lib64/libc.so.6+0x2061f)

Thread T23 created by T0 here:
    #0 0x42c129 in __interceptor_pthread_create (/usr/src/carbon-c-relay/relay+0x42c129)
    #1 0x53facf in aggregator_start /usr/src/carbon-c-relay/aggregator.c:655:6
    #2 0x4f52c5 in main /usr/src/carbon-c-relay/relay.c:751:8
    #3 0x7fb5e399b61f in __libc_start_main (/lib64/libc.so.6+0x2061f)

SUMMARY: AddressSanitizer: heap-use-after-free /usr/src/carbon-c-relay/aggregator.c:205:13 in aggregator_putmetric
Shadow bytes around the buggy address:
  0x0c0c7fff9d20: fa fa fa fa fa fa fa fa fa fa fa fa fd fd fd fd
  0x0c0c7fff9d30: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c0c7fff9d40: fa fa fa fa fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c0c7fff9d50: fd fd fd fd fd fd fd fa fa fa fa fa fd fd fd fd
  0x0c0c7fff9d60: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd
=>0x0c0c7fff9d70: fa fa fa fa fd fd fd[fd]fd fd fd fd fa fa fa fa
  0x0c0c7fff9d80: 00 00 00 00 00 00 00 00 fa fa fa fa fd fd fd fd
  0x0c0c7fff9d90: fd fd fd fd fa fa fa fa 00 00 00 00 00 00 00 fa
  0x0c0c7fff9da0: fa fa fa fa 00 00 00 00 00 00 00 00 fa fa fa fa
  0x0c0c7fff9db0: 00 00 00 00 00 00 02 fa fa fa fa fa 00 00 00 00
  0x0c0c7fff9dc0: 00 00 00 01 fa fa fa fa 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==20613==ABORTING

Civil commented 8 years ago

It seems that this is related to the number of incoming connections: the more we have, the less stable everything is.

So it might be related to some of the races in the dispatcher.

grobian commented 8 years ago

Which race condition is that? And isn't it fixed with your patch now?

Civil commented 8 years ago

The crash is fixed, but the aggregator is not sending any data out except for its own metrics.

It seems that there are two separate problems:

  1. The crash, which was fixed by my patch.
  2. After a simple config change (more upstream relays were added), the aggregator stopped sending data out and started to consume twice as much CPU as before.

The second problem is still there. On one host I've worked around it by placing yet another local carbon-c-relay in front that just forwards all the data to the aggregator (and moving the aggregator to another port).
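
For reference, the front-relay part of that workaround is roughly the following config (a sketch only; the cluster name and the aggregator's new port 2004 are made up for illustration):

    # plain front relay on the original listener port: it does nothing but
    # forward every metric to the local aggregating relay on another port
    cluster local_aggregator
        forward
            127.0.0.1:2004
        ;

    match *
        send to local_aggregator
        stop
        ;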

grobian commented 8 years ago

Well, is the thing under load? The problem is that there is a sweet spot where the load of metrics causes all dispatchers to wait for the lock (which you just made bigger), so they basically stop doing anything because the whole thing gets horribly behind with everything it does.

Civil commented 8 years ago

That happens under load, right. Though the load varies a lot, and when I grab a backtrace I don't see anything waiting on locks. Top 10 according to pprof:

   16461  31.7%  31.7%    18869  36.3% fnmatch ??:0
    6770  13.0%  44.7%     6770  13.0% aggregator_expire /usr/src/carbon-c-relay/aggregator.c:404
    5213  10.0%  54.7%     5213  10.0% pthread_setcanceltype ??:0
    2763   5.3%  60.1%     2763   5.3% aggregator_expire /usr/src/carbon-c-relay/aggregator.c:403
    1577   3.0%  63.1%     1577   3.0% aggregator_expire /usr/src/carbon-c-relay/aggregator.c:402
    1553   3.0%  66.1%     1573   3.0% __h_errno ??:0
    1073   2.1%  68.1%     1073   2.1% aggregator_expire /usr/src/carbon-c-relay/aggregator.c:399
     970   1.9%  70.0%      970   1.9% dispatch_connection /usr/src/carbon-c-relay/dispatcher.c:387
     790   1.5%  71.5%      790   1.5% dispatch_connection /usr/src/carbon-c-relay/dispatcher.c:438
     671   1.3%  72.8%      671   1.3% regexec ??:0

I can try to get a gdb backtrace for that case, but the profile also looks legitimate to me.

Also, the amount of received metrics hasn't changed, but the number of destinations and sources has.

grobian commented 8 years ago

More destinations mean more work being done by the dispatchers (sending to more targets). The aggregator.c frames are the expiry thread doing expiration on collected buckets; what's the CPU usage of the relay while it does this?

Civil commented 8 years ago

Around 150% when it's not stuck, and 400% when it is stuck.

Also, sometimes a restart fixes the problem.

And anyway, it should send the data, or at least some of it. If it's not sending because of load, then there is a deadlock somewhere, and that's a bug.

grobian commented 8 years ago

So the expiry thread could possibly be CPU-bound.

Civil commented 8 years ago

In that case I should actually see some stats from the aggregator showing that it expired everything, for example. And why does a restart help?

grobian commented 8 years ago

Things will actually pile up because it can't expire in time, and once that happens, metrics may be dropped at the front door because there are not enough buckets to hold the future.

A restart helps in such a scenario because it clears all aggregations.

I've seen this happen with dynamic aggregates where the invocations slowly increase; at the point where expiry no longer makes it in time to release buckets, things go haywire.
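
To illustrate the mechanism (this is not the actual bucket layout in aggregator.c, just a conceptual sketch): each aggregation holds a fixed ring of time buckets, and a bucket can only be reused once expiry has flushed it, so when expiry lags, a new metric finds its bucket still occupied by an old window and gets dropped.

    /* Conceptual sketch only: a fixed ring of time buckets per aggregation.
     * A put for time "now" maps to one bucket; if that bucket still holds an
     * older, not-yet-expired window, the metric is dropped at the front door. */
    #include <stdbool.h>
    #include <time.h>

    #define NBUCKETS 8             /* how much "future" the aggregation can hold */
    #define INTERVAL 60            /* aggregation interval in seconds */

    struct bucket { time_t start; double sum; int cnt; bool in_use; };
    static struct bucket ring[NBUCKETS];

    static bool put(time_t now, double val)    /* false means: dropped */
    {
        time_t start = now - (now % INTERVAL);
        struct bucket *b = &ring[(start / INTERVAL) % NBUCKETS];

        if (b->in_use && b->start != start)
            return false;          /* old window not expired (flushed) yet: drop */

        b->start = start;          /* claim, or keep filling, this bucket */
        b->in_use = true;
        b->sum += val;
        b->cnt++;
        return true;
    }

    int main(void)
    {
        time_t now = time(NULL);
        for (int i = 0; i < NBUCKETS; i++)     /* fill every bucket ... */
            put(now + (time_t)i * INTERVAL, 1.0);
        /* ... and with expiry never running, the next window is rejected */
        return put(now + (time_t)NBUCKETS * INTERVAL, 1.0) ? 0 : 1;   /* exits 1 */
    }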

Civil commented 8 years ago

Yeah, that makes sense. I'll try to shift some load off it then.

grobian commented 8 years ago

That's what I ended up doing, yeah.

Civil commented 8 years ago

Well, it seems to be fixed. Except that the TSAN errors should still be addressed, because all of them can potentially produce corrupted data in the aggregator.

If you want, I can open a separate bug about the TSAN errors and the ASAN overflow during exit.

grobian commented 8 years ago

please do