FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/
Other
3.29k stars 1.25k forks source link

Thread Sanitizer data race in rcu code #16244

Closed donaldsharp closed 2 months ago

donaldsharp commented 3 months ago

Description

WARNING: ThreadSanitizer: data race (pid=84571)
  Atomic read of size 4 at 0x7b200000bd18 by thread T1:
    #0 seqlock_timedwait lib/seqlock.c:204 (libfrr.so.0+0x515d55)
    #1 rcu_main lib/frrcu.c:437 (libfrr.so.0+0x349397)

  Previous write of size 8 at 0x7b200000bd18 by main thread:
    [failed to restore the stack]

  Location is heap block of size 128 at 0x7b200000bd00 allocated by main thread:
    #0 calloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:667 (libtsan.so.2+0x3fdd2)
    #1 qcalloc lib/memory.c:105 (libfrr.so.0+0x3f5980)
    #2 rcu_thread_new lib/frrcu.c:160 (libfrr.so.0+0x347b8f)
    #3 rcu_thread_prepare lib/frrcu.c:187 (libfrr.so.0+0x347e35)
    #4 frr_pthread_run lib/frr_pthread.c:196 (libfrr.so.0+0x35308a)
    #5 bgp_pthreads_run bgpd/bgpd.c:8500 (bgpd+0x9dd5c3)
    #6 main bgpd/bgp_main.c:550 (bgpd+0x49ba2d)

  Thread T1 'RCU sweeper' (tid=84574, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x63a59)
    #1 rcu_start lib/frrcu.c:354 (libfrr.so.0+0x348afb)
    #2 rcu_thread_prepare lib/frrcu.c:182 (libfrr.so.0+0x347d97)
    #3 frr_pthread_run lib/frr_pthread.c:196 (libfrr.so.0+0x35308a)
    #4 bgp_pthreads_run bgpd/bgpd.c:8500 (bgpd+0x9dd5c3)
    #5 main bgpd/bgp_main.c:550 (bgpd+0x49ba2d)

SUMMARY: ThreadSanitizer: data race lib/seqlock.c:204 in seqlock_timedwait
==================
==================
WARNING: ThreadSanitizer: data race (pid=84571)
  Write of size 8 at 0x7b1800000020 by thread T1:
    #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:706 (libtsan.so.2+0x4a3fe)
    #1 qfree lib/memory.c:130 (libfrr.so.0+0x3f5b99)
    #2 rcu_do lib/frrcu.c:378 (libfrr.so.0+0x348eff)
    #3 rcu_main lib/frrcu.c:448 (libfrr.so.0+0x349581)

  Previous atomic write of size 8 at 0x7b1800000020 by main thread:
    #0 atomlist_add_tail lib/atomlist.c:76 (libfrr.so.0+0x29caf1)
    #1 rcu_heads_add_tail lib/frrcu.c:49 (libfrr.so.0+0x347435)
    #2 rcu_bump lib/frrcu.c:255 (libfrr.so.0+0x348233)
    #3 rcu_bump_maybe lib/frrcu.c:295 (libfrr.so.0+0x348383)
    #4 rcu_read_unlock lib/frrcu.c:320 (libfrr.so.0+0x348795)
    #5 fd_poll lib/event.c:894 (libfrr.so.0+0x58d1c3)
    #6 event_fetch lib/event.c:1855 (libfrr.so.0+0x59759a)
    #7 frr_run lib/libfrr.c:1216 (libfrr.so.0+0x3b36be)
    #8 main bgpd/bgp_main.c:551 (bgpd+0x49baa2)

  Thread T1 'RCU sweeper' (tid=84574, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x63a59)
    #1 rcu_start lib/frrcu.c:354 (libfrr.so.0+0x348afb)
    #2 rcu_thread_prepare lib/frrcu.c:182 (libfrr.so.0+0x347d97)
    #3 frr_pthread_run lib/frr_pthread.c:196 (libfrr.so.0+0x35308a)
    #4 bgp_pthreads_run bgpd/bgpd.c:8500 (bgpd+0x9dd5c3)
    #5 main bgpd/bgp_main.c:550 (bgpd+0x49ba2d)

SUMMARY: ThreadSanitizer: data race lib/memory.c:130 in qfree
==================
==================
WARNING: ThreadSanitizer: data race (pid=84571)
  Write of size 8 at 0x7b1800000040 by thread T1:
    #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:706 (libtsan.so.2+0x4a3fe)
    #1 qfree lib/memory.c:130 (libfrr.so.0+0x3f5b99)
    #2 rcu_do lib/frrcu.c:378 (libfrr.so.0+0x348eff)
    #3 rcu_main lib/frrcu.c:448 (libfrr.so.0+0x349581)

  Previous atomic write of size 8 at 0x7b1800000040 by main thread (mutexes: write M21):
    #0 atomlist_add_tail lib/atomlist.c:76 (libfrr.so.0+0x29caf1)
    #1 rcu_heads_add_tail lib/frrcu.c:49 (libfrr.so.0+0x347435)
    #2 rcu_enqueue lib/frrcu.c:515 (libfrr.so.0+0x3499dd)
    #3 zlog_file_target_free lib/zlog_targets.c:168 (libfrr.so.0+0x651372)
    #4 zlog_file_cycle lib/zlog_targets.c:229 (libfrr.so.0+0x651f25)
    #5 zlog_file_set_other lib/zlog_targets.c:237 (libfrr.so.0+0x651ffe)
    #6 clear_log_cmdline lib/log_vty.c:451 (libfrr.so.0+0x3ee267)
    #7 cmd_execute_command_real lib/command.c:1002 (libfrr.so.0+0x2b8480)
    #8 cmd_execute_command lib/command.c:1061 (libfrr.so.0+0x2b8987)
    #9 cmd_execute lib/command.c:1227 (libfrr.so.0+0x2b98de)
    #10 vty_command lib/vty.c:616 (libfrr.so.0+0x5b3580)
    #11 vty_execute lib/vty.c:1379 (libfrr.so.0+0x5bc8d9)
    #12 vtysh_read lib/vty.c:2374 (libfrr.so.0+0x5c554c)
    #13 event_call lib/event.c:2013 (libfrr.so.0+0x5984e1)
    #14 frr_run lib/libfrr.c:1217 (libfrr.so.0+0x3b36a5)
    #15 main bgpd/bgp_main.c:551 (bgpd+0x49baa2)

  Mutex M21 (0x7fbe9f9ee468) created at:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4324 (libtsan.so.2+0x59bbf)
    #1 _frr_mtx_lock lib/frr_pthread.h:255 (libfrr.so.0+0x64ea33)
    #2 zlog_file_set_filename lib/zlog_targets.c:243 (libfrr.so.0+0x6520c7)
    #3 set_log_file lib/log_vty.c:371 (libfrr.so.0+0x3edd5a)
    #4 command_setup_early_logging lib/log_vty.c:419 (libfrr.so.0+0x3ee01a)
    #5 frr_init lib/libfrr.c:745 (libfrr.so.0+0x3b00ac)
    #6 main bgpd/bgp_main.c:513 (bgpd+0x49b382)

  Thread T1 'RCU sweeper' (tid=84574, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x63a59)
    #1 rcu_start lib/frrcu.c:354 (libfrr.so.0+0x348afb)
    #2 rcu_thread_prepare lib/frrcu.c:182 (libfrr.so.0+0x347d97)
    #3 frr_pthread_run lib/frr_pthread.c:196 (libfrr.so.0+0x35308a)
    #4 bgp_pthreads_run bgpd/bgpd.c:8500 (bgpd+0x9dd5c3)
    #5 main bgpd/bgp_main.c:550 (bgpd+0x49ba2d)

SUMMARY: ThreadSanitizer: data race lib/memory.c:130 in qfree

Version

master

How to reproduce

I am running configure with --enable-thread-sanitizer`` and--enable-undefined-sanitizer`

Expected behavior

FRR should not trigger any issues

Actual behavior

FRR triggers issues

Additional context

No response

Checklist

eqvinox commented 3 months ago

First one:

WARNING: ThreadSanitizer: data race (pid=84571)
  Atomic read of size 4 at 0x7b200000bd18 by thread T1:
    #0 seqlock_timedwait lib/seqlock.c:204 (libfrr.so.0+0x515d55)
    #1 rcu_main lib/frrcu.c:437 (libfrr.so.0+0x349397)

  Previous write of size 8 at 0x7b200000bd18 by main thread:
    [failed to restore the stack]

I have no idea what to do here. I'm willing to bet that the access by the main thread is in fact the initialization :disappointed:

eqvinox commented 3 months ago

Second one & third one seem to be identical:

WARNING: ThreadSanitizer: data race (pid=84571)
  Write of size 8 at 0x7b1800000020 by thread T1:
    #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:706 (libtsan.so.2+0x4a3fe)
    #1 qfree lib/memory.c:130 (libfrr.so.0+0x3f5b99)
    #2 rcu_do lib/frrcu.c:378 (libfrr.so.0+0x348eff)
    #3 rcu_main lib/frrcu.c:448 (libfrr.so.0+0x349581)

  Previous atomic write of size 8 at 0x7b1800000020 by main thread:
    #0 atomlist_add_tail lib/atomlist.c:76 (libfrr.so.0+0x29caf1)
    #1 rcu_heads_add_tail lib/frrcu.c:49 (libfrr.so.0+0x347435)
    #2 rcu_bump lib/frrcu.c:255 (libfrr.so.0+0x348233)
    #3 rcu_bump_maybe lib/frrcu.c:295 (libfrr.so.0+0x348383)
    #4 rcu_read_unlock lib/frrcu.c:320 (libfrr.so.0+0x348795)
    #5 fd_poll lib/event.c:894 (libfrr.so.0+0x58d1c3)
    #6 event_fetch lib/event.c:1855 (libfrr.so.0+0x59759a)
    #7 frr_run lib/libfrr.c:1216 (libfrr.so.0+0x3b36be)
    #8 main bgpd/bgp_main.c:551 (bgpd+0x49baa2)

Er… the data structure is being freed and that's detected as a race? I'm randomly guessing but maybe it fails to understand that those bytes are no longer referenced from multiple threads at that point?