sctplab / usrsctp

A portable SCTP userland stack
BSD 3-Clause "New" or "Revised" License
669 stars 279 forks source link

Aborted with sodealloc(): so_count -58769387 -- suspected use-after-free in sctp_timeout_handler #676

Open JonathanLennox opened 1 year ago

JonathanLennox commented 1 year ago

This is the same environment as https://github.com/sctplab/usrsctp/issues/673, and I suspect the same root cause, but a different crash - perhaps this one will give more information.

I am getting an assert crash in usrsctp with the assertion

sodealloc(): so_count -58769387

which certainly looks like the socket object has been freed and re-used by another allocation. In my logs I can see that usrsctp_close was called just before the crash.

The crash is in the sctp_timeout_handler, and the stack is

Thread 1 (Thread 0x7fc75affd640 (LWP 7273)):
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140494201935424) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140494201935424) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140494201935424, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fc88d788476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fc88d76e7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007fc760e089c5 in terminate_non_graceful () at ./user_environment.h:100
#6  0x00007fc760e094d5 in sodealloc (so=0x7fc7f4015cf0) at user_socket.c:230
#7  0x00007fc760e09ef4 in sofree (so=0x7fc7f4015cf0) at user_socket.c:302
#8  0x00007fc760eba401 in sctp_timeout_handler (t=0x7fc7c0009538) at netinet/sctputil.c:2218
#9  0x00007fc760e1600c in sctp_handle_tick (elapsed_ticks=10) at netinet/sctp_callout.c:172
#10 0x00007fc760e16258 in user_sctp_timer_iterate (arg=0x0) at netinet/sctp_callout.c:214
#11 0x00007fc88d7dab43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#12 0x00007fc88d86ca00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

The state inside sctp_timeout_handler is

#8  0x00007fc760eba401 in sctp_timeout_handler (t=0x7fc7c0009538) at netinet/sctputil.c:2218
2218            sorele(upcall_socket);
(gdb) info locals
tv = {tv_sec = 1683580305, tv_usec = 140499344030605}
inp = 0x7fc7c0022b80
stcb = 0x7fc7c0009460
net = 0x0
tmr = 0x7fc7c0009538
op_err = 0x7fc75affcdf0
upcall_socket = 0x7fc7f4015cf0
type = 3
i = 8
secret = 0
did_output = true
released_asoc_reference = true
__func__ = "sctp_timeout_handler"

I suspect it's not very useful because the memory has probably been re-used, but in case there's anything meaningful still lingering in later fields, the socket object is

(gdb) p *so
$10 = {so_count = -58769387, so_type = 7, so_options = 0, so_linger = 9048, so_state = -386, so_qstate = -1892342028, so_pcb = 0x0, 
  so_dom = 123, so_head = 0x0, so_incomp = {tqh_first = 0x0, tqh_last = 0x0}, so_comp = {tqh_first = 0x0, tqh_last = 0x0}, so_list = {
    tqe_next = 0x0, tqe_prev = 0x7fc7a41b4208}, so_qlen = 0, so_incqlen = 0, so_qlimit = 0, so_timeo = 0, timeo_cond = {__data = {{__wseq = 0, 
        __wseq32 = {__low = 0, __high = 0}}, {__g1_start = 0, __g1_start32 = {__low = 0, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, 
      __g1_orig_size = 0, __wrefs = 4, __g_signals = {0, 0}}, 
    __size = '\000' <repeats 36 times>, "\004\000\000\000\000\000\000\000\000\000\000", __align = 0}, so_error = 0, so_sigio = 0x0, 
  so_oobmark = 0, so_aiojobq = {tqh_first = 0x0, tqh_last = 0x7fc7f4015d98}, so_rcv = {sb_cond = {__data = {{__wseq = 0, __wseq32 = {__low = 0, 
            __high = 0}}, {__g1_start = 0, __g1_start32 = {__low = 0, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 0, 
        __wrefs = 4, __g_signals = {0, 0}}, __size = '\000' <repeats 36 times>, "\004\000\000\000\000\000\000\000\000\000\000", __align = 0}, 
    sb_mtx = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1, __spins = 0, __elision = 0, __list = {__prev = 0x0, 
          __next = 0x0}}, __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>, __align = 0}, sb_state = 0, 
    sb_mb = 0x0, sb_mbtail = 0x0, sb_lastrecord = 0x0, sb_sndptr = 0x0, sb_sndptroff = 0, sb_cc = 0, sb_hiwat = 131072, sb_mbcnt = 0, 
    sb_mbmax = 262144, sb_ctl = 0, sb_lowat = 1, sb_timeo = 0, sb_flags = 0}, so_snd = {sb_cond = {__data = {{__wseq = 0, __wseq32 = {
            __low = 0, __high = 0}}, {__g1_start = 0, __g1_start32 = {__low = 0, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, 
        __g1_orig_size = 0, __wrefs = 4, __g_signals = {0, 0}}, 
      __size = '\000' <repeats 36 times>, "\004\000\000\000\000\000\000\000\000\000\000", __align = 0}, sb_mtx = {__data = {__lock = 0, 
        __count = 0, __owner = 0, __nusers = 0, __kind = -1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>, __align = 0}, sb_state = 0, sb_mb = 0x0, 
    sb_mbtail = 0x0, sb_lastrecord = 0x0, sb_sndptr = 0x0, sb_sndptroff = 0, sb_cc = 0, sb_hiwat = 262144, sb_mbcnt = 0, sb_mbmax = 262144, 
    sb_ctl = 0, sb_lowat = 2048, sb_timeo = 0, sb_flags = 0}, so_upcall = 0x0, so_upcallarg = 0x0, so_cred = 0x0, so_label = 0x0, 
  so_peerlabel = 0x0, so_gencnt = 0, so_emuldata = 0x0, so_accf = 0x0}

In case it's useful, here are the dereferences of the local pointers referenced by sctp_timeout_handler:

(gdb) p *inp
$1 = {ip_inp = {inp = {inp_hash = {le_next = 0x0, le_prev = 0x0}, inp_list = {le_next = 0x0, le_prev = 0x0}, inp_ppcb = 0x0, inp_pcbinfo = 0x0, 
      inp_socket = 0x7fc7f4015cf0, inp_flow = 0, inp_flags = 0, inp_vflag = 128 '\200', inp_ip_ttl = 64 '@', inp_ip_p = 0 '\000', 
      inp_ip_minttl = 0 '\000', inp_ispare1 = 0, inp_pspare = {0x0, 0x0}, inp_inc = {inc_flags = 0 '\000', inc_len = 0 '\000', inc_pad = 0, 
        inc_ie = {ie_fport = 0, ie_lport = 34835, ie_dependfaddr = {ie46_foreign = {ia46_pad32 = {0, 0, 0}, ia46_addr4 = {s_addr = 0}}, 
            ie6_foreign = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 
                  0}}}}, ie_dependladdr = {ie46_local = {ia46_pad32 = {0, 0, 0}, ia46_addr4 = {s_addr = 0}}, ie6_local = {__in6_u = {
                __u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, 
      inp_label = 0x0, inp_sp = 0x0, inp_depend4 = {inp4_ip_tos = 0 '\000', inp4_options = 0x0, inp4_moptions = 0x0}, inp_depend6 = {
        inp6_options = 0x0, inp6_outputopts = 0x0, inp6_icmp6filt = 0x0, inp6_cksum = 0, inp6_hops = 0}, inp_portlist = {le_next = 0x0, 
        le_prev = 0x0}, inp_phd = 0x0, inp_mtx = {dummy = 0}}, 
    align = '\000' <repeats 48 times>, "\360\\\001\364\307\177\000\000\000\000\000\000\000\000\000\000\200@", '\000' <repeats 28 times>, "\023\210", '\000' <repeats 159 times>}, read_queue = {tqh_first = 0x0, tqh_last = 0x7fc7c0022c80}, sctp_list = {le_next = 0x7fc7b0022670, 
    le_prev = 0x7fc8340b6380}, sctp_hash = {le_next = 0x7fc7b0022670, le_prev = 0x7fc8340b6390}, laddr_count = 1, sctp_addr_list = {
    lh_first = 0x7fc7c016c8e0}, next_addr_touse = 0x0, sctp_socket = 0x0, sctp_features = 2215123240, sctp_flags = 2429026322, 
  sctp_mobility_features = 0, sctp_ep = {time_of_secret_change = 1683577699, secret_key = {{947375579, 2783592897, 2451956313, 2823807663, 
        1780161303, 3981685995, 1671705323, 3230344173}, {0, 0, 0, 0, 0, 0, 0, 0}}, size_of_a_cookie = 196, sctp_timeoutticks = {1000, 200, 
      1000, 30000, 600000, 0, 3600000}, sctp_minrto = 1000, sctp_maxrto = 60000, initial_rto = 1000, initial_init_rto_max = 60000, 
    sctp_sack_freq = 2, sctp_sws_sender = 1420, sctp_sws_receiver = 3000, sctp_default_cc_module = 0, sctp_default_ss_module = 0, 
    shared_keys = {lh_first = 0x7fc7c0030180}, local_auth_chunks = 0x7fc7c016c4d0, local_hmacs = 0x7fc7c0049f30, default_keyid = 0, 
    default_mtu = 0, max_init_times = 8, max_send_times = 10, def_net_failure = 5, def_net_pf_threshold = 65535, pre_open_stream_count = 10, 
    max_open_streams_intome = 2048, random_counter = 20, random_numbers = "\261\370\276\002E\244ԿD\256N\261\210\251E8Ux\216\256", 
    random_store = "\203d\230\t\006\274\262ֹ⦰)\221\021\270yiަ", signature_change = {timer = {tqe = {
          tqe_next = 0x7fc760f20e48 <system_base_info+712>, tqe_prev = 0x7fc8580cfa30}, c_time = 28781530, c_arg = 0x7fc7c0022dc0, 
        c_func = 0x7fc760eb7c9f <sctp_timeout_handler>, c_flags = 6}, type = 14, ep = 0x7fc7c0022b80, tcb = 0x0, net = 0x0, 
      self = 0x7fc7c0022dc0, ticks = 28781510, stopped_from = 0}, def_cookie_life = 60000, auto_close_time = 0, initial_sequence_debug = 0, 
    adaptation_layer_indicator = 0, adaptation_layer_indicator_provided = 0 '\000', store_at = 16, max_burst = 4, fr_max_burst = 4, 
    default_flowlabel = 0, default_dscp = 0 '\000', current_secret_number = 0 '\000', last_secret_number = 0 '\000', port = 0}, 
  sctp_tcbhash = 0x0, sctp_hashmark = 255, sctp_asoc_list = {lh_first = 0x0}, sctp_frag_point = 0, partial_delivery_point = 65536, 
  sctp_context = 0, max_cwnd = 0, local_strreset_support = 1 '\001', sctp_cmt_on_off = 0, ecn_supported = 0 '\000', 
  prsctp_supported = 1 '\001', auth_supported = 1 '\001', idata_supported = 0 '\000', asconf_supported = 1 '\001', 
  reconfig_supported = 1 '\001', nrsack_supported = 0 '\000', pktdrop_supported = 0 '\000', def_send = {sinfo_stream = 0, sinfo_ssn = 0, 
    sinfo_flags = 0, sinfo_ppid = 0, sinfo_context = 0, sinfo_timetolive = 0, sinfo_tsn = 0, sinfo_cumtsn = 0, sinfo_assoc_id = 0, 
    sinfo_keynumber = 0, sinfo_keynumber_valid = 0}, pkt = 0x0, pkt_last = 0x0, control = 0x0, inp_mtx = {__data = {__lock = 0, __count = 0, 
      __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 22 times>, __align = 0}, inp_create_mtx = {__data = {__lock = 0, __count = 0, 
      __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 22 times>, __align = 0}, inp_rdata_mtx = {__data = {__lock = 0, __count = 0, 
      __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 22 times>, __align = 0}, refcount = 2, def_vrf_id = 0, fibnum = 0, 
  total_sends = 1197, total_recvs = 303, last_abort_code = 536870919, total_nospaces = 0, sctp_asocidhash = 0x7fc7c0031090, 
  hashasocidmark = 31, sctp_associd_counter = 1, ulp_info = 0x907, recv_callback = 0x7fc760e085d0 <onSctpInboundPacket>, send_sb_threshold = 0, 
  send_callback = 0x0}
(gdb) p *stcb
$2 = {sctp_socket = 0x7fc7c016ced0, sctp_ep = 0x7fc7c0000090, sctp_tcbhash = {le_next = 0x0, le_prev = 0x0}, sctp_tcblist = {le_next = 0x0, 
    le_prev = 0x7fc7c0022e60}, sctp_tcbasocidhash = {le_next = 0x0, le_prev = 0x7fc7c00310a8}, sctp_asocs = {le_next = 0x0, 
    le_prev = 0x7fc8204c6af8}, block_entry = 0x0, asoc = {state = 0, asconf_queue = {tqh_first = 0x0, tqh_last = 0x7fc7c00094c0}, 
    time_entered = {tv_sec = 1683577700, tv_usec = 120986}, time_last_rcvd = {tv_sec = 1683580417, tv_usec = 261107}, time_last_sent = {
      tv_sec = 1683580416, tv_usec = 344616}, time_last_sat_advance = {tv_sec = 0, tv_usec = 0}, def_send = {sinfo_stream = 0, sinfo_ssn = 0, 
      sinfo_flags = 0, sinfo_ppid = 0, sinfo_context = 0, sinfo_timetolive = 0, sinfo_tsn = 0, sinfo_cumtsn = 0, sinfo_assoc_id = 0, 
      sinfo_keynumber = 0, sinfo_keynumber_valid = 0}, dack_timer = {timer = {tqe = {tqe_next = 0x7fc7f80048e0, tqe_prev = 0x7fc83804d120}, 
        c_time = 28781510, c_arg = 0x7fc7c0009538, c_func = 0x7fc760eb7c9f <sctp_timeout_handler>, c_flags = 0}, type = 3, ep = 0x7fc7c0022b80, 
      tcb = 0x7fc7c0009460, net = 0x0, self = 0x0, ticks = 28781310, stopped_from = 1610612754}, asconf_timer = {timer = {tqe = {
          tqe_next = 0x0, tqe_prev = 0x0}, c_time = 0, c_arg = 0x0, c_func = 0x0, c_flags = 0}, type = 0, ep = 0x0, tcb = 0x0, net = 0x0, 
      self = 0x0, ticks = 0, stopped_from = 1610612757}, strreset_timer = {timer = {tqe = {tqe_next = 0x0, tqe_prev = 0x0}, c_time = 0, 
        c_arg = 0x0, c_func = 0x0, c_flags = 0}, type = 0, ep = 0x0, tcb = 0x0, net = 0x0, self = 0x0, ticks = 0, stopped_from = 1610612756}, 
    shut_guard_timer = {timer = {tqe = {tqe_next = 0x0, tqe_prev = 0x0}, c_time = 0, c_arg = 0x0, c_func = 0x0, c_flags = 0}, type = 0, 
      ep = 0x0, tcb = 0x0, net = 0x0, self = 0x0, ticks = 0, stopped_from = 1610612759}, autoclose_timer = {timer = {tqe = {tqe_next = 0x0, 
          tqe_prev = 0x0}, c_time = 0, c_arg = 0x0, c_func = 0x0, c_flags = 0}, type = 0, ep = 0x0, tcb = 0x0, net = 0x0, self = 0x0, 
      ticks = 0, stopped_from = 1610612758}, delete_prim_timer = {timer = {tqe = {tqe_next = 0x0, tqe_prev = 0x0}, c_time = 0, c_arg = 0x0, 
        c_func = 0x0, c_flags = 0}, type = 0, ep = 0x0, tcb = 0x0, net = 0x0, self = 0x0, ticks = 0, stopped_from = 1610612760}, 
    sctp_restricted_addrs = {lh_first = 0x0}, asconf_addr_del_pending = 0x0, deleted_primary = 0x0, nets = {tqh_first = 0x0, 
      tqh_last = 0x7fc7c0009790}, free_chunks = {tqh_first = 0x0, tqh_last = 0x7fc7c00097a0}, control_send_queue = {tqh_first = 0x0, 
      tqh_last = 0x7fc7c00097b0}, asconf_send_queue = {tqh_first = 0x0, tqh_last = 0x7fc7c00097c0}, sent_queue = {tqh_first = 0x0, 
      tqh_last = 0x7fc7c00097d0}, send_queue = {tqh_first = 0x0, tqh_last = 0x7fc7c00097e0}, ss_data = {locked_on_sending = 0x0, 
      last_out_stream = 0x0, out = {wheel = {tqh_first = 0x0, tqh_last = 0x7fc7c0009800}, list = {tqh_first = 0x0, 
          tqh_last = 0x7fc7c0009800}}}, stcb_starting_point_for_iterator = 0x0, asconf_ack_sent = {tqh_first = 0x0, tqh_last = 0x7fc7c0009818}, 
    str_reset = 0x0, last_used_address = 0x7fc7c016c8e0, strmin = 0x0, strmout = 0x0, mapping_array = 0x0, 
    primary_destination = 0x7fc7c0139360, alternate = 0x0, last_net_cmt_send_started = 0x0, last_data_chunk_from = 0x7fc7c0139360, 
    last_control_chunk_from = 0x7fc7c0139360, resetHead = {tqh_first = 0x0, tqh_last = 0x7fc7c0009878}, pending_reply_queue = {tqh_first = 0x0, 
      tqh_last = 0x7fc7c0009888}, cc_functions = {sctp_set_initial_cc_param = 0x7fc760ee5785 <sctp_set_initial_cc_param>, 
      sctp_cwnd_update_after_sack = 0x7fc760ee7919 <sctp_cwnd_update_after_sack>, 
      sctp_cwnd_update_exit_pf = 0x7fc760ee6f8f <sctp_cwnd_update_exit_pf_common>, 
      sctp_cwnd_update_after_fr = 0x7fc760ee5912 <sctp_cwnd_update_after_fr>, 
      sctp_cwnd_update_after_timeout = 0x7fc760ee7010 <sctp_cwnd_update_after_timeout>, 
      sctp_cwnd_update_after_ecn_echo = 0x7fc760ee795c <sctp_cwnd_update_after_ecn_echo>, 
      sctp_cwnd_update_after_packet_dropped = 0x7fc760ee752d <sctp_cwnd_update_after_packet_dropped>, 
      sctp_cwnd_update_after_output = 0x7fc760ee7847 <sctp_cwnd_update_after_output>, sctp_cwnd_update_packet_transmitted = 0x0, 
      sctp_cwnd_update_tsn_acknowledged = 0x0, sctp_cwnd_new_transmission_begins = 0x0, sctp_cwnd_prepare_net_for_sack = 0x0, 
      sctp_cwnd_socket_option = 0x0, sctp_rtt_calculated = 0x0}, congestion_control_module = 0, ss_functions = {
      sctp_ss_init = 0x7fc760f00fef <sctp_ss_default_init>, sctp_ss_clear = 0x7fc760f0116c <sctp_ss_default_clear>, 
      sctp_ss_init_stream = 0x7fc760f0137e <sctp_ss_default_init_stream>, sctp_ss_add_to_stream = 0x7fc760f014a8 <sctp_ss_default_add>, 
      sctp_ss_is_empty = 0x7fc760f015ef <sctp_ss_default_is_empty>, sctp_ss_remove_from_stream = 0x7fc760f016d6 <sctp_ss_default_remove>, 
      sctp_ss_select_stream = 0x7fc760f018c6 <sctp_ss_default_select>, sctp_ss_scheduled = 0x7fc760f01cc0 <sctp_ss_default_scheduled>, 
      sctp_ss_packet_done = 0x7fc760f01f67 <sctp_ss_default_packet_done>, sctp_ss_get_value = 0x7fc760f02042 <sctp_ss_default_get_value>, 
      sctp_ss_set_value = 0x7fc760f02125 <sctp_ss_default_set_value>, 
      sctp_ss_is_user_msgs_incomplete = 0x7fc760f0220a <sctp_ss_default_is_user_msgs_incomplete>}, stream_scheduling_module = 0, vrf_id = 0, 
    cookie_preserve_req = 0, asconf_seq_out = 3650393057, asconf_seq_out_acked = 3650393056, asconf_seq_in = 1984230728, 
    str_reset_seq_out = 3650393057, str_reset_seq_in = 1984230729, my_vtag = 1680007203, peer_vtag = 1901069742, my_vtag_nonce = 3525829521, 
    peer_vtag_nonce = 1465789626, assoc_id = 3, smallest_mtu = 1280, this_sack_highest_gap = 3650394253, last_acked_seq = 3650394253, 
    sending_seq = 3650394254, init_seq_number = 3650393057, advanced_peer_ack_point = 3650394253, cumulative_tsn = 1984231031, 
    mapping_array_base_tsn = 1984231025, highest_tsn_inside_map = 1984231024, nr_mapping_array = 0x0, highest_tsn_inside_nr_map = 1984231031, 
    fast_recovery_tsn = 0, sat_t3_recovery_tsn = 3650394200, tsn_last_delivered = 1984230728, control_pdapi = 0x0, 
--Type <RET> for more, q to quit, c to continue without paging--
    tsn_of_pdapi_last_delivered = 0, pdapi_ppid = 0, context = 0, last_reset_action = {0, 0}, last_sending_seq = {0, 0}, last_base_tsnsent = {
      0, 0}, peers_rwnd = 4718592, my_rwnd = 131072, my_last_reported_rwnd = 131072, sctp_frag_point = 0, total_output_queue_size = 0, 
    sb_cc = 0, sb_send_resv = 0, my_rwnd_control_len = 0, default_flowlabel = 0, pr_sctp_cnt = 0, ctrl_queue_cnt = 0, stream_queue_cnt = 0, 
    send_queue_cnt = 0, sent_queue_cnt = 0, sent_queue_cnt_removeable = 0, sent_queue_retran_cnt = 0, size_on_reasm_queue = 0, 
    cnt_on_reasm_queue = 0, fwd_tsn_cnt = 0, total_flight = 0, total_flight_count = 0, numnets = 1, overall_error_count = 0, cnt_msg_on_sb = 0, 
    size_on_all_streams = 0, cnt_on_all_streams = 0, heart_beat_delay = 30000, sctp_autoclose_ticks = 0, pre_open_streams = 10, 
    max_inbound_streams = 2048, cookie_life = 60000, delayed_ack = 200, old_delayed_ack = 0, sack_freq = 2, data_pkts_seen = 0, numduptsns = 0, 
    dup_tsns = {1984230998, 0 <repeats 19 times>}, initial_init_rto_max = 60000, initial_rto = 1000, minrto = 1000, maxrto = 60000, 
    local_auth_chunks = 0x7fc7c00416e0, peer_auth_chunks = 0x0, local_hmacs = 0x7fc7c00d7310, peer_hmacs = 0x0, shared_keys = {lh_first = 0x0}, 
    authinfo = {random = 0x7fc7f4033fa0, random_len = 32, peer_random = 0x7fc7c002fa50, assoc_key = 0x0, recv_key = 0x0, active_keyid = 0, 
      assoc_keyid = 0, recv_keyid = 0}, refcnt = 0, chunks_on_out_queue = 0, peers_adaptation = 0, default_mtu = 0, peer_hmac_id = 0, 
    stale_cookie_count = 0, str_of_pdapi = 0, ssn_of_pdapi = 0, streamincnt = 0, streamoutcnt = 0, strm_realoutsize = 0, 
    strm_pending_add_size = 0, max_init_times = 8, max_send_times = 10, def_net_failure = 5, def_net_pf_threshold = 65535, 
    mapping_array_size = 16, last_strm_seq_delivered = 0, last_strm_no_delivered = 0, last_revoke_count = 0, num_send_timers_up = 0, 
    stream_locked_on = 0, ecn_echo_cnt_onq = 0, free_chunk_cnt = 0, stream_locked = 0 '\000', authenticated = 0 '\000', send_sack = 0 '\000', 
    max_burst = 4, fr_max_burst = 4, sat_network = 0 '\000', sat_network_lockout = 1 '\001', burst_limit_applied = 0 '\000', 
    hb_random_values = "\000\000\000", fragmented_delivery_inprogress = 0 '\000', fragment_flags = 0 '\000', last_flags_delivered = 0 '\000', 
    hb_ect_randombit = 0 '\000', hb_random_idx = 4 '\004', default_dscp = 0 '\000', asconf_del_pending = 0 '\000', trigger_reset = 0 '\000', 
    ecn_supported = 0 '\000', prsctp_supported = 1 '\001', auth_supported = 0 '\000', asconf_supported = 0 '\000', 
    reconfig_supported = 1 '\001', nrsack_supported = 0 '\000', pktdrop_supported = 0 '\000', idata_supported = 0 '\000', 
    peer_req_out = 0 '\000', local_strreset_support = 1 '\001', peer_supports_nat = 0 '\000', scope = {ipv4_addr_legal = 0 '\000', 
      ipv6_addr_legal = 0 '\000', conn_addr_legal = 1 '\001', loopback_scope = 0 '\000', ipv4_local_scope = 0 '\000', local_scope = 0 '\000', 
      site_scope = 0 '\000'}, used_alt_asconfack = 0 '\000', fast_retran_loss_recovery = 0 '\000', sat_t3_loss_recovery = 1 '\001', 
    dropped_special_cnt = 0 '\000', seen_a_sack_this_pkt = 0 '\000', stream_reset_outstanding = 0 '\000', 
    stream_reset_out_is_outstanding = 0 '\000', delayed_connection = 0 '\000', ifp_had_enobuf = 0 '\000', saw_sack_with_frags = 0 '\000', 
    saw_sack_with_nr_frags = 0 '\000', in_asocid_hash = 1 '\001', assoc_up_sent = 1 '\001', adaptation_needed = 0 '\000', 
    adaptation_sent = 0 '\000', cmt_dac_pkts_rcvd = 50 '2', sctp_cmt_on_off = 0 '\000', iam_blocking = 0 '\000', 
    cookie_how = "\000\000\000\000\000\000\000", sctp_cmt_pf = 0 '\000', use_precise_time = 0 '\000', sctp_features = 2215123240, max_cwnd = 0, 
    port = 0, marked_retrans = 14, timoinit = 0, timodata = 14, timosack = 35, timoshutdown = 0, timoheartbeat = 86, timocookie = 0, 
    timoshutdownack = 0, start_time = {tv_sec = 1683577700, tv_usec = 120897}, discontinuity_time = {tv_sec = 1683577700, tv_usec = 120897}, 
    abandoned_unsent = {0, 0, 0, 0}, abandoned_sent = {0, 0, 0, 0}}, freed_by_sorcv_sincelast = 0, total_sends = 1197, total_recvs = 303, 
  freed_from_where = 536870920, rport = 34835, resv = 0, tcb_mtx = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1, 
      __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>, __align = 0}}
(gdb) p *tmr
$3 = {timer = {tqe = {tqe_next = 0x7fc7f80048e0, tqe_prev = 0x7fc83804d120}, c_time = 28781510, c_arg = 0x7fc7c0009538, 
    c_func = 0x7fc760eb7c9f <sctp_timeout_handler>, c_flags = 0}, type = 3, ep = 0x7fc7c0022b80, tcb = 0x7fc7c0009460, net = 0x0, self = 0x0, 
  ticks = 28781310, stopped_from = 1610612754}
(gdb) p *op_err
$4 = {m_hdr = {mh_next = 0x7fc760eb7c9f <sctp_timeout_handler>, mh_nextpkt = 0x7fc7c0009538, mh_data = 0x7fc75affce50 "", mh_len = 1625383512, 
    mh_flags = 32711, mh_type = 0, pad = "\000\000\000\000\000"}, M_dat = {MH = {MH_pkthdr = {rcvif = 0x0, header = 0x0, len = 10000000, 
        csum_flags = 0, csum_data = 0, tso_segsz = 0, ether_vtag = 0, tags = {slh_first = 0x989680}}, MH_dat = {MH_ext = {ext_buf = 0x0, 
          ext_free = 0xbe77eb1f10b8bf00, ext_args = 0x0, ext_size = 2373823299, ref_cnt = 0x0, ext_type = 1526715968}, 
        MH_databuf = "\000\000\000\000\000\000\000\000\000\277\270\020\037\353w\276\000\000\000\000\000\000\000\000C\253}\215\310\177\000\000\000\000\000\000\000\000\000\000@\326\377Z\307\177\000\000\320\030>U\310\177\000\000\203c\v\255\250\232f\267@\326\377Z\307\177\000\000\001\000\000\000\000\000\000\000P\250}\215\310\177\000\000\060\032>U\310\177\000\000\203c\313\061W/\350H\203c\211\377S\200\367H", '\000' <repeats 41 times>, "\277\270\020\037\353w\276\000\000\000\000\000\000\000\000\320\030>U\310\177\000"}}, 
    M_databuf = '\000' <repeats 16 times>, "\200\226\230", '\000' <repeats 13 times>, "\200\226\230", '\000' <repeats 14 times>, "\277\270\020\037\353w\276\000\000\000\000\000\000\000\000C\253}\215\310\177\000\000\000\000\000\000\000\000\000\000@\326\377Z\307\177\000\000\320\030>U\310\177\000\000\203c\v\255\250\232f\267@\326\377Z\307\177\000\000\001\000\000\000\000\000\000\000P\250}\215\310\177\000\000\060\032>U\310\177\000\000\203c\313\061W/\350H\203c\211\377S\200\367H", '\000' <repeats 41 times>, "\277\270\020\037\353w\276"...}}
(gdb) p *stcb->sctp_ep
$5 = {ip_inp = {inp = {inp_hash = {le_next = 0x7fc7c017bd70, le_prev = 0x7fc7c00168c0}, inp_list = {le_next = 0x7fc7c0009450, 
        le_prev = 0x7fc7c000b910}, inp_ppcb = 0x7fc7c0086280, inp_pcbinfo = 0x7fc7c000f3a0, inp_socket = 0x7fc7c0043420, inp_flow = 3221417744, 
      inp_flags = 32711, inp_vflag = 192 '\300', inp_ip_ttl = 156 '\234', inp_ip_p = 13 '\r', inp_ip_minttl = 192 '\300', inp_ispare1 = 32711, 
      inp_pspare = {0x7fc7c00df1b0, 0x7fc7c002cd30}, inp_inc = {inc_flags = 48 '0', inc_len = 205 '\315', inc_pad = 49154, inc_ie = {
          ie_fport = 32711, ie_lport = 0, ie_dependfaddr = {ie46_foreign = {ia46_pad32 = {3221225696, 32711, 3221225696}, ia46_addr4 = {
                s_addr = 32711}}, ie6_foreign = {__in6_u = {__u6_addr8 = "\340\000\000\300\307\177\000\000\340\000\000\300\307\177\000", 
                __u6_addr16 = {224, 49152, 32711, 0, 224, 49152, 32711, 0}, __u6_addr32 = {3221225696, 32711, 3221225696, 32711}}}}, 
          ie_dependladdr = {ie46_local = {ia46_pad32 = {3221411984, 32711, 3221411984}, ia46_addr4 = {s_addr = 32711}}, ie6_local = {__in6_u = {
                __u6_addr8 = "\220\330\002\300\307\177\000\000\220\330\002\300\307\177\000", __u6_addr16 = {55440, 49154, 32711, 0, 55440, 
                  49154, 32711, 0}, __u6_addr32 = {3221411984, 32711, 3221411984, 32711}}}}}}, inp_label = 0x7fc7c0000100, 
      inp_sp = 0x7fc7c0000100, inp_depend4 = {inp4_ip_tos = 16 '\020', inp4_options = 0x7fc7c0000110, inp4_moptions = 0x7fc7c0000120}, 
      inp_depend6 = {inp6_options = 0x7fc7c0000120, inp6_outputopts = 0x7fc7c0000130, inp6_icmp6filt = 0x7fc7c0000130, 
        inp6_cksum = -1073741504, inp6_hops = 32711}, inp_portlist = {le_next = 0x7fc7c0000140, le_prev = 0x7fc7c0000150}, 
      inp_phd = 0x7fc7c0000150, inp_mtx = {dummy = -1073546000}}, 
    align = "p\275\027\300\307\177\000\000\300h\001\300\307\177\000\000P\224\000\300\307\177\000\000\020\271\000\300\307\177\000\000\200b\b\300\307\177\000\000\240\363\000\300\307\177\000\000 4\004\300\307\177\000\000\020\357\002\300\307\177\000\000\300\234\r\300\307\177\000\000\260\361\r\300\307\177\000\000\060\315\002\300\307\177\000\000\060\315\002\300\307\177\000\000\340\000\000\300\307\177\000\000\340\000\000\300\307\177\000\000\220\330\002\300\307\177\000\000\220\330\002\300\307\177\000\000\000\001\000\300\307\177\000\000\000\001\000\300\307\177\000\000\020\001\000\300\307\177\000\000\020\001\000\300\307\177\000\000 \001\000\300\307\177\000\000 \001\000\300\307\177\000\000\060\001\000\300\307\177\000\000\060\001\000\300\307\177\000\000@\001\000\300\307\177\000\000"...}, read_queue = {tqh_first = 0x7fc7c0000180, tqh_last = 0x7fc7c0000180}, 
  sctp_list = {le_next = 0x7fc7c0000190, le_prev = 0x7fc7c0000190}, sctp_hash = {le_next = 0x7fc7c00001a0, le_prev = 0x7fc7c00001a0}, 
  laddr_count = -1073741392, sctp_addr_list = {lh_first = 0x7fc7c00001b0}, next_addr_touse = 0x7fc7c00001c0, sctp_socket = 0x7fc7c00001c0, 
  sctp_features = 140495897941776, sctp_flags = 3222722320, sctp_mobility_features = 32711, sctp_ep = {time_of_secret_change = 3221394096, 
    secret_key = {{32711, 3221394096, 32711, 3221225968, 32711, 3221225968, 32711, 3221225984}, {32711, 3221225984, 32711, 3221226000, 32711, 
        3221226000, 32711, 3221226016}}, size_of_a_cookie = 32711, sctp_timeoutticks = {3221226016, 32711, 3221410992, 32711, 3221410992, 
      32711, 3221226048}, sctp_minrto = 32711, sctp_maxrto = 3221226048, initial_rto = 32711, initial_init_rto_max = 3221287296, 
    sctp_sack_freq = 32711, sctp_sws_sender = 3221775632, sctp_sws_receiver = 32711, sctp_default_cc_module = 3221226080, 
    sctp_default_ss_module = 32711, shared_keys = {lh_first = 0x7fc7c0000260}, local_auth_chunks = 0x7fc7c0000270, 
    local_hmacs = 0x7fc7c0000270, default_keyid = 640, default_mtu = 32711, max_init_times = 640, max_send_times = 49152, 
    def_net_failure = 32711, def_net_pf_threshold = 0, pre_open_stream_count = 656, max_open_streams_intome = 49152, random_counter = 32711, 
    random_numbers = "\220\002\000\300\307\177\000\000\240\002\000\300\307\177\000\000\240\002\000\300", 
    random_store = "\307\177\000\000\260\002\000\300\307\177\000\000\260\002\000\300\307\177\000", signature_change = {timer = {tqe = {
          tqe_next = 0x7fc7c00002c0, tqe_prev = 0x7fc7c00002c0}, c_time = 3221226192, c_arg = 0x7fc7c00002d0, c_func = 0x7fc7c00002e0, 
        c_flags = -1073741088}, type = -1073741072, ep = 0x7fc7c00002f0, tcb = 0x7fc7c0000300, net = 0x7fc7c0000300, self = 0x7fc7c0000310, 
      ticks = 3221226256, stopped_from = 32711}, def_cookie_life = 3221226272, auto_close_time = 32711, initial_sequence_debug = 3221226272, 
    adaptation_layer_indicator = 32711, adaptation_layer_indicator_provided = 96 '`', store_at = 32711, max_burst = 3221434976, 
    fr_max_burst = 32711, default_flowlabel = 3221226304, default_dscp = 199 '\307', current_secret_number = 127 '\177', 
    last_secret_number = 0 '\000', port = 832}, sctp_tcbhash = 0x7fc7c0000350, sctp_hashmark = 140495896445776, sctp_asoc_list = {
    lh_first = 0x7fc7c0000360}, sctp_frag_point = 3221226336, partial_delivery_point = 32711, sctp_context = 3221226352, max_cwnd = 32711, 
  local_strreset_support = 112 'p', sctp_cmt_on_off = 32711, ecn_supported = 128 '\200', prsctp_supported = 3 '\003', 
  auth_supported = 0 '\000', idata_supported = 192 '\300', asconf_supported = 199 '\307', reconfig_supported = 127 '\177', 
  nrsack_supported = 0 '\000', pktdrop_supported = 0 '\000', def_send = {sinfo_stream = 896, sinfo_ssn = 49152, sinfo_flags = 32711, 
    sinfo_ppid = 3222106960, sinfo_context = 32711, sinfo_timetolive = 3222106960, sinfo_tsn = 32711, sinfo_cumtsn = 3221226400, 
    sinfo_assoc_id = 32711, sinfo_keynumber = 928, sinfo_keynumber_valid = 49152}, pkt = 0x7fc7c0005090, pkt_last = 0x7fc7c00869a0, 
  control = 0x7fc7c00003c0, inp_mtx = {__data = {__lock = -1073740864, __count = 32711, __owner = -1073740848, __nusers = 32711, 
      __kind = -1073740848, __spins = 32711, __elision = 0, __list = {__prev = 0x7fc7c0138470, __next = 0x7fc7c0024110}}, 
    __size = "\300\003\000\300\307\177\000\000\320\003\000\300\307\177\000\000\320\003\000\300\307\177\000\000p\204\023\300\307\177\000\000\020A\002\300\307\177\000", __align = 140495896445888}, inp_create_mtx = {__data = {__lock = -1073740816, __count = 32711, __owner = -1073740816, 
      __nusers = 32711, __kind = -1072400240, __spins = 32711, __elision = 0, __list = {__prev = 0x7fc7c0147890, __next = 0x7fc7c0000410}}, 
--Type <RET> for more, q to quit, c to continue without paging--
    __size = "\360\003\000\300\307\177\000\000\360\003\000\300\307\177\000\000\220x\024\300\307\177\000\000\220x\024\300\307\177\000\000\020\004\000\300\307\177\000", __align = 140495896445936}, inp_rdata_mtx = {__data = {__lock = -1073740784, __count = 32711, __owner = -1073673024, 
      __nusers = 32711, __kind = -1073592336, __spins = 32711, __elision = 0, __list = {__prev = 0x7fc7c00db700, __next = 0x7fc7c00db700}}, 
    __size = "\020\004\000\300\307\177\000\000\300\f\001\300\307\177\000\000\360G\002\300\307\177\000\000\000\267\r\300\307\177\000\000\000\267\r\300\307\177\000", __align = 140495896445968}, refcount = -1073740736, def_vrf_id = 32711, fibnum = 1088, total_sends = 32711, 
  total_recvs = 3221260592, last_abort_code = 32711, total_nospaces = 3222137728, sctp_asocidhash = 0x7fc7c003b240, 
  hashasocidmark = 140495896687168, sctp_associd_counter = 3221226608, ulp_info = 0x7fc7c0000470, recv_callback = 0x7fc7c0035b80, 
  send_sb_threshold = 3221437696, send_callback = 0x7fc7c00262d0}
tbeloqui commented 1 year ago

I have reproduced the same exact issue in a test suite with Gstreamers usrsctplib version, in my case the sctp-timer thread ticks with a SCTP_TIMER_TYPE_SHUTDOWN event, decrements a reference on the socket and eventually calls sofree() which triggers this same assert:

    KASSERT(so->so_count == 0, ("sodealloc(): so_count %d", so->so_count));

@tuexen is there anything else we can help you with to figure out this issue? I can reproduce it very often with a test that stresses the setup and shutdown of the sockets, so it won't be a problem to dig out more information.

JonathanLennox commented 5 months ago

@tuexen : I have a hypothesis here.

The userspace code in sctp_close locks SCTP_INP_WLOCK when it sets SCTP_PCB_FLAGS_SOCKET_GONE in sctp_flags; however, the userspace code to set upcall_socket in sctp_timeout_handler checks that bit in sctp_flags without, as far as I can tell, acquiring that lock. Thus, I suspect, under load, the sctp_timeout_handler thread can get suspended immediately after the check of SCTP_PCB_FLAGS_SOCKET_GONE, and a usrsctp_close call can sneak in.

However, I don't understand the lock hierarchy here. Would it be safe to lock SCTP_INP in sctp_timeout_handler, or could that cause a lock inversion?