apache / trafficserver

Apache Traffic Server™ is a fast, scalable and extensible HTTP/1.1 and HTTP/2 compliant caching proxy server.
https://trafficserver.apache.org/
Apache License 2.0

Crash in Http2ConnectionState::_get_configured_receive_session_window_size #10504

Open bneradt opened 1 year ago

bneradt commented 1 year ago

We noticed the following ATS 10 crash in production. It's a nullptr dereference that recurred every 10-30 minutes on a box handling about 1,800 requests per second.

(gdb) bt                                                                                                                                                       
#0  0x000000000063135c in NetVConnection::_get_service (service=NetVConnection::Service::TLS_SNI, this=0x0) at /sd/workspace/src/git.ouryahoo.com/Edge/build/_build/build_release_posix-x86_64_gcc_10/trafficserver10.0/build/../../../../_scm/trafficserver10.0/iocore/net/I_NetVConnection.h:653
#1  NetVConnection::get_service<TLSSNISupport> (this=0x0) at /sd/workspace/src/git.ouryahoo.com/Edge/build/_build/build_release_posix-x86_64_gcc_10/trafficserver10.0/build/../../../../_scm/trafficserver10.0/iocore/net/I_NetVConnection.h:653
#2  Http2ConnectionState::_get_configured_initial_window_size (this=<optimized out>, this=<optimized out>) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ConnectionState.cc:1109
#3  Http2ConnectionState::_get_configured_initial_window_size (this=0x7f6da49a7310) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ConnectionState.cc:1101
#4  0x0000000000632fc5 in Http2ConnectionState::_get_configured_receive_session_window_size (this=0x7f6da49a7310) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ConnectionState.cc:2623
#5  Http2ConnectionState::_get_configured_receive_session_window_size (this=0x7f6da49a7310) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ConnectionState.cc:2616
#6  0x0000000000633c65 in Http2ConnectionState::restart_receiving (this=0x7f6da49a7310, stream=0x0) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ConnectionState.cc:1758
#7  0x0000000000652351 in Http2CommonSession::do_process_frame_read (this=0x7f6da49a7308, event=<optimized out>, vio=0x7f6cea94fa58, inside_frame=<optimized out>) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2CommonSession.cc:391
#8  0x000000000064ed39 in Http2ClientSession::main_event_handler (this=0x7f6da49a7000, event=100, edata=0x7f6cea94fa58) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ClientSession.cc:179
#9  0x00000000007cef94 in Continuation::handleEvent (data=0x7f6cea94fa58, event=100, this=0x7f6da49a7000) at /sd/workspace/src/git.ouryahoo.com/Edge/build/_build/build_release_posix-x86_64_gcc_10/trafficserver10.0/build/../../../../_scm/trafficserver10.0/iocore/eventsystem/I_Continuation.h:228
#10 Continuation::handleEvent (data=0x7f6cea94fa58, event=100, this=0x7f6da49a7000) at /sd/workspace/src/git.ouryahoo.com/Edge/build/_build/build_release_posix-x86_64_gcc_10/trafficserver10.0/build/../../../../_scm/trafficserver10.0/iocore/eventsystem/I_Continuation.h:224
#11 read_signal_and_update (event=100, vc=0x7f6cea94f800) at ../../../../../../_scm/trafficserver10.0/iocore/net/UnixNetVConnection.cc:82
#12 0x0000000000799f6d in SSLNetVConnection::net_read_io (this=<optimized out>, nh=0x7f71901bf200, lthread=<optimized out>) at ../../../../../../_scm/trafficserver10.0/iocore/net/SSLNetVConnection.cc:730
#13 0x00000000007f9eff in NetHandler::process_ready_list (this=this@entry=0x7f71901bf200) at ../../../../../../_scm/trafficserver10.0/iocore/net/NetHandler.cc:252
#14 0x00000000007fb46a in NetHandler::waitForActivity (this=0x7f71901bf200, timeout=<optimized out>) at ../../../../../../_scm/trafficserver10.0/iocore/net/NetHandler.cc:340
#15 0x000000000082488b in EThread::execute_regular (this=this@entry=0x7f71901be680) at ../../../../../../_scm/trafficserver10.0/iocore/eventsystem/I_PriorityEventQueue.h:115
#16 0x00000000008249b6 in EThread::execute (this=0x7f71901be680) at ../../../../../../_scm/trafficserver10.0/iocore/eventsystem/UnixEThread.cc:334
#17 0x0000000000822eb2 in spawn_thread_internal (a=0x7f71988a2900) at ../../../../../../_scm/trafficserver10.0/iocore/eventsystem/Thread.cc:78
#18 0x00007f719a4fb1ca in start_thread () from /lib64/libpthread.so.0                                                                                          
#19 0x00007f7199633e73 in clone () from /lib64/libc.so.6                                                                                                       
(gdb) f 0                                                                                                                                                      
#0  0x000000000063135c in NetVConnection::_get_service (service=NetVConnection::Service::TLS_SNI, this=0x0) at /sd/workspace/src/git.ouryahoo.com/Edge/build/_build/build_release_posix-x86_64_gcc_10/trafficserver10.0/build/../../../../_scm/trafficserver10.0/iocore/net/I_NetVConnection.h:653
653       return static_cast<TLSSNISupport *>(this->_get_service(NetVConnection::Service::TLS_SNI));
(gdb) p this                
$2 = (const NetVConnection * const) 0x0                       

Backing out #9997 alleviates the crash.

bneradt commented 1 year ago

Here's the Http2ConnectionState object (*this) at frame 3:

(gdb) f 3
#3  Http2ConnectionState::_get_configured_initial_window_size (this=0x7f6da49a7310) at ../../../../../../_scm/trafficserver10.0/proxy/http2/Http2ConnectionState.cc:1101
1101    Http2ConnectionState::_get_configured_initial_window_size() const
(gdb) p *this
$5 = {
  <Continuation> = {
    <force_VFPT_to_top> = {
      _vptr.force_VFPT_to_top = 0x8a4248 <vtable for Http2ConnectionState+16>
    },
    members of Continuation:
    handler = (int (Continuation::*)(Continuation * const, int, void *)) 0x630270 <Http2ConnectionState::state_closed(int, void*)>,
    mutex = {
      m_ptr = 0x7f6d57177b40
    },
    link = {
      <SLink<Continuation>> = {
        next = 0x0
      },
      members of Link<Continuation>:
      prev = 0x0
    },
    control_flags = {
      raw_flags = 0
    },
    thread_affinity = 0x0
  },
  members of Http2ConnectionState:
  rx_error_code = {
    cls = ProxyErrorClass::SSN,
    code = 0
  },
  tx_error_code = {
    cls = ProxyErrorClass::NONE,
    code = 0
  },
  session = 0x7f6da49a7308,
  local_hpack_handle = 0x7f6ca4586bb0,
  peer_hpack_handle = 0x7f6ca4586de0,
  dependency_tree = 0x0,
  _cop = {
    <Continuation> = {
      <force_VFPT_to_top> = {
        _vptr.force_VFPT_to_top = 0x8a6468 <vtable for ActivityCop<Http2Stream, DLL<Http2Stream, Continuation::Link_link> >+16>
      },
      members of Continuation:
      handler = (int (Continuation::*)(Continuation * const, int, void *)) 0x63ecb0 <ActivityCop<Http2Stream, DLL<Http2Stream, Continuation::Link_link> >::check_activity(int, Event*)>,
      mutex = {
        m_ptr = 0x7f6d57177b40
      },
      link = {
        <SLink<Continuation>> = {
          next = 0x0
        },
        members of Link<Continuation>:
        prev = 0x0
      },
      control_flags = {
        raw_flags = 0
      },
      thread_affinity = 0x0
    },
    members of ActivityCop<Http2Stream, DLL<Http2Stream, Continuation::Link_link> >:
    _event = 0x7f6d570f9ac0,
    _list = 0x7f6da49a7420,
    _freq = 1
  },
  local_settings = {
    settings = {4096, 0, 100, 65535, 16384, 131072}
  },
  acknowledged_local_settings = {
    settings = {4096, 0, 100, 65535, 16384, 131072}
  },
  peer_settings = {
    settings = {4096, 1, 100, 2097152, 16384, 4294967295}
  },
  static _frame_handlers = {(Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x633d50 <Http2ConnectionState::rcv_data_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x639a50 <Http2ConnectionState::rcv_headers_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x63b2a0 <Http2ConnectionState::rcv_priority_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x630f50 <Http2ConnectionState::rcv_rst_stream_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x635520 <Http2ConnectionState::rcv_settings_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x630b30 <Http2ConnectionState::rcv_push_promise_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x631bb0 <Http2ConnectionState::rcv_ping_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x630c40 <Http2ConnectionState::rcv_goaway_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x638320 <Http2ConnectionState::rcv_window_update_frame(Http2Frame const&)>, (Http2Error (Http2ConnectionState::*)(Http2ConnectionState * const, const Http2Frame &)) 0x637db0 <Http2ConnectionState::rcv_continuation_frame(Http2Frame const&)>},
  stream_list = {
    <DLL<Http2Stream, Continuation::Link_link>> = {
      head = 0x0
    },
    members of Queue<Http2Stream, Continuation::Link_link>:
    tail = 0x0
  },
  latest_streamid_in = 31,
  latest_streamid_out = 0,
  stream_requests = {
    <std::__atomic_base<int>> = {
      static _S_alignment = 4,
      _M_i = 16
    },
    members of std::atomic<int>:
    static is_always_lock_free = true
  },
  peer_streams_count_in = {
    <std::__atomic_base<unsigned int>> = {
      static _S_alignment = 4,
      _M_i = 0
    },
    members of std::atomic<unsigned int>:
    static is_always_lock_free = true
  },
  peer_streams_count_out = {
    <std::__atomic_base<unsigned int>> = {
      static _S_alignment = 4,
      _M_i = 0
    },
    members of std::atomic<unsigned int>:
    static is_always_lock_free = true
  },
  total_peer_streams_count = {
    <std::__atomic_base<unsigned int>> = {
      static _S_alignment = 4,
      _M_i = 0
    },
    members of std::atomic<unsigned int>:
    static is_always_lock_free = true
  },
  stream_error_count = 0,
  _peer_rwnd = 10285678,
  _local_rwnd = 6553500,
  _local_rwnd_is_shrinking = false,
  _recent_rwnd_increment = {
    _M_elems = {10485760, 18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615}
  },
  _recent_rwnd_increment_index = 1,
  _received_settings_counter = {
    _vptr.Http2FrequencyCounter = 0x8a67c8 <vtable for Http2FrequencyCounter+16>,
    _count = {2, 0},
    _last_update = 1695409747
  },
  _received_settings_frame_counter = {
    _vptr.Http2FrequencyCounter = 0x8a67c8 <vtable for Http2FrequencyCounter+16>,
    _count = {2, 0},
    _last_update = 1695409747
  },
  _received_ping_frame_counter = {
    _vptr.Http2FrequencyCounter = 0x8a67c8 <vtable for Http2FrequencyCounter+16>,
    _count = {0, 0},
    _last_update = 0
  },
  _received_priority_frame_counter = {
    _vptr.Http2FrequencyCounter = 0x8a67c8 <vtable for Http2FrequencyCounter+16>,
    _count = {0, 0},
    _last_update = 0
  },
  _outstanding_settings_frames = std::queue wrapping: std::deque with 0 elements,
  continued_stream_id = 0,
  _scheduled = false,
  fini_received = true,
  in_destroy = false,
  recursion = 0,
  shutdown_state = HTTP2_SHUTDOWN_NONE,
  shutdown_reason = Http2ErrorCode::HTTP2_ERROR_NO_ERROR,
  shutdown_cont_event = 0x0,
  fini_event = 0x0,
  zombie_event = 0x0
}
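
For readers decoding the three settings arrays in the dump above, here is a minimal sketch that labels each slot. It assumes the array stores the setting with identifier N at index N - 1, which matches the values shown but is not confirmed by this thread. The names come from RFC 9113 section 6.5.2; `Settings`, `setting`, and `print_settings` are hypothetical helpers, not ATS functions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>

// Setting names from RFC 9113 section 6.5.2, identifiers 0x1 through 0x6.
// Assumption: the dumped array keeps identifier N at index N - 1.
const std::array<const char *, 6> kSettingNames = {{
  "HEADER_TABLE_SIZE", "ENABLE_PUSH", "MAX_CONCURRENT_STREAMS",
  "INITIAL_WINDOW_SIZE", "MAX_FRAME_SIZE", "MAX_HEADER_LIST_SIZE",
}};

using Settings = std::array<uint32_t, 6>;

// Look up a value by its 1-based RFC identifier.
inline uint32_t
setting(const Settings &s, unsigned id)
{
  assert(id >= 1 && id <= s.size());
  return s[id - 1];
}

// Print one labelled line per setting.
inline void
print_settings(const char *label, const Settings &s)
{
  std::printf("%s\n", label);
  for (unsigned id = 1; id <= s.size(); ++id) {
    std::printf("  0x%x %-24s %u\n", id, kSettingNames[id - 1],
                static_cast<unsigned>(setting(s, id)));
  }
}

// Values copied from the dump above.
const Settings local_settings = {{4096, 0, 100, 65535, 16384, 131072}};
const Settings peer_settings  = {{4096, 1, 100, 2097152, 16384, 4294967295u}};
```

Calling `print_settings("local_settings", local_settings)` from a `main` lists the six values with their names. Under this reading, the dumped `_local_rwnd` of 6553500 equals the local INITIAL_WINDOW_SIZE (65535) times MAX_CONCURRENT_STREAMS (100). Whether this build actually derived the session window that way is an assumption.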

jpeach commented 1 year ago

  1098 uint32_t
  1099 Http2ConnectionState::_get_configured_initial_window_size() const
  1100 {
  1101   ink_assert(this->session != nullptr);
  1102   if (this->session->is_outbound()) {
  1103     return Http2::initial_window_size_out;
  1104   } else {
  1105     uint32_t initial_window_size_in = Http2::initial_window_size_in;
  1106     if (this->session) {
  1107       if (auto snis = session->get_netvc()->get_service<TLSSNISupport>();
  1108           snis && snis->hints_from_sni.http2_initial_window_size_in.has_value()) {
  1109         initial_window_size_in = snis->hints_from_sni.http2_initial_window_size_in.value();
  1110       }
  1111     }
  1112
  1113     return initial_window_size_in;
  1114   }
  1115 }

This code seems confused about whether this->session is allowed to be null: it asserts that it is not null, dereferences it unconditionally on the next line, and then checks it for null anyway.

Anyway, it looks like session->get_netvc() is returning null here, for some reason that isn't clear yet.
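
To make the two observations above concrete, here is a minimal sketch of the lookup with one consistent null policy. It is not the ATS code or a proposed patch. `Session`, `NetVc`, `TlsSniSupport`, `SniHints`, `get_sni`, and the `k...` constants are hypothetical stand-ins for the real types and for `Http2::initial_window_size_in` / `_out`. The dump earlier in the thread shows `handler = state_closed` and `fini_received = true`, which may mean the session was already being torn down. That is a guess, so the sketch simply treats a missing netvc as "no SNI hint available".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the ATS types; only the shape matters here.
struct SniHints {
  std::optional<uint32_t> http2_initial_window_size_in;
};

struct TlsSniSupport {
  SniHints hints_from_sni;
};

struct NetVc {
  TlsSniSupport *sni = nullptr; // null when no SNI support is attached
  TlsSniSupport *get_sni() const { return sni; }
};

struct Session {
  NetVc *netvc   = nullptr; // assumed to be null once the session is closing
  bool  outbound = false;
  NetVc *get_netvc() const { return netvc; }
  bool  is_outbound() const { return outbound; }
};

// Stand-ins for the configured global defaults.
constexpr uint32_t kInitialWindowSizeIn  = 65535;
constexpr uint32_t kInitialWindowSizeOut = 65535;

// Checks each pointer in the chain exactly once and falls back to the global
// default when the session, the netvc, or the SNI hint is missing.
uint32_t
configured_initial_window_size(const Session *session)
{
  if (session == nullptr) {
    return kInitialWindowSizeIn;
  }
  if (session->is_outbound()) {
    return kInitialWindowSizeOut;
  }
  const NetVc *netvc = session->get_netvc();
  if (netvc == nullptr) {
    return kInitialWindowSizeIn; // the case that crashed in frame 0
  }
  if (const TlsSniSupport *snis = netvc->get_sni();
      snis != nullptr && snis->hints_from_sni.http2_initial_window_size_in.has_value()) {
    return *snis->hints_from_sni.http2_initial_window_size_in;
  }
  return kInitialWindowSizeIn;
}

// Exercises the null-netvc path that corresponds to the reported crash.
inline void
demo()
{
  Session closing; // netvc left null
  assert(configured_initial_window_size(&closing) == kInitialWindowSizeIn);
}
```

Falling back to the default only hides the symptom. Whether `restart_receiving` should run at all on a session in this state is a separate question that this sketch does not answer.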

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.