envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Crash since 3e1806f1e12ec #10655

Closed by rgs1 4 years ago

rgs1 commented 4 years ago

On Friday we synced up with 3e1806f1e12ec and today we got this crash:

Caught Segmentation fault, suspect faulting address 0x0
Backtrace (use tools/stack_decode.py to get line numbers):
Envoy version: 0/1.14.0-dev//RELEASE/BoringSSL
#0: __restore_rt [0x7fa186a08890] ??:0
#1: Envoy::Network::ConnectionImpl::onLowWatermark() [0x55940ecdef7a] ??:0
#2: Envoy::Network::ConnectionImpl::closeSocket() [0x55940ecde4a8] ??:0
#3: Envoy::Network::ConnectionImpl::close() [0x55940ecde1aa] ??:0
#4: Envoy::Http::Http1::ConnPoolImpl::onResponseComplete() [0x55940ee5d145] ??:0
#5: Envoy::Http::ResponseDecoderWrapper::decodeHeaders() [0x55940ee5d74c] ??:0
#6: Envoy::Http::Http1::ClientConnectionImpl::onMessageComplete() [0x55940ef78073] ??:0
#7: Envoy::Http::Http1::ConnectionImpl::onMessageCompleteBase() [0x55940ef75945] ??:0
#8: Envoy::Http::Http1::ConnectionImpl::$_8::__invoke() [0x55940ef79b8d] ??:0
#9: http_parser_execute [0x55940f1553e6] ??:0
#10: Envoy::Http::Http1::ConnectionImpl::dispatchSlice() [0x55940ef74a17] ??:0
#11: Envoy::Http::Http1::ConnectionImpl::dispatch() [0x55940ef7477f] ??:0
#12: Envoy::Http::CodecClient::onData() [0x55940eedf248] ??:0
#13: Envoy::Http::CodecClient::CodecReadFilter::onData() [0x55940eee007d] ??:0
#14: Envoy::Network::FilterManagerImpl::onContinueReading() [0x55940ece49c9] ??:0
#15: Envoy::Network::ConnectionImpl::onReadReady() [0x55940ecdfee7] ??:0
#16: Envoy::Network::ConnectionImpl::onFileEvent() [0x55940ecdf20d] ??:0
#17: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x55940ecda5f5] ??:0
#18: event_process_active_single_queue [0x55940f14eea4] ??:0
#19: event_base_loop [0x55940f14d75e] ??:0
#20: Envoy::Server::WorkerImpl::threadRoutine() [0x55940eccf1ff] ??:0
#21: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::__invoke() [0x55940f206055] ??:0
#22: start_thread [0x7fa1869fd6db] ??:0
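
For readers who want to triage a trace like the one above before running it through tools/stack_decode.py, here is a minimal sketch that splits each frame line into its index, symbol, address, and location. The names `FRAME_RE` and `parse_frames` are hypothetical and not part of Envoy's tooling. The line format is assumed from the frames pasted in this report. It does not resolve line numbers.

```python
import re

# Assumed frame format, taken from the trace in this report:
#   #1: Envoy::Network::ConnectionImpl::onLowWatermark() [0x55940ecdef7a] ??:0
FRAME_RE = re.compile(r"^#(\d+):\s+(.*?)\s+\[(0x[0-9a-fA-F]+)\]\s+(\S+)$")


def parse_frames(backtrace: str):
    """Return a list of (index, symbol, address, location) tuples.

    Lines that do not look like frames (for example the "Caught
    Segmentation fault" header) are skipped.
    """
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            index, symbol, address, location = match.groups()
            frames.append((int(index), symbol, int(address, 16), location))
    return frames


SAMPLE = """\
Caught Segmentation fault, suspect faulting address 0x0
#0: __restore_rt [0x7fa186a08890] ??:0
#1: Envoy::Network::ConnectionImpl::onLowWatermark() [0x55940ecdef7a] ??:0
#2: Envoy::Network::ConnectionImpl::closeSocket() [0x55940ecde4a8] ??:0
"""

if __name__ == "__main__":
    for index, symbol, address, location in parse_frames(SAMPLE):
        print(f"{index:>2} {address:#x} {symbol}")
```

Frame #1 is the first Envoy frame below the signal handler, so it is the natural starting point when reading the trace.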

I suspected #10406, though @mattklein123 suggests #10566.

The commits we picked up on Fri are:

    [test] Convert load_balancer_benchmark to benchmark cc binary and test framework https://github.com/envoyproxy/envoy/pull/10539
    [http1] Buffer pending http/1 body before dispatching to the filter chain https://github.com/envoyproxy/envoy/pull/10406
    tls: config to disable TLS session tickets https://github.com/envoyproxy/envoy/pull/10178
    use constexpr string_view to avoid static initialized string https://github.com/envoyproxy/envoy/pull/10632
    build: fix merge conflict with addAcceptFilter(). https://github.com/envoyproxy/envoy/pull/10629
    ci: set explicit timeout for release builds. https://github.com/envoyproxy/envoy/pull/10626
    doc: certificate hot-reload for xDS gRPC connection https://github.com/envoyproxy/envoy/pull/10628
    [test] Convert filter_chain_benchmark_test to benchmark cc binary and test framework https://github.com/envoyproxy/envoy/pull/10538
    test: fuzzer binaries should parse gmock flags. https://github.com/envoyproxy/envoy/pull/10606
    tools: refactor protoxform to distinct transform/pretty-print stages. https://github.com/envoyproxy/envoy/pull/10585
    listener: implement disabled predicates https://github.com/envoyproxy/envoy/pull/10389
    test: deflake by making registerTestServerPorts thread safe https://github.com/envoyproxy/envoy/pull/10523
    registry: handle factories displaced by type https://github.com/envoyproxy/envoy/pull/10603

The last crash-free sync included these commits (though it only ran for 24 hours before the next sync):

    util: add PROXY protocol generation functions https://github.com/envoyproxy/envoy/pull/10548
    router: API cleanup for unit tests https://github.com/envoyproxy/envoy/pull/10590
    eds: introduce hostname for endpoints and health checks https://github.com/envoyproxy/envoy/pull/10456
    test: use static functions rather than static data for std::string constants for config fragments https://github.com/envoyproxy/envoy/pull/10569
    tracing: Fix X-Ray header values https://github.com/envoyproxy/envoy/pull/10598
    Add clang-tidy rule to enforce lower camelCase naming of function https://github.com/envoyproxy/envoy/pull/10477
    sds: certificate hot-reload for xDS gRPC connection https://github.com/envoyproxy/envoy/pull/10163
    Add some debug logs during server shutdown https://github.com/envoyproxy/envoy/pull/10577
    stats: Remove Scope's counter(), gauge(), and histogram() interfaces, renaming them to counterFromString et al. https://github.com/envoyproxy/envoy/pull/10300
    compdb: add missed query https://github.com/envoyproxy/envoy/pull/10600
    handling header-only envoy_cc_test_library better https://github.com/envoyproxy/envoy/pull/10596
    build: standardize on PY3 https://github.com/envoyproxy/envoy/pull/10586
    compdb: handling header-only cc_library better https://github.com/envoyproxy/envoy/pull/10583
    ci: refactor azp into one pipeline https://github.com/envoyproxy/envoy/pull/10564
    Revert "tools: enhance type DB to span frozen/active major versions. https://github.com/envoyproxy/envoy/pull/10571" https://github.com/envoyproxy/envoy/pull/10587
    deps: update spdlog to 1.4.0 and fmtlib to 6.0.0 https://github.com/envoyproxy/envoy/pull/10522
    timeformatter static-init https://github.com/envoyproxy/envoy/pull/10579
    tools: enhance type DB to span frozen/active major versions. https://github.com/envoyproxy/envoy/pull/10571
    tools: fix protoxform_test. https://github.com/envoyproxy/envoy/pull/10582
    http1: Allocate encoder on heap to survive move https://github.com/envoyproxy/envoy/pull/10561
    AWS Lambda integration test fix https://github.com/envoyproxy/envoy/pull/10558
    Shorten adaptive_concurrency/concurrency_controller paths https://github.com/envoyproxy/envoy/pull/10560

mattklein123 commented 4 years ago

I'm pretty positive this is https://github.com/envoyproxy/envoy/issues/10566. I will fix this so we can get the fix in before we release 1.14.0.