envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Envoy crashes when calling RecordValue() for a histogram stat. #37252

Open ppourali opened 2 days ago

ppourali commented 2 days ago

If you are reporting any crash or any potential security issue, do not open an issue in this repo. Please report the issue via emailing envoy-security@googlegroups.com where the issue will be triaged appropriately.

Title: Cannot record values for a histogram stat.

Description: I have created some stats for Envoy using the following snippet:

#define ALL_STATS(COUNTER, GAUGE, HISTOGRAM) \
  COUNTER(processed)                         \
  GAUGE(duration, Accumulate)                \
  HISTOGRAM(latency, Milliseconds)

struct AllStats {
  ALL_STATS(GENERATE_COUNTER_STRUCT, GENERATE_GAUGE_STRUCT, GENERATE_HISTOGRAM_STRUCT)
};

However, when I try to record a value for the histogram, I get the error below. Incrementing the counter works fine, but the histogram call fails an assert, apparently because of how it interacts with thread-local storage (TLS).

stats_->latency.recordValue(1000);

E1119 20:23:09.289219 2475011 thread_local_impl.cc:107] [assert] assert failure: currentThreadRegisteredWorker(index).
SIGABRT received by PID 13215468 (TID 1325468) on cpu 7 from PID 1325468; stack trace:
PC: @ 0x7f1917fd6981 (unknown) gsignal
    @ 0x55921d77166f 1920 FailureSignalHandler()
    @ 0x7f1918131e80 (unknown) (unknown)
    @ 0x559219e15ae8 128 Envoy::ThreadLocal::InstanceImpl::SlotImpl::getWorker()
    @ 0x559219e966d5 144 Envoy::ThreadLocal::Slot::getTyped<>()
    @ 0x559219e8be27 192 Envoy::Stats::ThreadLocalStoreImpl::tlsHistogram()
    @ 0x559219e8d65c 32 Envoy::Stats::ParentHistogramImpl::recordValue()
    @ 0x55921990d33f 960 absl::internal_any_invocable::RemoteInvoker<>()
    @ 0x559218e8c3e2 48 absl::internal_any_invocable::RemoteInvoker<>()
    @ 0x55921d3b82c5 352 thread::TMWorker()
    @ 0x55921d5a08ba 256 Thread::ThreadBody()
    @ 0x7f19181287db 192 start_thread
    @ 0x7f191809b05f (unknown) clone

Any idea what causes this? Could it be because the stats are created on one thread and recordValue() is called on another?

Thanks, Parsa

soulxu commented 2 days ago

cc @wbpcode

wbpcode commented 2 days ago

cc @jmarantz

jmarantz commented 2 days ago

Which thread were you calling recordValue() from? I believe you can only do that from a worker thread or the main thread.

Judging from the stack trace, you are hitting an assert that enforces this restriction.
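One common way around this kind of restriction is to hand the recording off to a registered thread instead of calling the histogram directly from a pool thread (the trace shows thread::TMWorker as the caller). In Envoy this is usually done by posting a callback to a dispatcher; the sketch below does not use Envoy's API. ToyDispatcher, ToyHistogram, on_registered_thread and recordViaPost are hypothetical stand-ins that only illustrate the hand-off pattern under those assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical flag: true only on the thread that plays the role of a
// registered (main or worker) thread. Defaults to false on new threads.
thread_local bool on_registered_thread = false;

// Hypothetical histogram that, like the assert in the report, refuses
// to record from a thread that is not registered.
class ToyHistogram {
public:
  void recordValue(uint64_t value) {
    if (!on_registered_thread) {
      throw std::logic_error("recordValue() called from an unregistered thread");
    }
    values_.push_back(value);
  }
  const std::vector<uint64_t>& values() const { return values_; }

private:
  std::vector<uint64_t> values_;
};

// Hypothetical dispatcher: any thread may post, only the owner drains.
class ToyDispatcher {
public:
  void post(std::function<void()> cb) {
    std::lock_guard<std::mutex> lock(mu_);
    queue_.push(std::move(cb));
  }
  void drain() {
    std::queue<std::function<void()>> pending;
    {
      std::lock_guard<std::mutex> lock(mu_);
      std::swap(pending, queue_);
    }
    while (!pending.empty()) {
      pending.front()();
      pending.pop();
    }
  }

private:
  std::mutex mu_;
  std::queue<std::function<void()>> queue_;
};

// The calling thread acts as the registered thread; a background thread
// posts the recording instead of touching the histogram itself.
std::vector<uint64_t> recordViaPost() {
  ToyDispatcher dispatcher;
  ToyHistogram latency;
  on_registered_thread = true;

  std::thread background([&dispatcher, &latency] {
    dispatcher.post([&latency] { latency.recordValue(1000); });
  });
  background.join();

  assert(latency.values().empty()); // nothing recorded until the owner drains
  dispatcher.drain();               // callback runs on the registered thread
  return latency.values();
}
```

Calling recordViaPost() from a program's main function returns a vector holding the single value 1000, recorded on the calling thread rather than on the background thread. Whether this pattern fits depends on where the background work originates and on how much latency the extra hop can tolerate.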
