istio / ztunnel

The `ztunnel` component of ambient mesh
Apache License 2.0
308 stars 101 forks source link

Reduce stack bloat on inbound #1331

Closed howardjohn closed 1 month ago

howardjohn commented 1 month ago

Builds on https://github.com/istio/ztunnel/pull/1330 First, add new checks. Our inbound flow has 3 nested futures, we only measured the outer one.

Next, two optimizations. Unfortunately, I dont really understand either of them... the stack sizing is a bit of black magic.

  1. Drop #instrument for .instrument(). Result is the same but one uses 500b less?
  2. Inline handle_connection, which saves 1kb

Before

tracing::instrument::Instrumented<ztunnel::proxy::inbound_passthrough::InboundPassthrough::run::>,2464
tracing::instrument::Instrumented<ztunnel::proxy::outbound::Outbound::run::>,1488
tracing::instrument::Instrumented<ztunnel::proxy::socks5::Socks5::run::>,1808
ztunnel::proxy::h2::server::serve_connection<>>,6440
ztunnel::proxy::inbound::Inbound::run::,1400
ztunnel::proxy::inbound::Inbound::serve_connect::{{closure}},3576

After

tracing::instrument::Instrumented<ztunnel::proxy::inbound_passthrough::InboundPassthrough::run::>,1672
tracing::instrument::Instrumented<ztunnel::proxy::outbound::Outbound::run::>,1488
tracing::instrument::Instrumented<ztunnel::proxy::socks5::Socks5::run::>,1808
ztunnel::proxy::h2::server::serve_connection<>>6440
ztunnel::proxy::inbound::Inbound::run::,1400
tracing::instrument::Instrumented<ztunnel::proxy::inbound::Inbound::serve_connect::{{closure}}>,2128
istio-testing commented 1 month ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

ilrudie commented 1 month ago

This looks good. I do wonder if we can get most/all of the benefit by just asking the compiler to inline the call rather than using a macro though.

keithmattix commented 1 month ago

+1 to avoiding a macro if we can

ilrudie commented 1 month ago

Inlining has no impact. I can't say I understand why, but also feel the 1kb/connection benefit here is worth a small macro... would love to avoid it if anyone has a solution, though.

That is ultimately just a compiler suggestion. Things I could see impacting it are:

  1. its a method and for some reason that form throws the compiler off, even though it's really just some syntactic sugar.
  2. the method/function we asked to inline uses private methods/function and thus inline is not possible, essentially the same things you encountered when you made the macro

Anyhow, if you want to fiddle more go for it, but otherwise I think we did our due diligence on keeping ztunnel's syntax simpler for folks and came up lacking this time. I'll go ahead and approve but add a dnm in case you'd prefer to fiddle around with inlining more based on the suggestions I left.