heroku / roadmap

This is the public roadmap for Salesforce Heroku services.

Heroku Telemetry [OpenTelemetry] #214

Open elimchaysengSF opened 10 months ago

elimchaysengSF commented 10 months ago

Required Terms

What service(s) is this request for?

Heroku Platform

Tell us about what you're trying to solve. What challenges are you facing?

At Heroku, we have a long history of creating fantastic developer experiences by simplifying development down to an opinionated standardized set of tooling that “just works.” Among that tooling is our logging infrastructure, with its ability to quickly access logs via live Tailing sessions and easily export logs via Log Drains to Logging Add-on Partners.

We already use the OpenTelemetry framework inside Heroku for internal metrics, but OpenTelemetry is more than just metrics. It is the trifecta of Traces, Metrics and Logs.

We should build on our developer experience and logging infrastructure to deliver telemetry for the Heroku platform and Heroku customers, while embracing OpenTelemetry and joining the growing list of contributors and supporters of the standard.

Please comment below for any specific use-cases, metrics, or suggestions related to upgrading our Telemetry and Observability features to utilize the OpenTelemetry standard!

stevenharman commented 9 months ago

We also leverage OpenTelemetry (currently for Traces, and some day for Logs and Metrics). One big hole is that our Traces only start once a request is "inside" a Dyno, i.e., once code that we control is handling it. In the case of a Rails app, this means at the Rails internals, or maybe at the web server (Puma) level.

That leaves a pretty big blind spot in terms of how the request/response progressed from the Heroku infra on to our code. We'd love to close that hole.

adamlogic commented 2 weeks ago

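As an add-on provider, here are our two big requests related to logging:

  • We'd love to support our customers who are using Private Space Logging, but currently add-on providers cannot create a log drain for these customers.

  • We'd love to only subscribe to Heroku router logs and runtime metrics logs. App-level logs often contain sensitive info we'd rather not ever see. It would also reduce the burden on our infrastructure if we didn't have all of this unwanted log data flowing in.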

stevenharman commented 2 weeks ago

  • We'd love to support our customers who are using Private Space Logging, but currently add-on providers cannot create a log drain for these customers.

This one is tricky; the point of Private Space Logging is that ALL log lines for ALL Apps in the PS go to a single drain. It's then that drain's responsibility to figure out if, where, and how to send each log line. Basically, it's up to you to build your own log multiplexer. This is how we handled things internally at Heroku, anyhow.
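To make the "build your own log multiplexer" idea concrete, here is a minimal sketch in Python. It assumes log lines have already been parsed into an app name, a source, and a message. The `LogLine` and `Multiplexer` names are hypothetical, invented for illustration, and are not part of any Heroku API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LogLine:
    """One parsed log line (hypothetical shape, not a Heroku type)."""
    app: str      # app the line came from
    source: str   # e.g. "router", "web", "worker"
    message: str


Predicate = Callable[[LogLine], bool]
Sink = Callable[[LogLine], None]


class Multiplexer:
    """Fans each incoming line out to every sink whose predicate matches."""

    def __init__(self) -> None:
        self._routes: List[Tuple[Predicate, Sink]] = []

    def add_route(self, predicate: Predicate, sink: Sink) -> None:
        self._routes.append((predicate, sink))

    def dispatch(self, line: LogLine) -> int:
        """Send the line to all matching sinks; return how many received it."""
        delivered = 0
        for predicate, sink in self._routes:
            if predicate(line):
                sink(line)
                delivered += 1
        return delivered
```

A route such as `mux.add_route(lambda l: l.source == "router", forward_to_vendor)` would forward only router lines, where `forward_to_vendor` stands in for whatever delivery function you write. A line that matches no route is simply dropped.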

  • We'd love to only subscribe to Heroku router logs and runtime metrics logs…

Yes, agreed! Being able to specify a set of "sources" for a drain would be 👨‍🍳💋.

elimchaysengSF commented 3 days ago

Hey Adam!

Thanks for raising these points and giving us good conversation as we go into more detail on the buildout of our OTel plans.

We'd love to support our customers who are using Private Space Logging, but currently add-on providers cannot create a log drain for these customers.

We're planning to support multiple space level drains as part of this effort. We're removing this limitation, so when we get to the point of integrating 3rd party vendors, Space Drain support should be a thing we include.

We'd love to only subscribe to Heroku router logs and runtime metrics logs. App-level logs often contain sensitive info we'd rather not ever see. It would also reduce the burden on our infrastructure if we didn't have all of this unwanted log data flowing in.

We've discussed this one tangentially, as part of how we segment and categorize the sources for drains in more detail than we do today.

We'd have something like the hierarchy of telemetry drains: Space -> App -> Source

Where the Source is things like the individual Formation types (web, worker, release, etc.), or other streams of telemetry like the router, the Heroku API, and first- and third-party systems such as Heroku Postgres. This source list would be where you could configure and subscribe to only the drains you want and need.
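As a rough illustration of the Space -> App -> Source hierarchy, the sketch below models a drain subscription that can be scoped at any of the three levels. It is only one possible shape. `DrainSubscription`, `drains_for`, and the example URLs are hypothetical names chosen for this sketch and do not describe an actual Heroku API or configuration format.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class DrainSubscription:
    """A drain scoped to a space, optionally narrowed to an app and sources."""
    url: str
    space: str
    app: Optional[str] = None                 # None means every app in the space
    sources: Optional[FrozenSet[str]] = None  # None means every source

    def matches(self, space: str, app: str, source: str) -> bool:
        if space != self.space:
            return False
        if self.app is not None and app != self.app:
            return False
        if self.sources is not None and source not in self.sources:
            return False
        return True


def drains_for(subs: Iterable[DrainSubscription],
               space: str, app: str, source: str) -> List[str]:
    """Return the URLs of all drains that should receive this telemetry."""
    return [s.url for s in subs if s.matches(space, app, source)]


# Example: an add-on that only wants router and runtime-metrics telemetry
# for one app, next to a space-wide drain that receives everything.
subscriptions = [
    DrainSubscription("https://all.example.com/drain", space="prod-space"),
    DrainSubscription("https://addon.example.com/drain", space="prod-space",
                      app="shop",
                      sources=frozenset({"router", "runtime-metrics"})),
]
```

With these subscriptions, `drains_for(subscriptions, "prod-space", "shop", "web")` returns only the space-wide drain, so app-level logs never reach the add-on's endpoint.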