pixie-io / pixie

Instant Kubernetes-Native Application Observability
https://px.dev
Apache License 2.0

Extend Pixie's Otel export to support histograms #1968

Open ddelnano opened 1 month ago

ddelnano commented 1 month ago

Pixie is designed around the idea that high-visibility telemetry should remain on the edge. Even so, it's still possible, and encouraged, to use our OpenTelemetry export feature to co-locate Pixie's data with other data, which makes it possible to integrate Pixie as a data source into other systems. Useful as this is, exporting the spans as they exist within the data tables diverges a bit from Pixie's philosophy: it results in transmitting a large number of spans to central storage.

One common use case for service observability is monitoring the p75, p95, and p99 of a service. These percentiles are very common on per-service dashboards and are often used for SLOs and other alerting. It would be great if Pixie had the primitives for exporting summarized views to power experiences like this without exporting all of the spans. This would yield the best of both worlds: unaggregated data would remain accessible via Pixie's in-memory store, and the histograms would be persisted in the observability backend.

Computing these percentiles accurately is expensive, since the backend store must query the unaggregated data points to perform the computation. Rather than computing these at read time, Pixie's edge processing could extend its OTel export to create cluster-wide, per-service OTel histograms. This not only saves on storage costs but also reduces load on the metrics backend and on network transfer, since only a fraction of the data that would otherwise need to be processed is sent.
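To illustrate the idea, here is a minimal Python sketch of explicit-bucket aggregation. It is not Pixie or Carnot code. The bucket bounds, the millisecond latency values, and the names `build_histogram`, `merge`, and `estimate_quantile` are all assumptions made for the example. It shows that per-node bucket counts can be merged by simple addition and that a percentile can then be estimated from the counts alone, without the raw spans.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in milliseconds (not taken from Pixie).
BOUNDS_MS = [5, 10, 25, 50, 100, 250, 500, 1000]


def build_histogram(latencies_ms, bounds=BOUNDS_MS):
    """Count values per bucket; bucket i holds values in (bounds[i-1], bounds[i]].

    The final bucket is the overflow bucket for values above bounds[-1].
    """
    counts = [0] * (len(bounds) + 1)
    for value in latencies_ms:
        counts[bisect_left(bounds, value)] += 1
    return counts


def merge(a, b):
    """Merge two histograms that share the same bounds by adding counts."""
    return [x + y for x, y in zip(a, b)]


def estimate_quantile(counts, q, bounds=BOUNDS_MS):
    """Estimate quantile q (0..1) by linear interpolation inside a bucket."""
    total = sum(counts)
    if total == 0:
        return None
    rank = q * total
    seen = 0
    for i, count in enumerate(counts):
        if count and seen + count >= rank:
            if i == len(bounds):
                # Overflow bucket has no upper bound; report the last bound.
                return float(bounds[-1])
            lower = bounds[i - 1] if i > 0 else 0.0
            return lower + (bounds[i] - lower) * (rank - seen) / count
        seen += count
    return float(bounds[-1])


# Two nodes aggregate locally; only the bucket counts leave each node.
node_a = build_histogram([1, 7, 7, 30, 2000])
node_b = build_histogram([3, 8, 40])
cluster = merge(node_a, node_b)
p50 = estimate_quantile(cluster, 0.50)
```

The estimate is only as precise as the bucket layout, which is the trade-off against shipping every span: the payload size depends on the number of buckets rather than on the number of requests.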

kpattaswamy commented 1 month ago

+1 to this. I believe a feature for scalar args (to pass the number of buckets/scale from PxL) would also be needed in order to add the explicit/exponential histogram UDAs to Carnot.
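For context on why a scalar `scale` argument matters, here is a rough Python sketch of the exponential-histogram bucket mapping as described in the OpenTelemetry metrics data model, where bucket `index` covers values in `(base**index, base**(index + 1)]` with `base = 2 ** (2 ** -scale)`. The function names are hypothetical, and this is not how a Carnot UDA would be written. It only shows that the bucketing is determined entirely by one scalar chosen up front.

```python
import math
from collections import Counter


def bucket_index(value, scale):
    """Map a positive value to its exponential-histogram bucket index.

    Uses the logarithm form of the mapping; results can be off by one for
    values that sit extremely close to a bucket boundary.
    """
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    return math.ceil(math.log2(value) * (2 ** scale)) - 1


def exponential_histogram(values, scale):
    """Aggregate values into a sparse {bucket index: count} mapping."""
    return Counter(bucket_index(v, scale) for v in values)


# scale=0 gives base 2; a higher scale gives finer buckets.
coarse = exponential_histogram([3.0, 4.0, 5.0], scale=0)
fine = exponential_histogram([3.0, 4.0, 5.0], scale=1)
```

With `scale=0`, the values 3.0 and 4.0 share the `(2, 4]` bucket. Raising the scale narrows the buckets, so the scale (or the bucket count for explicit histograms) has to be known by the aggregate before it processes any rows.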