open-telemetry / oteps

OpenTelemetry Enhancement Proposals
https://opentelemetry.io
Apache License 2.0
326 stars 157 forks source link

Trace Payload Collection #219

Closed ronyis closed 1 year ago

ronyis commented 1 year ago

In this OTEP, we propose a way for extending trace functionality to support payload collection. We plan that this change will also enable collecting payload by instrumentations in the future.

This is an important capability in our vision, which could help many OpenTelemetry users to troubleshoot and debug applications much more effectively. We at Epsagon are also committed to pushing this through specifications and development.

We are glad to receive any feedback and to collaborate on this topic!

osherv commented 1 year ago

Related Issues: @pavolloffay #1062 #1284 #1317

@MovieStoreGuy open-telemetry/opentelemetry-specification#176

ahayworth commented 1 year ago

I am concerned that this OTEP significantly expands the scope of the OpenTelemetry project, and would introduce significant complexity in all areas of the ecosystem. And, as mentioned, there are likely ways that already existing portions could be pressed into service for this kind of functionality (even if it isn't a perfect match today).

For me specifically, I would be concerned that this would introduce an additional burden onto SDK maintainers - we already must be extraordinarily careful about the performance implications of the SDKs we produce, but should we accept this OTEP we now must retain that same diligence and care in the face of possibly very large payload values. That is not straightforward and simple, and would likely require redesigns of portions of the SDKs to be even more efficient than they already are. Efficiency is good, of course, but I don't know if this in particular should be the forcing function for such rewrites. 😄

Perhaps it is better to first have broader conversations about what OpenTelemetry is and where the boundaries of its scope lie?

reyang commented 1 year ago

I am concerned that this OTEP significantly expands the scope of the OpenTelemetry project, and would introduce significant complexity in all areas of the ecosystem.

+1 👍

reyang commented 1 year ago

I would suggest:

  1. clarify what type of the payload is intended to be in-scope, and what's not (e.g. would we anticipate the eventual drifting to some extreme situation such as "here is 1 gigabyte of video payload"?).
  2. explore how existing backends (e.g. Jaeger) would support it, and see if there is high interest from the community.
ronyis commented 1 year ago
  1. clarify what type of the payload is intended to be in-scope, and what's not (e.g. would we anticipate the eventual drifting to some extreme situation such as "here is 1 gigabyte of video payload"?).

I think that the scope should be "reasonable" sizes, that are performant enough to collect. Probably a few KBs per span. Definitely not "1 gigabyte of video payload" IMO :). And in the end, users could increase preconfigured limits as much as they like.

  1. explore how existing backends (e.g. Jaeger) would support it, and see if there is high interest from the community. I agree, but not so sure what's the right way to do that.

We can also provide a processor on the OTel Collector that will convert payload attributes into regular attributes as a workaround. Also, based on my experience, storing JSON payloads in Elasticsearch/OpenSearch is quite straightforward.

yurishkuro commented 1 year ago

I just took a second pass at the OTEP, and I see that @tigrannajaryan 's and my comments are largely not addressed. The proposal significantly expands the interface surface of OTEL, both in the API and in OTLP, and the justifications are not convincing, the authors admit that the same things could be achieved with the existing mechanisms and data structures. So why not explore that, maybe propose some semantic conventions.

Even more importantly, capturing payloads is a big data privacy red flag. The OTEP does not discuss this at all. From the implementation perspective, where to stash the data is a much easier problem to solve than how to capture the data in a privacy-respecting way. And while it is possible to decouple these two problems, the actual value of this OTEP would only materialize if it provides a framework for instrumentation to capture payloads, not just an extension of API/SDK on where to record it. And a framework for instrumentation would absolutely have to deal with privacy questions.

ronyis commented 1 year ago

Thanks @yurishkuro. I'm working on compiling the many comments and ideas into the OTEP (also I was offline for a while), representing different alternatives with pros and cons.

About the privacy issue - I focused on APIs in this OTEP, because I think it's the foremost problem that blocks users from collecting payloads with their own instrumentations (or for companies to develop such 3rd party libs). Generally, I think that a configuration based on allow-lists and/or deny-lists for collecting payload attributes should be sufficient. I'm not aware of other ways to deal with this problem and looking forward to hearing about it. I will reference that in the OTEP as well.

nirga commented 1 year ago

Hey @ronyis are you still working on this? If not, wanted to take and complete this as we need it as well :)

ronyis commented 1 year ago

Hey @ronyis are you still working on this? If not, wanted to take and complete this as we need it as well :)

Not actively working on it, feel free to jump in

ronyis commented 1 year ago

Closing this as the work is continued in #234

johncrim commented 4 days ago

@ronyis - I don't see this work at all in https://github.com/open-telemetry/opentelemetry-specification/pull/234 - I think you mean #234 .