Open jwafle opened 12 months ago
How does the publisher determine that a failed telemetry push is due to unidentified telemetry rather than some other reason (for example, a network failure)?
Hi @fatsheep9146, I just want to understand the failure mode you are describing. Are you asking how the collector might respond differently to a publisher depending on whether a telemetry push failed because of unidentified telemetry or for some other reason?
If so, the code for each data type's handle functions here seems to indicate that http.StatusBadRequest should be returned if the body of the request cannot be read or cannot be unmarshaled.
I would suggest adding a case for an optional configuration that requires telemetry data to contain identifiers. If the data is not identifiable (e.g. does not contain service.name or something of the like), a 400 Bad Request is returned to the publisher with an error message indicating that the required identifier field was not included (e.g. "service.name required but not included").
This solution would at least allow the publisher to differentiate reasons for the 400 Bad Request status between uninterpretable data (e.g. malformed JSON that cannot be unmarshaled, possibly due to corruption) versus interpretable data that is missing required identifiers (e.g. valid JSON with a resource that does not include a service.name attribute, which could be due to corruption or never being included).
Is your feature request related to a problem? Please describe.

OpenTelemetry collectors can be used to collect metrics from multiple applications at once. In use cases where you are collecting telemetry from services that you may not own, it can be very important to ensure that you are able to identify the services from which you are receiving telemetry signals. Currently, as far as I am aware, service.name appears to be the best required identifier of what service is sending telemetry to a given collector.

For many use cases, it makes sense to refuse to accept telemetry signals that do not have identifying information (e.g. service.name). Currently, the collector can be configured using the filterprocessor to drop telemetry that does not have a service.name. However, there is no way, as far as I am aware, to warn services that are publishing to the collector that their telemetry data is being dropped by the filterprocessor, or any other processor.

Describe the solution you'd like

I believe that there should be some optional method for configuring OTLP receivers to refuse to accept telemetry data with no identifying information. Essentially, my proposed solution looks something like:
where require_service_name is an optional configuration parameter of the HTTP and gRPC OTLP receivers. The expected behavior would be to return a status code indicating a bad request when this option is enabled for the given protocol and the incoming telemetry does not include service.name. This would prevent situations where unidentified telemetry signals enter the pipeline and it becomes very difficult to assess who sent them.

While I understand that SDKs are supposed to generate service.name, as it is a required field in the semantic conventions, I believe it should be possible to enable some validation of that requirement in the processor.

Describe alternatives you've considered

The
filterprocessor can achieve part of this functionality by simply dropping telemetry that is missing service.name, but this does nothing to tell the publisher(s) of said telemetry that there is an issue. Additionally, IP traffic could be used to attempt to figure out which machines are sending unidentified telemetry, but the proposed solution would be much more robust.

Additional context

This is my first time submitting an issue, so please excuse any mistakes I may have made in the process. Also, I would be very willing to submit a PR related to this issue if a supportive consensus is reached.
Additionally, as far as I am aware, there is no validation of any semantic convention required fields at the processor level. If there is and I am simply missing it, please let me know. I would like to troubleshoot