Open TylerHelmuth opened 1 year ago
Pinging code owners for receiver/k8sevents: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Pinging code owners for receiver/k8sobjects: @dmitryax @hvaghani221. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Comparing events collected by k8sobjectsreceiver and k8seventsreceiver the biggest gap we'll need to fill is that the k8sobjectsreceiver is pretty ungraceful with the way it formats the data. The k8sobjectsreceiver puts the json event into the body of the log whereas the k8seventsreceiver attempts to build a more wholistic otlp log based on the event. Specifically the k8seventsreceiver does this:
// k8sEventToLogRecord converts Kubernetes event to plog.LogRecordSlice and adds the resource attributes.
func k8sEventToLogData(logger *zap.Logger, ev *corev1.Event) plog.Logs {
ld := plog.NewLogs()
rl := ld.ResourceLogs().AppendEmpty()
sl := rl.ScopeLogs().AppendEmpty()
lr := sl.LogRecords().AppendEmpty()
resourceAttrs := rl.Resource().Attributes()
resourceAttrs.EnsureCapacity(totalResourceAttributes)
resourceAttrs.PutStr(semconv.AttributeK8SNodeName, ev.Source.Host)
// Attributes related to the object causing the event.
resourceAttrs.PutStr("k8s.object.kind", ev.InvolvedObject.Kind)
resourceAttrs.PutStr("k8s.object.name", ev.InvolvedObject.Name)
resourceAttrs.PutStr("k8s.object.uid", string(ev.InvolvedObject.UID))
resourceAttrs.PutStr("k8s.object.fieldpath", ev.InvolvedObject.FieldPath)
resourceAttrs.PutStr("k8s.object.api_version", ev.InvolvedObject.APIVersion)
resourceAttrs.PutStr("k8s.object.resource_version", ev.InvolvedObject.ResourceVersion)
lr.SetTimestamp(pcommon.NewTimestampFromTime(getEventTimestamp(ev)))
// The Message field contains description about the event,
// which is best suited for the "Body" of the LogRecordSlice.
lr.Body().SetStr(ev.Message)
// Set the "SeverityNumber" and "SeverityText" if a known type of
// severity is found.
if severityNumber, ok := severityMap[strings.ToLower(ev.Type)]; ok {
lr.SetSeverityNumber(severityNumber)
lr.SetSeverityText(ev.Type)
} else {
logger.Debug("unknown severity type", zap.String("type", ev.Type))
}
attrs := lr.Attributes()
attrs.EnsureCapacity(totalLogAttributes)
attrs.PutStr("k8s.event.reason", ev.Reason)
attrs.PutStr("k8s.event.action", ev.Action)
attrs.PutStr("k8s.event.start_time", ev.ObjectMeta.CreationTimestamp.String())
attrs.PutStr("k8s.event.name", ev.ObjectMeta.Name)
attrs.PutStr("k8s.event.uid", string(ev.ObjectMeta.UID))
attrs.PutStr(semconv.AttributeK8SNamespaceName, ev.InvolvedObject.Namespace)
// "Count" field of k8s event will be '0' in case it is
// not present in the collected event from k8s.
if ev.Count != 0 {
attrs.PutInt("k8s.event.count", int64(ev.Count))
}
return ld
}
To generalize the function, the k8seventsreceiver takes interesting parts of the log and intelligently adds them to the resource, body, attributes, and severity. We will need to update the k8sobjectsreceiver to be this intelligent as well.
k8sobjects receiver was designed to send raw k8s objects because the OTel logs/events Spec SIG came to a conclusion to use external schemas for this kind of 3rd-party events instead of defining OTel schemas and semantic conventions for any event coming from external sources. It was decided on one of the calls. That why the k8s objects receiver was introduced to replace k8s events receiver. If we want to change it, we should start with the spec.
Also, I heard complaints from the k8s cluster users about high cardinality attributes.
@dmitryax do you know if there was any OTEP or other documentation that solidified that schema decision?
Can resource attributes still be set such as k8s.pod.name
using the Event's Regarding
fields?
@dmitryax reviewing https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/event-api.md and https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/semantic_conventions/events.md I don't see anything stopping us from proposing meaningful attributes and resource attributes for kubernetes events like the k8seventsreceiver implements.
I stopped attending the Log Spec SIG calls, but the last time I was there, there was an agreement to use log body for the event body and define event.domain
and event.name
to identify the third-party schema. @scheler, can you please update us on how this changed since that?
@dmitryax with https://github.com/open-telemetry/opentelemetry-specification/issues/3681 close we can move forward with deciding which resource attributes the k8sobjectsreceiver should extract.
I propose we start with these resource attributes:
k8s.node.name <- object.reportingInstance
k8s.namespace.name <- object.regarding.namespace
k8s.{regarding.kind}.name <- object.regarding.name
k8s.{regarding.kind}.uid <- object.regarding.uid
Personally I would like to extract more, but I don't think there are any other preexisting k8s resource attributes for us extract.
Additionally we could consider using object.type
to set the log's severity.
@TylerHelmuth, the suggested list sounds reasonable to me. We can start with this 👍
k8s.node.name <- object.reportingInstance
k8s spec notes reportIngInstance as an ID of the reportingController which might not be same as the k8s node name. I think we should leave this one out.
k8s.namespace.name <- object.regarding.namespace k8s.{regarding.kind}.name <- object.regarding.name k8s.{regarding.kind}.uid <- object.regarding.uid
Events are available in 2 api groups, core/v1
and the newer events.k8s.io/v1
. The regarding
field is available only in events.k8s.io/v1
api and looking at the receiver, I think unless the user explicitly setsgroup = events.k8s.io/v1
in the object config, the receiver will query events in v1
version which has the required info in involvedObject
field. If we are adding new attrs, I think it makes sense for these to be available consistently for either api version.
@jinja2 great insight as always. I will update the PR to remove the node name and handle both event versions.
@TylerHelmuth Would you also change object.type
(Normal
or Warning
) to a severity ? And maybe add a label (k8s.event.type
).
Also I'm wondering if object.reason
may be added as a label (k8s.event.reason
).
I would love to move reason
to an attribute (and note
), but I think we need a defined semantic convention first.
I think doing the same logic as k8seventsreceiver to set severity number and severity text is reasonable. @dmitryax @jinja2 do you agree?
To clarify, the changes being proposed to event records in k8sobjects receiver are as below(?) -
The first 3 look reasonable to me. No. 4 though, I don't think it should be an attribute. The note field seems more appropriate in the body of logrecord. Additionally, is there a limit on the size of attribute values in the spec? I think I read it is 255 somewhere, but can't find it now. The note field in a k8s event can be upto 1k in length.
Can we introduce these enhancements behind a config flag in the k8sobjects receiver? Gives user the option to transform the events as they wish w/o having to understand what additions we are making, in case they don't want to follow our convention.
I agree that we should hash out the details of what gets extracted out of the events further. I'll take a look at the current implementation in k8sevents receiver sometime this week and think more on what is worth porting over into k8sobjects recv.
@jinja2 as much as I want 3 and 4, I think we should only stick with 1 and 2 for now. I think 3 and 4 need more concrete acceptance from the otel semantic convention or the k8s event schema, but neither recognize k8s.event.reason
or k8s.event.note
.
@jinja2 do you want to see the existing semantic convention conversions behind a config option as well?
@TylerHelmuth I think a flag for the existing conversion would be good so we don't interfere with any existing transformations users might have setup already. Wdyt?
Since we have ways in other components to turn off other automatic resource attributes it is a good idea. Eventually I'd like it to be turned on by default, but we can follow our breaking change process to get there.
@TylerHelmuth I missed the earlier conversation from this issue which states we are trying to extract attributes defined in k8s convention.
With this context in mind, these 2 extractions
k8s.{regarding.kind}.name <- object.regarding.name
k8s.{regarding.kind}.uid <- object.regarding.uid
will introduce some new attributes which are not mentioned explicitly in the convention. Jfyi, events can also be generated for custom resources by external controllers. So regarding.kind
can be a lot of different types (k8s native and custom resources).
For e.g., this is the list of object kinds with events from one of my test k8s cluster. The last kind is a custom resource maintained by an external AWS controller.
> kc get ev -A -o=json | jq '[.items[].involvedObject.kind] | unique '
[
"ConfigMap",
"CronJob",
"DaemonSet",
"Deployment",
"HorizontalPodAutoscaler",
"Job",
"Node",
"Pod",
"PodDisruptionBudget",
"ReplicaSet",
"Service",
"StatefulSet",
"TargetGroupBinding"
]
While I don't think we should have to explicitly list all the kinds in the convention, I do see value in adding a generic section to document the recommendation that telemetry for k8s object types not explicitly noted in the convention are recommended to have attributes k8s.{k8s.object.kind}.name
and k8s.{k8s.object.kind}.uid
where k8s.object.kind
is lowercase value of the k8s object's kind. This addition to convention might in turn warrant updates to other k8s receivers which might not be using the kind name for objects in attribute. One such discrepancy from the top of my head is between the k8scluster receiver using hpa
(k8s.hpa.uid
attr) instead of horizontalpodautoscaler
which is what we'll see in records from the k8sobjects receiver for hpa events.
I can start a different issue to discuss the convention updates if this sounds like something we'd like to look into further.
@jinja2 love it, we should definitely bring this up to the semantic convention SIG.
Opened https://github.com/open-telemetry/semantic-conventions/issues/430 to discuss the name
and uid
semantic conventions for k8s objects.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers
. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Component(s)
receiver/k8sevents, receiver/k8sobjects
Describe the issue you're reporting
With the introduction of the k8sobjectsreceiver the purpose of the k8seventsreceiver is now handle by a more generic component. I propose we deprecate the k8seventsreceiver in favor of the k8sobjectsreceiver.
Work to do: