Trace SDK: introduce span limits support

bio-aeon commented 2 months ago

Resolves #481

It's a kind of pre-PR. If anyone thinks that something could be better done differently, let's discuss. Also need to agree on how exactly to apply OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT - should it be some method of Attributes trait or something else.

iRevive commented 2 months ago

Hey, thanks for the contribution! The concept looks promising. I will take a thorough review this week.

iRevive commented 1 month ago

Also need to agree on how exactly to apply OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT - should it be some method of Attributes trait or something else.

Good question. I don't see a simple way to make Attributes respect the value limit.

We can truncation attribute values in the SdkSpanBackend and SdkSpanBuilder:

SdkSpanBackend#addAttributes
SdkSpanBackend#addEvent
SdkSpanBackend#addEvent

SdkSpanBuilder#addAttribute
SdkSpanBuilder#addAttributes
SdkSpanBuilder#addLink

Or, we can squeeze the truncation logic in the specialized implementation of the LimitedData:

object LimitedData {
  def attributes(sizeLimit: Int, valueLengthLimit: Int): LimitedData[Attribute[_], Attributes] = 
    AttributesLimitedData(sizeLimit, valueLengthLimit)

  private final case class AttributesLimitedData(...) extends LimitedData[Attribute[_], Attributes] {
    def append(a: Attribute[_]): LimitedData[Attribute[_], Attributes] = 
      if (elements.size < limit) {
        copy(elements = elements.append(limitValueLength(a)))
      } else {
        copy(dropped = dropped + 1)
      }

    private def limitValueLength(a: Attribute[_]) = 
      a.key.`type` match {
        case AttributeType.String =>
          Attribute(a.key.name, value.asInstanceOf[String].take(valueLengthLimit))
        case AttributeType.StringSeq =>
          Attribute(a.key.name, value.asInstanceOf[Seq[String]].take(valueLengthLimit))
        case AttributeType.BooleanSeq =>
          ...
        case _ => 
          a
     }
  }
}

iRevive commented 1 month ago

We should also use SpanLimitsAutoConfigure in the TracerProviderAutoConfigure:

for {
  ...
  processors <- configureProcessors(config, exporters)
+ spanLimits <- SpanLimitsAutoConfigure[F].configure(config)

  tracerProviderBuilder = {
    val builder = SdkTracerProvider
      .builder[F]
      ...
+     .withSpanLimits(spanLimits)

    processors.foldLeft(builder)(_.addSpanProcessor(_))
  }

  ...
} yield tracerProvider

That way, the TracerProvider will use limits configured via system properties or env variables (or the default ones).

bio-aeon commented 1 month ago

Or, we can squeeze the truncation logic in the specialized implementation of the LimitedData

On the one hand, yes, it would be nice to avoid calling truncation method in multiple places. On the other hand, the logic of LimitedData methods is now generalized using Semigroup, and it would be nice to find a way not to rewrite these methods completely only for Attributes (you can argue at this point). I'll try to think this week a bit.

We should also use SpanLimitsAutoConfigure in the TracerProviderAutoConfigure

Done.

NthPortal commented 1 month ago

my only concern here is that, because Attributes is unordered, this will end up dropping required attributes in favour of optional attributes

bio-aeon commented 1 month ago

@iRevive I have added attribute values truncation inside of LimitedData similar to what you mentioned above, but keeping generalized LimitedData methods implementation. If this is ok to you, then it seems the only thing left to decide is what to do with attributes sorting.

iRevive commented 1 month ago

my only concern here is that, because Attributes is unordered, this will end up dropping required attributes in favour of optional attributes

It's a valid concern indeed. I will address this in the follow-up PR.

iRevive commented 1 month ago

Hm, the specification doesn't mandate that the order must be preserved.

Here is a breakdown per implementation: 1) Javascript - Attributes is a key-value object. Order is preserved 2) Python - Attributes is a dictionary. Order is preserved 3) Java - ArrayBasedAttributes - a user-facing implementation, i.e. when you use JAttributes.empty or JAttributes.builder - attributes are sorted by key. AttributesMap - attributes are stored in the HashMap, the order may vary.

This weekend, I've experimented with different structures: Vector, ListMap, VectorMap, etc. Generally, other structures tend to be slower than the Map.

I would keep things as-is for now.

NthPortal commented 1 month ago

AttributesMap is also not spec-compliant, because it's mutable (and possibly other reasons as well? I don't recall)

iRevive commented 1 month ago

AttributesMap is also not spec-compliant, because it's mutable (and possibly other reasons as well? I don't recall)

Yep. I see that Otel java uses AttributesMap in the SdkSpan and SdkSpanBuilder. Since you cannot access current attributes (e.g. span.getAttributes and builder.getAttributes) they are fine with that.

And user-facing implementation (ArrayBasedAttributes) is immutable.

typelevel / otel4s

Trace SDK: introduce span limits support #719