open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0
3.76k stars 890 forks source link

Clarify spans export concurrency #4205

Closed pellared closed 2 months ago

pellared commented 2 months ago

Same as https://github.com/open-telemetry/opentelemetry-specification/pull/4173 for trace SDK.

Fixes https://github.com/open-telemetry/opentelemetry-specification/issues/4134

carlosalberto commented 2 months ago

Ping @tsloughter

tsloughter commented 2 months ago

Before my PR https://github.com/open-telemetry/opentelemetry-specification/pull/2452 I was originally pushing for a change to allow Export() to be called concurrently. I wish I could remember who it was that most convinced me so I could ping them. I believe it was the simplicity that sold me

If we make a change that Export can be called concurrently or make the current BSP legacy or add a new ConcurrentBSP it means exporters written with this guarantee in mind (example: writing to a raw file) will now fail in confusing ways until they are updated to support concurrent calls. I don't think keeping the current processor and having another that calls concurrently is great either since now the user has to ensure they use the right one.

Implementations exist that support concurrency from the BatchProcessor based on the changes we did before:

I know this PR was only about limiting it to the builtin processors and not changing the builtin processors but I think important to keep this standard for all processors, even if it can't be guaranteed by the API design by everyone to not add this confusion to the user. Originally I thought it was just clarifying it for all processors but had misread :).

pellared commented 2 months ago

@tsloughter, I think that this PR still keeps the statements that discourages from having concurrent-safe exporters. It mostly changes "will" into "should" as e.g. not all languages are able to enforce such constraint. It adds normative requirements to built-in processors as this is a place where all languages have control. Having more normative statements in places where they can be added helps in reviewing/auditing the SDK implementations. As you can see, I do not want to make further changes in this PR which may be controversial. I think the proposed changes are mostly editorial+clarification.

tsloughter commented 2 months ago

@pellared right, that is how I initially read it but I think the should instead of must for replacing will changes the meaning too much. A should on this basically makes it a must support concurrency since you don't actually know. Even if today you know all processors in your language won't call it concurrently that is no guarantee that an upgrade won't change that in a third party processor or a new processor added to the spec that the user could enable instead of Simple or Batch.

Having it be a must you can at least point to the spec if it breaks and say, 'this processor isn't spec compliant', when an exporter doesn't function properly :).

I mainly want to make sure my concerns are clear but not actually hold this up if it has enough support. I think it prevents merging if there are unresolved conversations? If no one shares my concerns, or thinks things need to be done in a new PR after this one is merged, I can resolve them to not block on just my concern.

pellared commented 2 months ago

@carlosalberto, I think we have a consensus and this PR can be merged. More refining PRs are welcome but let's avoid scope creep in this PR.

carlosalberto commented 2 months ago

Will merge by EOD today as, yes, we simply clarify the existing behavior of simple/batch processors.

Btw @jmacd @bogdandrutu as you are interested in concurrent capabilities, maybe we can think of expanding this for future processors. Maybe create an issue to track that? Otherwise, wait for it to happen, etc.

cijothomas commented 2 months ago

Will merge by EOD today as, yes, we simply clarify the existing behavior of simple/batch processors.

Btw @jmacd @bogdandrutu as you are interested in concurrent capabilities, maybe we can think of expanding this for future processors. Maybe create an issue to track that? Otherwise, wait for it to happen, etc.

Opened this for tracking https://github.com/open-telemetry/opentelemetry-specification/issues/4231