Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.

[BUG] The OpenTelemetry INTERNAL span reported for BlobClientBase.exists() is marked as an error and attaches an exception stack trace #42452

Open trask opened 1 month ago

trask commented 1 month ago

Calling BlobClientBase.exists() produces two spans (which is expected):

an INTERNAL span:

SpanData{spanContext=ImmutableSpanContext{traceId=92c3d060568c79c1fafbc818297ea4af, spanId=b98cca7f7c875f09, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, resource=Resource{schemaUrl=null, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.42.1"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=azure-storage-blob, version=12.28.0, schemaUrl=https://opentelemetry.io/schemas/1.17.0, attributes={}}, name=AzureBlobStorageBlob.getPropertiesNoCustomHeaders, kind=INTERNAL, startEpochNanos=1729222313878000000, endEpochNanos=1729222319944232700, attributes=AttributesMap{data={thread.id=1, thread.name=main, az.namespace=Microsoft.Storage}, capacity=128, totalAddedValues=3}, totalAttributeCount=3, events=[ImmutableExceptionEventData{epochNanos=1729222319942186100, exception=com.azure.storage.blob.implementation.models.BlobStorageExceptionInternal: Status code 404, ContainerNotFound, additionalAttributes={}, spanLimits=SpanLimitsValue{maxNumberOfAttributes=128, maxNumberOfEvents=128, maxNumberOfLinks=128, maxNumberOfAttributesPerEvent=128, maxNumberOfAttributesPerLink=128, maxAttributeValueLength=2147483647}}], totalRecordedEvents=1, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=ERROR, description=}, hasEnded=true}

and a nested CLIENT span:

SpanData{spanContext=ImmutableSpanContext{traceId=92c3d060568c79c1fafbc818297ea4af, spanId=0bc936d658ee0899, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=92c3d060568c79c1fafbc818297ea4af, spanId=b98cca7f7c875f09, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, resource=Resource{schemaUrl=null, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.42.1"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=azure-storage-blob, version=12.28.0, schemaUrl=https://opentelemetry.io/schemas/1.17.0, attributes={}}, name=HEAD, kind=CLIENT, startEpochNanos=1729222313971242900, endEpochNanos=1729222316895604300, attributes=AttributesMap{data={applicationinsights.internal.operation_name=AzureBlobStorageBlob.getPropertiesNoCustomHeaders, http.request.resend_count=1, http.url=https://trasktest.blob.core.windows.net/test/test, thread.id=1, az.client_request_id=98c0ae84-0dd9-457d-99fa-fb31220c71aa, az.service_request_id=f0b0a3d5-001e-0018-6a0e-2176da000000, server.port=443, http.method=HEAD, thread.name=main, server.address=trasktest.blob.core.windows.net, http.status_code=404, az.namespace=Microsoft.Storage}, capacity=128, totalAddedValues=12}, totalAttributeCount=12, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=ERROR, description=404}, hasEnded=true}

It isn't surprising that the CLIENT span has status ERROR, since it's probably captured by lower-level HTTP instrumentation, which doesn't know that a 404 is an expected response code for this operation.

What's surprising is that the INTERNAL span also has status ERROR and attaches an (often large) exception stack trace, even though the call to BlobClientBase.exists() doesn't throw an exception and simply returns false when the blob is not found.

repro at https://github.com/trask/azure-blob-storage-test

cc @lmolkova @jeanbisutti @heyams @harsimar

github-actions[bot] commented 1 month ago

@ibrahimrabab @ibrandes @kyleknap @seanmcc-msft

github-actions[bot] commented 1 month ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

alzimmermsft commented 1 month ago

Switching ownership to Core as Storage doesn't do anything special with span creation.

exists(), and the equivalent speculative APIs, have client-side handling to return a friendlier result when possible. For example, Storage Blob doesn't have a dedicated API for checking whether a blob (or container) exists, so exists() attempts getProperties and catches a 404 response, returning false in that case. So from the REST API's perspective the call failed, but from the application's perspective it didn't. This may need to be dug into, since tracing and runtime are reporting different results.
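
To make the mismatch concrete, here is a rough, self-contained sketch of the pattern described above. It is not the SDK's actual code: `FakeStorageException`, `FakeSpan`, `getPropertiesRest`, and `getPropertiesTraced` are hypothetical stand-ins, and the assumption is that the span gets marked as failed around the REST-level call, before exists() catches the 404 and converts it to false.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained model of the behaviour discussed above. Every type here is a
// hypothetical stand-in; none of this is azure-sdk-for-java or OpenTelemetry code.
class ExistsTracingSketch {

    /** Stand-in for the exception raised when the service answers with an error status. */
    static class FakeStorageException extends RuntimeException {
        final int statusCode;

        FakeStorageException(int statusCode, String message) {
            super(message);
            this.statusCode = statusCode;
        }
    }

    /** Stand-in for a span: records only a status and any exception events. */
    static class FakeSpan {
        final String name;
        String status = "UNSET";
        final List<Throwable> exceptionEvents = new ArrayList<>();

        FakeSpan(String name) {
            this.name = name;
        }
    }

    /** Stand-in for the REST-level call; in this sketch the service always answers 404. */
    static void getPropertiesRest() {
        throw new FakeStorageException(404, "Status code 404, ContainerNotFound");
    }

    /** Traced wrapper: any exception from the REST call marks the span as failed, then rethrows. */
    static void getPropertiesTraced(FakeSpan span) {
        try {
            getPropertiesRest();
        } catch (RuntimeException e) {
            span.exceptionEvents.add(e);
            span.status = "ERROR";
            throw e;
        }
    }

    /** Convenience method: turns a 404 into false, but only after the span was already marked. */
    static boolean exists(FakeSpan span) {
        try {
            getPropertiesTraced(span);
            return true;
        } catch (FakeStorageException e) {
            if (e.statusCode == 404) {
                return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        FakeSpan span = new FakeSpan("getPropertiesNoCustomHeaders");
        boolean found = exists(span);
        System.out.println("exists() returned: " + found);
        System.out.println("span status: " + span.status);
        System.out.println("exception events: " + span.exceptionEvents.size());
    }
}
```

In this model the caller sees `false` with no exception, while the span ends up with status ERROR and one recorded exception event, which matches the shape of the INTERNAL span in the report.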