Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License
742 stars 495 forks source link

Distributed Tracing/Open Telemetry: Feature requests #4553

Open sourabh1007 opened 5 months ago

sourabh1007 commented 5 months ago

Query sanitization (https://github.com/Azure/azure-cosmos-dotnet-v3/pull/4664)

  1. Parameterized queries shouldn't be sanitized (because the user self-sanitizes through parameters), and collecting parameter values will be an opt in config
  2. For all other queries, text should only be captured if it's sanitized. Literal values should be replaced by ?

    Batch ( https://github.com/Azure/azure-cosmos-dotnet-v3/pull/4622 )

  3. New attribute to capture batch size
  4. Operation name should be prepended with BATCH for homogenous operations, or be just BATCH for heterogenous operations. If the DB system has a native name that makes sense we can use that (ex. our API names for ReadMany, TransactionalBatch etc.)

    Bulk

  5. Operation name should be prepended with Bulk. : Covered as part of https://github.com/Azure/azure-cosmos-dotnet-v3/pull/4643
  6. Verify bulk user experience. : Done in Scrum Call

Span name ( https://github.com/Azure/azure-cosmos-dotnet-v3/issues/4553#issuecomment-2264312696)

  1. Change span name to <operation name> <target> where target is the collection name if available
  2. This removes the <namespace> or account name in our case from the span name

Operation Name db.operation.name (https://github.com/Azure/azure-cosmos-dotnet-v3/pull/4643)

Batch/bulk operations:

Change feed operations:

Conflicts operations:

Container operations:

Database operations:

Encryption key operations:

Item operations:

Permission operations:

Stored procedure operations:

Throughput operations:

Trigger operations:

User operations:

User-defined function operations:

Other changes, as mentioned in https://github.com/Azure/azure-cosmos-dotnet-v3/issues/4553#issuecomment-2231991812

(https://github.com/Azure/azure-cosmos-dotnet-v3/pull/4765)

db.cosmosdb.operation_type --> db.cosmosdb.operation.type (or we can get rid of this attribute entirely since we have clear operation names now, not sure this still adds any value) db.cosmosdb.client_id --> db.cosmosdb.client.id (aligns with OTel messaging conventions) db.cosmosdb.request_content_length --> db.cosmosdb.request.content_length (aligns with OTel HTTP conventions) db.cosmosdb.request_charge, no change for this one since "request charge" is one thing rather than charge being an attribute of request db.cosmosdb.consistency_level

Inprogress

a) add db.cosmosdb.regions_contacted b) change casing for direct and gateway mode c) add db.cosmosdb.row_count

Versioning

database : Show only stable and new experimental attributes database/dup : Show both stable and old and new experimental attributes (it means attributes can be duplicated here, if an attribute name was changed from db.cosmos.xx -> db.cosmos.yy, then both the attributes will be emitted) default: Show only old experimental(or stable) => old will target particular version in case of cosmos db, it is semantic-conventions/docs/database/cosmosdb.md at v1.25.0 · open-telemetry/semantic-conventions (github.com)

Few open requests for reference (might or might not be relevant):

  1. https://github.com/Azure/azure-cosmos-dotnet-v3/issues/4375
  2. https://github.com/Azure/azure-cosmos-dotnet-v3/issues/4095
  3. https://github.com/Azure/azure-cosmos-dotnet-v3/issues/3317
jcocchi commented 4 months ago

Sanitization reference: sanitization of db.query.text

Batch reference: Guidance throughout multiple areas of OTel database semantic convention common attributes

Span name reference: name

jcocchi commented 4 months ago

Adding a few more requirements for attribute changes from the new OTel specification Full list found in the database specification.

Attributes to rename

New attributes from spec not in our current implementation

Existing attributes to add to spec Is there a reason we left these off the spec? Once we have a GA release it’s probably good to document them.

CC: @sourabh1007

sourabh1007 commented 4 months ago

@jcocchi

server.port => is it port number used for RNTBD call? and for http it will be always 443? user_agent.original => it should be there https://github.com/Azure/azure-cosmos-dotnet-v3/blob/016e19f6de02742c87a75e4d95e816b6d277034e/Microsoft.Azure.Cosmos/src/Telemetry/OpenTelemetry/OpenTelemetryCoreRecorder.cs#L153

Regarding other attributes, we decided to have these attributes but we left them for otel specs, because these were very specific to cosmos db.

jcocchi commented 4 months ago

We can leave off server.port if it isn't useful for us, I just wanted to call out it's in the spec but missing from our implementation.

user_agent.original didn't show up in the list of attributes in my App Insights traces when I ran my sample app yesterday, are you sure it's still populating correctly? Maybe App Insights just doesn't show it in the UI?

Once we GA I think it's okay to add all the attributes to the spec provided we don't plan to change them.

sourabh1007 commented 3 months ago

Span name is already <operationname><space><collectionname>

image

Code ref. https://github.com/Azure/azure-cosmos-dotnet-v3/blob/aba2b2ce4c81914426484b870d88a76bb7413bbe/Microsoft.Azure.Cosmos/src/Telemetry/OpenTelemetry/OpenTelemetryRecorderFactory.cs#L49

sourabh1007 commented 3 months ago

What do you think about changing, db.cosmosdb.operation_type to db.operation.type since we have db.operation.name and db.operation.batch.size?