open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

Add DynamoDB attributes that uniquely identify sanitized queries/writes, similar to db.statement #303

Open swar8080 opened 10 months ago

swar8080 commented 10 months ago

Suggestion overview

When an application runs the same CRUD statements with different inputs, it's helpful to aggregate statistics at the statement level. This is possible in most databases with the db.statement attribute, which can have values like SELECT foo FROM bar WHERE baz=? LIMIT ? and UPDATE foo SET bar=?. This attribute allows quickly answering questions like:

Those questions take more work to answer with the conventional DynamoDB attributes, since it's not possible to map a span/metric to a specific DynamoDB statement that an application makes.
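To illustrate the kind of statement-level aggregation that db.statement makes possible, here is a minimal sketch. It assumes spans have already been exported as plain dicts; the `duration_ms` field and the `stats_by_statement` helper are hypothetical names used only for illustration, not part of any OpenTelemetry API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical exported spans; the field names are assumptions for illustration.
spans = [
    {"db.statement": "SELECT foo FROM bar WHERE baz=? LIMIT ?", "duration_ms": 12.0},
    {"db.statement": "SELECT foo FROM bar WHERE baz=? LIMIT ?", "duration_ms": 18.0},
    {"db.statement": "UPDATE foo SET bar=?", "duration_ms": 40.0},
]


def stats_by_statement(spans):
    """Group span durations by sanitized statement and summarize each group."""
    durations = defaultdict(list)
    for span in spans:
        durations[span["db.statement"]].append(span["duration_ms"])
    return {
        statement: {
            "count": len(values),
            "mean_ms": mean(values),
            "max_ms": max(values),
        }
        for statement, values in durations.items()
    }


summary = stats_by_statement(spans)
```

Because the inputs are replaced with ?, both SELECT spans fall into the same group even if they were run with different values.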

DynamoDB doesn't have a query language like SQL, but there are of course some similarities, like:

The suggestion is a new span and/or metric attribute that is a SQL-like representation of a DynamoDB Query, UpdateItem, PutItem, or DeleteItem request. The attribute's format would need to be intuitive to read and include sanitized information about all request parameters that change the statement's outcome.
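As a rough sketch of what such a mapping could look like for the Query API, the snippet below turns a request dict (using the DynamoDB API's parameter names) into a sanitized string. The output format, including the QUERY, FILTER, and ORDER keywords, and the `query_to_statement` and `sanitize_expression` helpers are assumptions made up for illustration, not a proposed convention. Attribute name placeholders such as #name are left untouched in this sketch.

```python
import re


def sanitize_expression(expression):
    """Replace value placeholders such as :v1 with ? so inputs never appear."""
    return re.sub(r":\w+", "?", expression)


def query_to_statement(request):
    """Build a hypothetical SQL-like string from a DynamoDB Query request dict."""
    source = request["TableName"]
    if "IndexName" in request:
        source += "." + request["IndexName"]
    parts = ["QUERY " + source]
    if "KeyConditionExpression" in request:
        parts.append("WHERE " + sanitize_expression(request["KeyConditionExpression"]))
    if "FilterExpression" in request:
        parts.append("FILTER " + sanitize_expression(request["FilterExpression"]))
    # ScanIndexForward defaults to true (ascending order) in the DynamoDB API.
    parts.append("ORDER " + ("ASC" if request.get("ScanIndexForward", True) else "DESC"))
    if "Limit" in request:
        # The limit value is an input, so it is sanitized as well.
        parts.append("LIMIT ?")
    return " ".join(parts)


# Hypothetical request; table and attribute names are made up.
request = {
    "TableName": "Orders",
    "KeyConditionExpression": "customerId = :cid AND orderDate > :since",
    "ExpressionAttributeValues": {":cid": {"S": "c-123"}, ":since": {"S": "2024-01-01"}},
    "ScanIndexForward": False,
    "Limit": 25,
}
statement = query_to_statement(request)
# statement == "QUERY Orders WHERE customerId = ? AND orderDate > ? ORDER DESC LIMIT ?"
```

Two requests that differ only in their ExpressionAttributeValues or Limit value produce the same string, which is what allows grouping by statement.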

Examples

Query API:

UpdateItem API:

DeleteItem API:

PutItem API:

Contributing

Let me know if this makes sense as a convention, if it's better off as a custom/third-party library, or if you have suggestions for a different approach.

I'd be interested in proposing more exact mappings of API requests to attribute values if that would be useful.

trask commented 2 months ago

hi @swar8080!

I think this would indeed be useful:

> I'd be interested in proposing more exact mappings of API requests to attribute values if that would be useful

btw, we are working toward database semantic convention stability, and many things have changed recently, so it's worth revisiting the latest changes to the semconv.

trask commented 2 months ago

@jcocchi would be interesting to get your perspective given your work on CosmosDB semantic conventions

jcocchi commented 2 months ago

@swar8080, @trask - I agree adding context about the operation to the span is useful!

For Cosmos DB we capture the API name in the db.operation.name attribute rather than converting it to a SQL syntax. The table name (called a container for us) is in db.collection.name. For queries specifically, we also have a SQL-like query syntax which we plan to add to the db.query.text attribute (we don't currently populate this). This would include SQL representations of your KeyConditionExpression, ScanIndexForward, and Limit examples. We don't plan to create SQL syntax for non-query operations.

For non-queries, there may be other attributes to capture for a given request, similar to your ConditionalOperator examples. In Cosmos DB, for example, we'd be interested in capturing the PartitionKey and Id of the item. It makes sense to have custom attributes per implementation for these database-specific attributes.
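A minimal sketch of how that split could look if applied to a DynamoDB request, assuming attributes are collected in a plain dict before being set on a span. The `span_attributes` helper is hypothetical, and whether DynamoDB instrumentation should populate db.query.text this way is exactly the open question here, so treat this as an illustration rather than existing behavior.

```python
def span_attributes(operation, request, query_text=None):
    """Collect attributes for one request: API name, table name, optional query text."""
    attributes = {
        "db.operation.name": operation,              # API name, e.g. "Query" or "PutItem"
        "db.collection.name": request["TableName"],  # table (container in Cosmos DB terms)
    }
    # Only queries get a SQL-like text; non-query operations would rely on
    # separate, database-specific attributes instead.
    if operation == "Query" and query_text is not None:
        attributes["db.query.text"] = query_text
    return attributes


# Hypothetical requests; the table name and query text are made up.
query_attrs = span_attributes(
    "Query",
    {"TableName": "Orders"},
    query_text="QUERY Orders WHERE customerId = ? LIMIT ?",
)
put_attrs = span_attributes("PutItem", {"TableName": "Orders"})
```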