Open dnaik-sfx opened 5 months ago
related to #512
@trask Any objections to me putting up a PR for this, or is this being actively discussed by the DB Client working group? I did see there was some discussion around common DB attributes, which I imagine might be relevant for these metrics as well (i.e, do we create common query metrics conventions like the examples I gave, or database-specific query metrics).
@dnaik-sfx does #735 address your needs?
We'd like to be able to collect database metrics like duration, count, rows returned, etc... at the query/statement level. That PR does add a new metric on duration, but it seems that db.statement
is being left out of the initial set of attributes; to get the duration information for a given query, that attribute would be needed, but the metric itself would likely suit our needs if that attribute was added. We would also want to add the other metrics mentioned above and some others though (these would also need db.statement
or a similar attribute).
I realize there are concerns about the high cardinality when adding db.statement
as an attribute on metrics, but I was thinking that the metrics themselves would either be optional (if there is no value in collecting them without tying them to a query/statement), or the db.statement
attribute itself would be optional.
@trask Sorry to bother you again, but is this (or related topics) being, or planned to be, discussed by the working group? And if so, is there somewhere I can go to follow the discussion (if not here)? Also happy to put up a PR for this myself and help drive it if this would be beyond the scope of what the working group was intended to accomplish.
hey @dnaik-sfx, sorry for the slow response, this issue is on the working group's project board, we are planning to discuss it prior to stabilization, but we may push it post stabilization if we think it can be added after stabilization without introducing any breaking changes
a couple thoughts/questions in the meantime...
would you still want these two:
db.query.count
db.query.duration
if db.statement
is an optional attribute on db.client.operation.duration? (since it's a histogram, you can get counts)
I could see db.query.rows_returned
being added, e.g. db.client.operation.rows_returned
, it's similar to http.client.response.body.size
for db.query.plan
, where would you get the query plan? is this something you're thinking to collect in client instrumentation or some other way?
Hi @trask , thanks for responding. Since we just learned there's an upcoming change to add db.client.operation.
metrics, the more important question would be "Are db.query.
and db.client.operation.
the same thing?" or more precisely, "Is db.query
a subset of db.client.operation.
?".
If the answer is yes, we could be adding more metrics in the db.client.operation.
namespace instead of creating another. If not, replacing db.query.duration
with db.client.operation.duration
wouldn't make sense and make the metric set incomplete.
I feel pretty confident that there's no reason why we cannot go with db.client.operation.
metrics but let us go through the PR and the proposed metric. Also please feel free to let us know what you think.
@seonsfx db.client.operation.duration
includes all database operations, including queries
Moving this to post-stability since we can add new metrics later, but we can continue discussing and clarifying the issue in the meantime.
@trask thanks! What does the current timeline look like for stability? Wondering what the timeframe might look like for these query-level metric conventions to go in.
Given the db.client.operation
convention, I think the following would be the full list of metric conventions we would want to add (and eventually collect):
db.client.operation.rows_returned
db.client.operation.rows_written
db.client.operation.rows_scanned
db.client.operation.read_requests
(like physical read requests in Oracle)db.client.operation.write_requests
db.client.operation.read_bytes
db.client.operation.write_bytes
db.client.operation.shared_read_blocks
(aka the MySQL concept of shared vs local read blocks)db.client.operation.shared_blocks_written
db.client.operation.local_read_blocks
db.client.operation.local_blocks_written
db.client.operation.pages_scanned
db.client.operation.pages_written
db.client.operation.granted_memory
db.client.operation.used_memory
db.client.operation.runtime_memory
db.client.operation.sharable_memory
db.client.operation.wait_time
db.client.operation.cpu_time
db.client.operation.hit_percent
(like a cache hit percent)db.client.operation.plan
Some of these are kind of specific to a given type of database, but it could still be worth capturing them as a shared convention in case smaller or future technologies share the underlying concepts. Open to feedback about where these conventions should be added though. As far as collection goes, we envision typically collecting them from system tables depending on the database implementation. For example, MySQL has the Performance Schema with tables providing statistics about queries that ran in the past.
I'm happy to put up a PR for these metrics, as well as the attributes that would be desired on them, to discuss specifics and in further depth.
@trask I know work towards stability is still ongoing, but any thoughts about the metrics and collection method I listed above and any possible conflicts with patterns/metrics being added now?
Put up a PR for the sake of discussion (with the understanding that work towards stability is still underway and these additional metrics would come after)
Currently, there are some conventions for connection-level metrics defined on the Database Metrics page. Proposing that query-level metrics are added (in a different section on the same page) to define conventions for the stats and info that can be captured for individual queries.
Some examples could include:
db.query.count
(number of times a query was executed)db.query.plan
(execution plan for a query)db.query.duration
(amount of time taken by the query)db.query.rows_returned
(number of rows returned by a query)