sfackler / rust-postgres

Native PostgreSQL driver for the Rust programming language
Apache License 2.0
3.43k stars 436 forks source link

Tracing support #1053

Open Protryon opened 1 year ago

Protryon commented 1 year ago

This PR:

Example trace overview on real application (vaultwarden):

direct

Breaking down to find a slow query:

direct

Note that this PR does not add distributed tracing support into the database, as it is (AFAIK) experimental and not well supported -- it would also add some extra dependencies.

Fixes #642

sfackler commented 1 year ago

What are some examples of other database drivers that have adopted these logging conventions?

Protryon commented 1 year ago

@sfackler I did some digging, here is a list at the OTEL registry.

A specific example in code in the first item of using the OTEL semantic conventions, go-pg.

titanous commented 1 year ago

This looks fantastic! I was actually just looking at implementing this exact same thing, and you've done it better than I would have. I will apply this patch and try it out in production for an app that I work on.

titanous commented 1 year ago

This works great, I ended up making two changes:

  1. I set all of the spans to INFO because that is the level that we send to Honeycomb. I think at the very least prepare should be bumped to DEBUG, as trace is almost always too noisy to be useful and prepare spans are useful to me (I try to have a span for every external network call).
  2. I added a span for connect, which delivered value as soon as I deployed it. Feel free to pull in this commit and/or modify it. I have already found the sub-span timings that I added to be useful as well. https://github.com/titanous/rust-postgres/commit/7e2e3a2210dedda7a0f70ca9069d3ca1f57901cb
Protryon commented 1 year ago

This works great, I ended up making two changes:

  1. I set all of the spans to INFO because that is the level that we send to Honeycomb. I think at the very least prepare should be bumped to DEBUG, as trace is almost always too noisy to be useful and prepare spans are useful to me (I try to have a span for every external network call).
  2. I added a span for connect, which delivered value as soon as I deployed it. Feel free to pull in this commit and/or modify it. I have already found the sub-span timings that I added to be useful as well. titanous@7e2e3a2

I merged in that commit and bumped all the trace levels as well. Thanks!

sfackler commented 1 year ago

From what I can tell most or all of the libraries at that OTEL reference are external libraires that you can use with a database driver to apply tracing conventions.

I am hesitant to land this for a couple of reasons:

  1. People can be using this library as a tracing appender, and this change can potentially cause some undesired infinite logging recursion. cf https://github.com/sfackler/rust-postgres/issues/215
  2. This seems like a convention that your particular organization wants to use, e.g. "I set all of the spans to INFO because that is the level that we send to Honeycomb", not something that is uniformly applicable.

I think you'd probably be better served by following what those other libraries do and making a separate library that can wrap tokio-postgres and apply the appropriate tracing logic.

titanous commented 1 year ago

I think a separate library would be fine, though it seems like a hook system would be necessary to ergonomically log everything we have in this PR. For example, there is currently no good way for a separate library to properly trace connection attempts with all of the relevant metadata, especially with timings for each phase.

sfackler commented 1 year ago

That approach seems like it could work in principle - I'd be interested in what the hook trait would look like.