Closed jackc closed 2 years ago
Hello @jackc, thanks for such a feature. That will help a lot.
Is anyone working on this? I'm considering switching to the pgx interface to take advantage of the binary protocol, but if it stops us from integrating with e.g. DataDog in the future (https://github.com/DataDog/dd-trace-go/issues/697) that could be a problem.
@jackc any progress on this?
I have not heard of anyone working on this.
I'm wondering how practical it would be to provide some sort of hooks or middleware chains which could be triggered before or after an Exec is called on the underlying PgConn. Similar to the BeforeAcquire functionality in a Pool. These hooks could just receive a context which is used for span propagation. WDYT @jackc ?
> I'm wondering how practical it would be to provide some sort of hooks or middleware chains which could be triggered before or after an Exec is called on the underlying PgConn. Similar to the BeforeAcquire functionality in a Pool. These hooks could just receive a context which is used for span propagation. WDYT @jackc ?
I like the idea of hooks, but there would need to be a lot more than before / after pgconn.Exec. There's also pgconn.ExecPrepared and pgconn.ExecParams. Then those methods only send the queries. They don't read the results. There would need to be something after the results are received. And maybe when the first result comes back. Also, there are batch requests.
I could see something like:
```go
type BeforeQueryHook interface {
	BeforeQuery(...)
}
```
Every hook point could have its own interface. The logger or tracer could implement as many or as few hooks as it wanted to. This would allow adding or even removing hooks in a backwards compatible way.
A couple more questions, since I'm not very familiar with the inner workings of pgx:

- Would there be interfaces for every command like BeforeQueryHook, AfterQueryHook, BeforeQueryRowHook, and so on?
- I assume those hooks would need to be present for Conns, Pools and Txs?

I could try looking into a PR (if you want) but would probably need some guidance along the way.
> Would there be interfaces for every command like BeforeQueryHook, AfterQueryHook, BeforeQueryRowHook, and so on?
It would be a separate interface for each hook. But I don't think there would need to be a separate one for each method. Query, QueryRow, and QueryFunc at least could all use the same hooks. Maybe Exec as well. My guess would be that hooks for send query, first response received, and query complete would be applicable to all queries.
> I assume those hooks would need to be present for Conns, Pools and Txs?
I don't think so. Txs and pools ultimately call methods on Conn, so *pgx.Conn is probably the only thing that would need to be modified. Potentially pgxpool would have a different set of hooks to track acquire time.
Well, I guess hooks would be the best approach, but wrapping the driver is a simpler one, kind of similar to the work they did on https://github.com/uptrace/opentelemetry-go-extra/tree/main/otelsql.
Tracing does work if you use stdlib.OpenDB, but I guess everyone would benefit from a native pgx.Conn wrapper instead (for advanced pgx functionality such as CopyFrom).
I do think this should not be done as part of this repo, as it would be an extension (wrapper) on top of the pgx driver. I know this means it's more "brittle", but it would make things simpler to maintain (as @jackc's goals are different from "tracing").
As for the tracing engine, I believe the community should focus on OpenTelemetry (and not OC/DD/Jaeger etc.), as it's the de-facto standard for tracing.
As this is a major blocker for my team and me, I will try to find the time and develop this myself.
(We use sqlc and otelsql / OpenTelemetry with the old lib/pq driver.)
My approach would be wrapping pgx.Conn, kind of like what otelsql did.
For everyone else interested in my current setup (sqlx, pgx driver [stdlib] and otelsql):
```go
func (cfg *PostgresConfig) String() string {
	return fmt.Sprintf("host=%s port=%s user=%s password=%s dbname=%s sslmode=disable",
		cfg.Host, cfg.Port, cfg.User, cfg.Password, cfg.DB)
}

func NewPostgres(ctx context.Context, pgcfg ...PostgresConfig) *sqlx.DB {
	cfg := newPostgresConfigDefault()
	if len(pgcfg) > 0 {
		// TODO: merge pgcfg with default values
		cfg = &pgcfg[0]
	}

	dcfg, err := pgx.ParseConfig(cfg.String())
	if err != nil {
		log.S(ctx).Fatalf("unable to parse postgres connection string: %v", err)
	}

	// otelsql wraps the stdlib connector so queries are traced.
	db := otelsql.OpenDB(stdlib.GetConnector(*dcfg))

	// sqlx.NewDb returns a single value; there is no error to check here.
	return sqlx.NewDb(db, "pgx")
}
```
(Note that I can't use otelsqlx [uptrace] because it doesn't support OpenDB with an existing connector, which is why I call sqlx.NewDb to create the sqlx connection.)
And in my repositories I do something like:
```go
func NewPgProjectsRepository(conn *sqlx.DB) *PgProjectsRepository {
	r := &PgProjectsRepository{
		conn: conn,
		db:   db.New(conn),
	}
	return r
}
```
P.S. I'm 100% down for replacing sqlx with pgx, but I'm not sure it covers sqlx's functionality for scanning into structs, and I don't know how to convert a pgx.Conn to stdlib to "downgrade".
In one of my attempts I tried using AcquireConn / conn.Raw on stdlib, but it was kind of a dead end since I don't know when to release a connection, and I want it to be used like I do with a database/sql conn.
The crucial thing about any hooks-based solution as I see it is ensuring consistency of the context sent through. OpenTelemetry, and I believe its predecessor frameworks, all work by stuffing the span and correlation information into the ctx. Pgx is already great about making sure a passed-in context is preserved, but for these hooks to be able to correlate calls within the path of a single request, the right context needs to be passed in and out of the hooks in the right order.
If a bit more background on what information is needed is helpful, the OpenTelemetry spec provides a list of keys for traces on database calls. Obviously these would actually be populated by an otel integration library, but hopefully this is a helpful indication of what info the hooks will want to be able to collect, beyond the basics like timings and SQL content.
Any update on this topic? Similar to many people in the thread, I would love to use otel or ddtrace to trace pgx, but neither is well supported at the moment. I still prefer pgx over pg, but would like to see improvement on the observability side.
I just pushed a big change to the v5-dev branch (78875bb95ab29bf9b4eb653e6ee0db903155e07c). I figured tracing would show up some time and I really didn't want to have both tracing and logging interfaces. So I've implemented a tracing interface and created an adapter that converts normal loggers to the tracing interface.
There are various interfaces a tracer can implement. e.g.
```go
// QueryTracer traces Query, QueryRow, and Exec.
type QueryTracer interface {
	// TraceQueryStart is called at the beginning of Query, QueryRow, and Exec calls. The returned context is used for
	// the rest of the call and will be passed to TraceQueryEnd.
	TraceQueryStart(ctx context.Context, conn *Conn, data TraceQueryStartData) context.Context

	TraceQueryEnd(ctx context.Context, conn *Conn, data TraceQueryEndData)
}
```
As I mentioned in my original post, the simple logging that pgx supports in v4 was enough for my needs at the time. The tracelog package adapts old loggers to the new tracing interface. This works well enough for me. But as far as tracing per se, I still don't need that. So my design was based on the discussion in the thread and general experience -- not first-hand need or specific experience.
This is the last major change planned for v5. I plan to release the (hopefully) final alpha release within the next few days (maybe even tonight), then a beta release a couple weeks after that, and the general release a few weeks after that.
All that to say, I'd suggest that those who care about this feature try it soon. Once v5 is released the interface will be frozen for a long time.
I've tried this out locally and I have no complaints so far regarding the API. Thank you @jackc!
Anyone who wants to give this a spin, I've created https://github.com/exaring/otelpgx based on the v5.0.0-alpha.5 tag.
@jackc I would like to implement Datadog tracing, but at the same time I need the log output of the newly provided tracelog package. Copying and pasting code from the tracelog package when implementing the QueryTracer interface is redundant.
Shouldn't ConnConfig.Tracer be a slice type? Or, I think Logger and Tracer should exist separately.
@jnst I think the tracer implementation wrapping another tracer (middleware style) makes more sense than a slice of tracers. That gives application code the ability to control which tracers run and in what order.
@jackc The current design has the following problems:

- Log output is practically always used, so tracelog must be wrapped and implemented.
- Wrapping and implementing code can be complex and violates the principle of single responsibility.
- Unit tests are also complicated by the need to use mocks.

The practice of executing operations one at a time in a slice structure is often seen in server application interceptors.
I am using Datadog in my product and am trying to support pgx to visualize database processing times in distributed tracing. I have actually implemented the following code. It seems a bit verbose and I would appreciate it if you could improve it.
```go
func (t *pgxTracer) TraceQueryStart(ctx context.Context, conn *pgx.Conn, data pgx.TraceQueryStartData) context.Context {
	ctx = t.logTracer.TraceQueryStart(ctx, conn, data)

	var c *pgx.ConnConfig
	if conn != nil {
		c = conn.Config()
	}
	span := t.startSpan(ctx, "query", data.SQL, c)
	if span != nil {
		return tracer.ContextWithSpan(ctx, span)
	}
	return ctx
}
```
Click here for the entire code.
To come to @jackc's defense, I don't see how these are problems, to be honest.

> Log output is practically always used, so tracelog must be wrapped and implemented.

This might be true in your case, but loads of people don't need logs.

> Wrapping and implementing code can be complex and violates the principle of single responsibility.

Wrapping middlewares is a standard procedure and is used in many popular router implementations in Go.

> Unit tests are also complicated by the need to use mock.

This is an implementation detail, and I don't see how a slice of tracers would simplify it.

Also, you link to a blog post, but this is just one opinion. If you look at a hugely popular post on HTTP services, wrapping is encouraged. Also chi, one of the most popular HTTP routers in the Go ecosystem, uses wrapping middleware handlers.
I don't have stakes in this as I'm just a (happy) user of pgx, but I understand why the proposed Tracer interfaces have been chosen: they keep the pgx code simple while still enabling lots of use cases for its users. It should not be pgx's task to make your code less verbose.
@jnst Multiple tracers can be implemented in several ways. I happen to prefer the wrapping style like the HTTP middleware mentioned above. But a slice of tracers can itself be easily implemented as a Tracer. Something like this should do what you want:
```go
type MultiQueryTracer struct {
	Tracers []QueryTracer
}

func (m *MultiQueryTracer) TraceQueryStart(ctx context.Context, conn *Conn, data TraceQueryStartData) context.Context {
	for _, t := range m.Tracers {
		ctx = t.TraceQueryStart(ctx, conn, data)
	}
	return ctx
}

func (m *MultiQueryTracer) TraceQueryEnd(ctx context.Context, conn *Conn, data TraceQueryEndData) {
	for _, t := range m.Tracers {
		t.TraceQueryEnd(ctx, conn, data)
	}
}
```
Is anyone working on a Datadog integration? If not, I'll try to (FYI: https://github.com/DataDog/dd-trace-go/issues/697#issuecomment-1287813687)
Hi @jackc, I'm trying to create the tracer for New Relic right now, and trying to unit test it. For

cmd, err := conn.Exec(ctx, query, args...)

there isn't any problem. But for

rows, err := conn.Query(ctx, query, args...)

I can't see where TraceQueryEnd is called. I tried to find it in the Query function here:
https://github.com/jackc/pgx/blob/3e825ec8982fb9531c7052281bfb6cb206507413/conn.go#L624
and did not find TraceQueryEnd in that function. Is this expected?
After more reading of the code, I found out that TraceQueryEnd is called when we call rows.Close().
Sorry for asking an unnecessary question.
Hi @jackc, would it be possible to get your opinion (or anyone else's on this thread) on https://github.com/DataDog/dd-trace-go/pull/1537 ?
Latest update: as dd-trace-go supports three Go versions back (1.17 as of now), that integration has to wait, but could anyone please comment on the API as well as the data collected?
@mrkagelui I haven't used DataDog, so my opinion probably isn't super valuable, but I took a quick look and nothing in the pgx parts looked out of the ordinary.
I'm going to close this issue since the feature was merged into v5 and v5 has now been released. For any bugs or new features we can create new issues or discussions.
Thank you so much! Yes, I wanted to see whether the data collected makes sense and whether it's a correct use of the interface.
If anyone's struggling with this, I threw this together based on a comment in this issue and another (which are in the code file). This uses the slog logger interface.
https://gist.github.com/zaydek/91f27cdd35c6240701f81415c3ba7c07
The original design goal for logging was simply to log what queries are being run and how long they take.
There have been a few requests to support additional logging / tracing points or new ways of configuring what is currently logged (#853, #998). The current logging system works fine for me, but as I'm considering the possibility of a new major release I'd like to open the door to breaking changes to the logging / tracing interface.
As I do not have a first-hand need for this I do not have any opinion on what these changes might be, and I would probably not implement them myself, but I wanted to bring up the possible opportunity for those who may have the interest and ability while there is a window open for possible breaking changes.