newrelic / go-agent

New Relic Go Agent
Apache License 2.0
763 stars 297 forks source link

nrpgx5 data-race #855

Closed UnAfraid closed 4 months ago

UnAfraid commented 5 months ago

Description

nrpgx5 tracer causes data-race

During integration test with testcontainers we observed data races even when new relic application is not initialized

cfg, err := pgxpool.ParseConfig("postgres://postgres:postgres@localhost:5432")
if err != nil {
    panic(err)
}

cfg.ConnConfig.Tracer = nrpgx5.NewTracer()
db, err := pgxpool.NewWithConfig(context.Background(), cfg)
if err != nil {
    panic(err)
}
==================
WARNING: DATA RACE
Write at 0x00c000504600 by goroutine 666:
  github.com/newrelic/go-agent/v3/integrations/nrpgx5.(*Tracer).TraceConnectStart()
      github.com/newrelic/go-agent/v3/integrations/nrpgx5/nrpgx5.go:122 +0xdc
  github.com/jackc/pgx/v5.connect()
      github.com/jackc/pgx/v5/conn.go:224 +0xa0
  github.com/jackc/pgx/v5.ConnectConfig()
      github.com/jackc/pgx/v5/conn.go:141 +0x114
  github.com/jackc/pgx/v5/pgxpool.NewWithConfig.func1()
      github.com/jackc/pgx/v5/pgxpool/pool.go:216 +0x1e0
  github.com/jackc/puddle/v2.(*Pool[go.shape.*uint8]).initResourceValue.func1()
      github.com/jackc/puddle/v2/pool.go:409 +0x130
iamemilio commented 5 months ago

@UnAfraid can you confirm that this only happens when using nrpgx5, and not just with pgx5? Is this code block a reproducer?

cfg, err := pgxpool.ParseConfig("postgres://postgres:postgres@localhost:5432")
if err != nil {
    panic(err)
}

cfg.ConnConfig.Tracer = nrpgx5.NewTracer()
db, err := pgxpool.NewWithConfig(context.Background(), cfg)
if err != nil {
    panic(err)
}
UnAfraid commented 5 months ago

I have an integration test working with pgx5 for quite a while now and it works fine The test uses testcontainers postgres 15

  1. awaits for the database to come up
  2. runs migrations with go-migrate (also using pgx5 driver)
  3. Then runs some queries in transaction like:
    • TX(select)
    • TX(create -> select)
    • TX(select for update -> update -> select)
    • TX(select for update -> delete -> select)

The data race started happening as soon as i added cfg.ConnConfig.Tracer = nrpgx5.NewTracer()

iamemilio commented 5 months ago

Thanks! We will spend some time diving into this then.

WillAbides commented 5 months ago

I ran into the same issue. It happens because when you follow the example in this repo and assign the tracer with cfg.ConnConfig.Tracer = nrpgx5.NewTracer(), then every connection in the pool gets the same tracer instance. As you've seen this tracer isn't concurrency safe because it doesn't guard against concurrent access to BaseSegment

I got around this by using cfg.BeforeConnect to set the tracer before connecting. Here's what my code looks like:

func NewPgxPool(ctx context.Context, dbURL string) (*pgxpool.Pool, error) {
    cfg, err := pgxpool.ParseConfig(dbURL)
    if err != nil {
        return nil, err
    }

    cfg.BeforeConnect = func(_ context.Context, config *pgx.ConnConfig) error {
        config.Tracer = nrpgx5.NewTracer()
        return nil
    }

    return pgxpool.NewWithConfig(ctx, cfg)
}
iamemilio commented 5 months ago

Thanks @WillAbides. This seems to be in line with how the documentation recommends using pgx5 pool in threaded contexts. We will update the documentation on our end and credit you in the commit!