newrelic / go-agent

New Relic Go Agent
Apache License 2.0
762 stars 297 forks source link

WrapBackgroundCore background logger is not sending logs to new relic but transaction logger is #859

Closed yehudamakarov closed 5 months ago

yehudamakarov commented 5 months ago

Description

following the docs at https://docs.newrelic.com/docs/logs/logs-context/configure-logs-context-go/#4-zap

backgroundCore, err := nrzap.WrapBackgroundCore(core, app)
if err != nil && err != nrzap.ErrNilApp {
    panic(err)
}

backgroundLogger := zap.New(backgroundCore)
txn := app.StartTransaction("nrzap example transaction")
txnCore, err := nrzap.WrapTransactionCore(core, txn)
if err != nil && err != nrzap.ErrNilTxn {
    panic(err)
}

txnLogger := zap.New(txnCore)
yehudamakarov commented 5 months ago

my actual code, and this isn't working:

func Logger(app *newrelic.Application) (*zap.Logger, *zapcore.Core, error) {
    core := zapcore.NewCore(zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig()), zapcore.AddSync(os.Stdout), zap.InfoLevel)

    backgroundCore, err := nrzap.WrapBackgroundCore(core, app)
    if err != nil && err != nrzap.ErrNilApp {
        panic(err)
    }

    l, err := zap.New(backgroundCore), nil
    // todo where is this message?
    l.Info("background logs are working")
    return l, &core, err
}

but here, the messages arrive

            txn := newrelic.FromContext(ctx)
            if txn == nil {
                txn = app.StartTransaction(req.Spec().Procedure)
            } else {
                txn.SetName(req.Spec().Procedure)
            }

            txnCore, err := nrzap.WrapTransactionCore(*core, txn)
            if err != nil && err != nrzap.ErrNilTxn {
                panic(err)
            }

            txnLogger := zap.New(txnCore)

intermittently I see 1 single message from the original backgroundLogger

yehudamakarov commented 5 months ago
iamemilio commented 5 months ago

hmm, let me take a look at this and dig into it a bit. Thanks for reporting!

iamemilio commented 5 months ago

So, I built this reproducer based on your code:

func TestBackgroundLoggerSugared(t *testing.T) {
    app := integrationsupport.NewTestApp(integrationsupport.SampleEverythingReplyFn,
        newrelic.ConfigAppLogDecoratingEnabled(true),
        newrelic.ConfigAppLogForwardingEnabled(true),
    )

    core := zapcore.NewCore(zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig()), zapcore.AddSync(os.Stdout), zap.InfoLevel)

    backgroundCore, err := WrapBackgroundCore(core, app.Application)
    if err != nil && err != ErrNilApp {
        t.Fatal(err)
    }

    logger := zap.New(backgroundCore).Sugar()

    err = errors.New("this is a test error")
    msg := "this is a test error message"

    // for background logging:
    logger.Error(msg, zap.Error(err), zap.String("test-key", "test-val"))
    logger.Sync()

    app.ExpectLogEvents(t, []internal.WantLog{
        {
            Severity:  zap.ErrorLevel.String(),
            Message:   msg,
            Timestamp: internal.MatchAnyUnixMilli,
        },
    })
}

Output:

{"level":"error","ts":1709919339.6399841,"msg":"this is a test error message{error 26 0  this is a test error} {test-key 15 0 test-val <nil>}"}

I do see that in my test function, this log message gets captured by the agent.

got: "this is a test error message{error 26 0  this is a test error} {test-key 15 0 test-val <nil>}"

I am wondering if maybe I am missing something? Could you provide any corrections for the test that you think would cause the error to occur?

iamemilio commented 5 months ago

@yehudamakarov

yehudamakarov commented 5 months ago

I don't know I'll have to check out what a possible difference might be.

There's probably more factors to consider on my end.

yehudamakarov commented 5 months ago

But that first message in my example does not arrive and neither do any from the background logger

yehudamakarov commented 5 months ago

Is it possible it has to do with the logger syncing? Because I even sync it at some point in the startup code (before the tx takes over for requests)

iamemilio commented 5 months ago

hmm, yeah possibly. I don't really have a good idea of how the zap logging buffer gets written, but in the example code I ran I force the buffer to flush by calling Sync(), and that seemed to guarantee capture and writes of those logs.

iamemilio commented 5 months ago

Ah! I know whats happening. Are you waiting for your application to connect to new relic? If not, then its attempting to capture the logs, but failing. We insert our instrumentation into the zapcore object, and anytime a zap process invokes the Write() method, we capture the log in the agent, even if its not synced/flushed yet. But if the agent has not completed its connection handshake, it may drop the logs.

quick fix:

if err := app.WaitForConnection(5 * time.Second); nil != err {
    fmt.Println(err)
}
yehudamakarov commented 5 months ago

@iamemilio this looks very promising, let me try this and get back to you. Can't thank you enough for this! Opening the gates of hope

yehudamakarov commented 5 months ago

@iamemilio working as expected, you are correct!

let's get this in the docs asap! whoever comes next will probably have it way better than me. thanks a ton for this help.