evanderkoogh / otel-cf-workers

An OpenTelemetry compatible library for instrumenting and exporting traces for Cloudflare Workers
BSD 3-Clause "New" or "Revised" License
239 stars 50 forks source link

TypeError in tracer.ts when getActiveConfig() returns undefined #14

Closed palp closed 1 year ago

palp commented 1 year ago

I have not attempted a simple repro so I don't know the root cause of this, but this error occurs in my worker reliably:

Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'sampling')                                                                                                                               

  const sampler = getActiveConfig().sampling.headSampler                                                                                                                                                                          
                                    ^
      at startSpan (src/tracer.ts:47:36)
      at startActiveSpan (src/tracer.ts:88:20)
      at apply (../src/instrumentation/do-storage.ts:72:17)
      at proxyHandler.apply (../src/instrumentation/wrap.ts:27:18)
      at  (C:\Users\palp\AppData\Local\Temp\tmp-35572-Q5UGGZxJVrjs/src/billing.ts:81:39)

I was able to resolve it by modifying getActiveConfig to return a default value instead of undefined, but this may or may not be appropriate/desired behavior. https://github.com/evanderkoogh/otel-cf-workers/compare/main...palp:otel-cf-workers:main

evanderkoogh commented 1 year ago

Hmm.. that is certainly interesting. It shouldn't be possible for getActiveConfig to return undefined. But I just realised that it is possible to do a blockConcurrencyWhile in the constructor. And it might be possible for there not to be a config set.

Do you have that scenario in this particular case?

palp commented 1 year ago

Hmm.. that is certainly interesting. It shouldn't be possible for getActiveConfig to return undefined. But I just realised that it is possible to do a blockConcurrencyWhile in the constructor. And it might be possible for there not to be a config set.

Do you have that scenario in this particular case?

Yep, there is a call to storage inside a blockConcurrencyWhile call which is triggering it.

palp commented 1 year ago

Yes, I have confirmed reproduction with this test class:

import { instrumentDO, ResolveConfigFn } from '@microlabs/otel-cf-workers'

export interface Env {
    OTEL_BASIC_AUTH: string;
    OTEL_URL: string;
}

class TestDO implements DurableObject {
    constructor(state: DurableObjectState, env: Env) {
        state.blockConcurrencyWhile(async () => {
            await state.storage.list({ prefix: 'test' })

        })
    }
    async fetch(request: Request) {
        return new Response('Hello world!', { status: 200 })
    }
}

const config: ResolveConfigFn = (env: Env, _trigger) => {
    return {
        exporter: { url: env.OTEL_URL, headers: { 'Authorization': `Basic ${env.OTEL_BASIC_AUTH}` } },
        service: { name: 'test'},
    }
}

const TestOtelDO = instrumentDO(TestDO, config)
export { TestOtelDO }

Note it only fails if the worker calling the DO is also instrumented, otherwise it seems to work as expected.

evanderkoogh commented 1 year ago

Awesome! Thanks for the repro.. I am trying to work out the best way to fix this. Hopefully I'll have it fixed in the next few hours :)

evanderkoogh commented 1 year ago

Ok.. a fix for this should be released as part of 1.0.0-rc3. The only annoying thing is that the constructor has to be in its own trace, because the fetch method hasn't been called and thus we have no idea what trace (or traces even) are waiting for the constructor to finish.

SamyPesse commented 1 year ago

I'm observing this issue when running tests locally with miniflare where at some point getActiveConfig starts returning undefined. I'll continue investigating, I've opened https://github.com/evanderkoogh/otel-cf-workers/issues/20 as a workaround.

EDIT: it looks related to how miniflare re-initializes the DO in between requests.

SamyPesse commented 1 year ago

Also got the error in production in a DurableObject, not sure why but the pattern is globally:

  1. I have a durable Object DOClass
  2. This class creates a new object in its constructor this.something = new SomethingWithStorage(this.state.storage)
  3. Then when handling a WebSocket message, I call this.something.doSOmething() which uses the storage
  4. 💥

Using 1.0.0-rc.6

jrowny commented 6 months ago

Also getting this same error, same exact situation and repro as @SamyPesse