contribsys / faktory

Language-agnostic persistent background job server
https://contribsys.com/faktory/
Other
5.78k stars 230 forks source link

1.7.0 Metrics Name-spacing is Broken #433

Closed profsmallpine closed 1 year ago

profsmallpine commented 1 year ago

We are running enterprise on 1.7.0 and we lost our metrics namespacing. I did notice that Field Client.Namespace is now private, please use the WithNamespace option. according to the v5 docs for datadog. https://github.com/DataDog/datadog-go/blob/master/CHANGELOG.md#500--2021-10-01 I didn't see how this is wired together in faktory, but I assume the issue is related.

mperham commented 1 year ago

I'm not sure I understand the issue. The only place Faktory Enterprise touches the statsd namespace property is like this:

    if ss.Namespace != "" {
        sopts = append(sopts, statsd.WithNamespace(ss.Namespace))
    }
    c, err := statsd.New(ss.Location, sopts...)        

ss is Faktory's statsdSubsystem which has a public Namespace field. Is there any other logging or info available?

profsmallpine commented 1 year ago

When we deployed 1.7.0., we found that our metrics name went from faktory_jobs_perform to jobs_perform in cloudwatch metrics. I saw the datadog bump and assumed it was related but maybe not.

mperham commented 1 year ago

Make sure your namespace doesn't end with ..

profsmallpine commented 1 year ago

OK, I will test this tomorrow. However, we hadn't explicitly set the namespace in the statsd.toml. This change happened after deploying 1.7.0 which was a bummer as now our dashboards are showing the wrong info for last nights workers. If this is intentional, it would be helpful for others on this path for it to be called out in the changelog.

mperham commented 1 year ago

Ah, that's a key insight. I thought you were configuring the namespace. Add it to your statsd.toml:

[statsd]
  namespace = "faktory"

with no trailing dot. It is broken when not explicitly configured, i.e. the default.

profsmallpine commented 1 year ago

Thanks for the prompt response! I'd like to understand why this changed from 1.6.2 -> 1.7.0 though. We made no changes to our statsd.toml between the change, and we found the metrics name did change though. Do you understand why that is?

mperham commented 1 year ago

Yes, because DataDog changed how you pass options to statsd.New from v3 to v5 and my namespace option logic was wrong.