petabridge / phobos-issues

Public issues and bug tracker for Phobos®
https://phobos.petabridge.com/
2 stars 1 forks source link

DataDog actor monitoring that was working with Phobos.Actor 1.2.5 now not working after upgrading to 1.4.2 #53

Closed andrew-lewell closed 2 years ago

andrew-lewell commented 2 years ago

Hi, I have set up DataDog monitors to trigger alerts for actor restarts and it was working fine with 1.2.5 but after upgrading Phobos.Actor to 1.4.2 it does not seem to trigger them anymore. Any ideas what could be the issue here? Has something changed around the Akka.NET.akka.actor.restarts.counter.value?

Aaronontheweb commented 2 years ago

Let me take a look @andrew-lewell - which version of App.Metrics are you using for DataDog?

Aaronontheweb commented 2 years ago

Off the top of my head, the only Phobos-specific thing I can think of that may have had an impact was when we added actor lifecycle tracing in 1.4 - but that change shouldn't have affected any metrics around actor restarts: https://phobos.petabridge.com/articles/releases/RELEASE_NOTES.html#140httpssdkbincompublisherpetabridgeproductphobospackagesphobosactorversions140-january-19th-2022

Aaronontheweb commented 2 years ago

Looks like Phobos 1.4 depends on App.Metrics 4.3.0 https://sdkbin.com/publisher/petabridge/product/phobos/packages/phobos.monitoring/versions/1.4.2

https://sdkbin.com/publisher/petabridge/product/phobos/packages/phobos.monitoring/versions/1.2.5 - 1.2.5 used App.Metrics 4.2.0

There were a number of changes to the StatsD packages between 4.2.0 and 4.3.0 - are you using Phobos with the App.Metrics support for DogStatsD by any chance?

https://github.com/AppMetrics/AppMetrics/commits/features/4.3.0

Aaronontheweb commented 2 years ago

Could possibly be related to https://github.com/AppMetrics/AppMetrics/issues/682 - but we'd need to know which versions of App.Metrics' packages for DataDog you're using

andrew-lewell commented 2 years ago

I am using App.Metrics 4.3.0.

I'll take another look at the metric name, perhaps it has changed. Thanks for the tip.

Aaronontheweb commented 2 years ago

I am using App.Metrics 4.3.0.

I'll take another look at the metric name, perhaps it has changed. Thanks for the tip.

Did you upgrade from an earlier version of App.Metrics as part of this? If that's the case then the metric name is probably the best guess.

jackowild commented 2 years ago

Hi Aaron, we've found a few more interesting things on this today. The problem we are having is our custom app metric tags aren't being included with the metric e.g. in our ConfigureServices method in our Startup class we have

            services.AddMetrics(builder =>
            {
                var metrics = builder.Configuration.Configure(o =>
                {
                    o.GlobalTags.Add("service", AppDomain.CurrentDomain.FriendlyName);
                    o.GlobalTags.Add("host", Dns.GetHostName());
                    o.GlobalTags.Add("env", Environment.UserName);
                    o.GlobalTags.Add("environment", Environment.UserName);
                    o.GlobalTags.Add("version", version);
                    o.DefaultContextLabel = AppDomain.CurrentDomain.FriendlyName;
                    o.Enabled = true;
                    o.ReportingEnabled = true;
                })
            ....

In datadog when looking at the metric, we only see the host tag along with the actortype and exception tags from Phobos. We're no longer seeing our other tags like service, env.

We've found that this only happens in version 1.4.1 and later of Phobos, it works fine with 1.4.0. Looking at the release notes, we can't see anything that would affect this, do you know of anything?

Aaronontheweb commented 2 years ago

@jackowild let me take a look and see if we did anything - because that is very odd. We haven't really touched the metrics code that much in the 1.x branch recently.

Aaronontheweb commented 2 years ago

Welp, I found the problem @jackowild - from the internals of Phobos.Metrics:

image

We fixed a small NullReferenceException that could happen with App.Metrics but apparently did it in such a way that overwrote all GlobalTags previously set by the user. This happened in Phobos 1.4.1.

We're about to ship Phobos 2.0 and 1.5 today - I'll add a regression test for 1.5 for this use case and fix it there before we publish.

jackowild commented 2 years ago

:D that will be it! Thanks for the quick response, we'll look out for 1.5 + 2.0

Aaronontheweb commented 2 years ago

This is now resolved in Phobos 1.5.1 https://sdkbin.com/publisher/petabridge/product/phobos/packages/phobos.actor/versions/1.5.1