microsoft / ApplicationInsights-dotnet

Telemetry is sent for Console application even when instrumentation key wasn't set #1652

Closed. bryancrosby closed this issue 4 years ago

bryancrosby commented 4 years ago

I recently installed the Application Insights Status monitor on a cloud-hosted VM. I wanted to see some detailed SQL command text from a console application, using the custom SDK. This console application can be run with different command line arguments to perform different tasks. I have one particular instance of this app to be monitored with App Insights SDK -- I created a custom telemetry initializer to filter stuff out. If the command line argument is a certain one, I enable telemetry. If not, then I don't even set the key.

Relevant code sample of the setup of the client:

if (!string.IsNullOrWhiteSpace(key) && tasksToInstrumentWhitelist.Any())
{
    TelemetryConfiguration.Active.InstrumentationKey = key;
    //NOTE: The Initializer's purpose is to attach the task name to each telemetry event
    TelemetryConfiguration.Active.TelemetryInitializers.Add(new ApplicationInsightsTaskInitializer(
        "mycustomtask"));
    var telemetryProcessorBuilder = TelemetryConfiguration.Active.TelemetryProcessorChainBuilder;
    telemetryProcessorBuilder.Use(next => new TaskWhitelistTelemetryProcessor(next, tasksToInstrumentWhitelist));
    telemetryProcessorBuilder.Build();
    telemetryClient = new TelemetryClient(TelemetryConfiguration.Active);
}

//Attach a task name to each event
public class ApplicationInsightsTaskInitializer : ITelemetryInitializer
{
    readonly string taskName;

    public ApplicationInsightsTaskInitializer(string taskName)
    {
        this.taskName = taskName;
    }

    public void Initialize(ITelemetry telemetry)
    {
        if (!telemetry.Context.GlobalProperties.ContainsKey(taskName))
        {
            telemetry.Context.GlobalProperties["TaskName"] = taskName;
        }
    }
}

//Only send data for the whitelisted tasks
public class TaskWhitelistTelemetryProcessor : ITelemetryProcessor
{
    private readonly ITelemetryProcessor next;
    private readonly HashSet<string> tasksToIncludeTelemetry;

    public TaskWhitelistTelemetryProcessor(ITelemetryProcessor next, HashSet<string> tasksToInstrumentWhitelist)
    {
        this.next = next;
        this.tasksToIncludeTelemetry = new HashSet<string>(tasksToInstrumentWhitelist, StringComparer.InvariantCultureIgnoreCase);
    }

    public void Process(ITelemetry item)
    {
        if (item.Context.GlobalProperties.TryGetValue("TaskName", out var taskName) && tasksToIncludeTelemetry.Contains(taskName))
        {
            next.Process(item);
        }
    }
}

I created an Application Insights resource as usual in Azure and set the telemetry configuration to point to this key.

I'm seeing too much data on my subscription -- already over 5 GB -- how did the Application Insights Status monitor know to associate the data to my instrumentation key? I was expecting just a dozen or so SQL and HTTP calls, but I have over 1 million events. I didn't do anything with Status Monitor other than just installing it and restarting the VM.

Any idea how I can whitelist just this one console application?

Repro Steps

1. Installed Status Monitor agent
2. Deployed console application to VM with a specific telemetry configuration pointed to a specific instrumentation key
3. Implemented custom code to make sure telemetry events are only sent when the custom property matches my whitelist

Actual Behavior

Noticed other telemetry data from the same server, for the same application, but for tasks that were not on my whitelist. These seem to have been picked up automatically without me even setting the telemetry configuration.

Expected Behavior

I was not expecting telemetry from anything other than the specific application I configured with a specific instrumentation key.

Version Info

SDK Version : 2.12.0
.NET Version : targeting net461
How Application was onboarded with SDK(VisualStudio/StatusMonitor/Azure Extension) :
OS : Windows Server 2019 Datacenter
Hosting Info (IIS/Azure WebApps/ etc) : Console application

Dmitry-Matveev commented 4 years ago

As I read the current code, it sets the Instrumentation Key for every telemetry item if you have at least one task in tasksToInstrumentWhitelist, due to TelemetryConfiguration.Active.InstrumentationKey = key;.

Active is a singleton, and any Telemetry Client created from it carries its IKey, so any telemetry passing through that client is stamped with this IKey. AI auto-collection uses the same singleton, so any telemetry AI produces by default will have this IKey as well.

Second, Telemetry Initializer seems to put "TaskName" = "myCustomTask" for every telemetry item (Initializers will be invoked on every telemetry item passing through the associated Telemetry Configuration). Therefore, each and every item tracked will have custom property with "myCustomTask" in it.

Telemetry Processor then will not drop anything because everything has the white-listed "TaskName" property.

If the intent is to only send a subset of telemetry, you can validate properties of the collected telemetry in the custom processor and drop it if you do not have a match; that part is correct. However, currently, all items seem to have the match you're looking for due to the way the custom initializer is invoked.
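A minimal sketch of one way to act on this, assuming the task name is known once at startup (for example from the command line): decide per process whether telemetry is wanted, and otherwise switch the shared configuration off, instead of stamping every item and filtering afterwards. `TelemetrySetup`, `ShouldInstrument`, `Configure`, and `currentTask` are hypothetical names introduced for illustration; `TelemetryConfiguration.Active`, `InstrumentationKey`, and `DisableTelemetry` are SDK members. It has not been verified against the setup described in this issue.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

public static class TelemetrySetup
{
    // Hypothetical helper: true only when a key is present and the task this
    // process was started for is on the whitelist.
    public static bool ShouldInstrument(string key, string currentTask, ISet<string> whitelist)
    {
        return !string.IsNullOrWhiteSpace(key)
            && !string.IsNullOrWhiteSpace(currentTask)
            && whitelist.Contains(currentTask);
    }

    // Returns a client when telemetry is wanted; otherwise disables the shared
    // configuration and returns null.
    public static TelemetryClient Configure(string key, string currentTask, ISet<string> whitelist)
    {
        if (!ShouldInstrument(key, currentTask, whitelist))
        {
            // Active is shared with auto-collection, so it is disabled as a whole.
            TelemetryConfiguration.Active.DisableTelemetry = true;
            return null;
        }

        TelemetryConfiguration.Active.InstrumentationKey = key;
        return new TelemetryClient(TelemetryConfiguration.Active);
    }
}
```

The check is case-sensitive unless the whitelist set is built with a case-insensitive comparer, as the processor in the original post does.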

bryancrosby commented 4 years ago

> If the intent is to only send a subset of telemetry, you can validate properties of the collected telemetry in the custom processor and drop it if you do not have a match; that part is correct. However, currently, all items seem to have the match you're looking for due to the way the custom initializer is invoked.

I suppose I'm confused here. The class TaskWhitelistTelemetryProcessor says that the TaskName property must exist, and the value that it has must match an item on the whitelist in order for the telemetry to be sent.

if (item.Context.GlobalProperties.TryGetValue("TaskName", out var taskName) && tasksToIncludeTelemetry.Contains(taskName))

What I find stranger is that I get the expected behavior if I uninstall the Status Monitor agent from the machine. Only telemetry events from the whitelist make it up to Application Insights.

How does this agent somehow discover that my app is running and start sending events? I noticed the data had arrived before I even added application insights to my application.

Dmitry-Matveev commented 4 years ago

Status Monitor installs the AI SDK into the process if it does not have one and starts to auto-collect telemetry with the usual AI collection modules for Request / Dependency and so on. You don't really need it if you already have the SDK in your web application. However, if your app is a console application (like the title suggests), Status Monitor will not do anything at all - it only supports IIS-deployed apps.

What I meant is that the condition you have,

if (item.Context.GlobalProperties.TryGetValue("TaskName", out var taskName) && tasksToIncludeTelemetry.Contains(taskName))

is always true if each telemetry item has a proper task name on it. Each telemetry item will indeed have the right task name on it because the initializer adds "mycustomtask" to each item:

//Attach a task name to each event
public class ApplicationInsightsTaskInitializer : ITelemetryInitializer {

At least that's the way I read the code initially.

bryancrosby commented 4 years ago

> Status Monitor installs the AI SDK into the process if it does not have one and starts to auto-collect telemetry with the usual AI collection modules for Request / Dependency and so on. You don't really need it if you already have the SDK in your web application. However, if your app is a console application (like the title suggests), Status Monitor will not do anything at all - it only supports IIS-deployed apps.
>
> What I meant is that the condition you have,
>
> if (item.Context.GlobalProperties.TryGetValue("TaskName", out var taskName) && tasksToIncludeTelemetry.Contains(taskName))
>
> is always true if each telemetry item has a proper task name on it. Each telemetry item will indeed have the right task name on it because the initializer adds "mycustomtask" to each item:
>
> //Attach a task name to each event
> public class ApplicationInsightsTaskInitializer : ITelemetryInitializer {
>
> At least that's the way I read the code initially.

Understood. So it should be filtering them out. But it seems like somehow Status Monitor starts sniffing the process for instances of this console application.

The only reason I added the Status Monitor was to gather the SQL command text for SQL queries. According to the docs I read on MSDN, this was the recommended thing to do. Is this document still the recommended way of capturing this? The current constraint is a .NET Framework 4.6.1 app hosted on a cloud VM instance.

https://docs.microsoft.com/en-us/azure/azure-monitor/app/asp-net-dependencies#advanced-sql-tracking-to-get-full-sql-query

Dmitry-Matveev commented 4 years ago

For the full .NET Framework, Status Monitor is indeed the way to collect the SQL query text (on the latest .NET Core, SM is not needed and the SQL command can be collected with the SDK), but by default it only attaches its profiling piece to ASP.NET applications hosted in IIS. It does not know about console apps and won't show them in its UI to enable for monitoring.

You can manually hook up the profiling piece that Status Monitor drops onto the machine into any .NET app, but that would require playing with profiling environment variables which I do not think you did in that case.

Are there any apps in an IIS web server on that same machine? Status Monitor may onboard those to AI, and you can get telemetry from them if the Instrumentation Key is somehow propagated into those apps (e.g. via the environment variable APPINSIGHTS_INSTRUMENTATIONKEY).
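To check what a given process actually sees, a small diagnostic along these lines could be run in the same environment as the console app. It is only a sketch: `EnvCheck` and `Describe` are hypothetical names, and the `COR_*` variables are the standard .NET Framework profiler variables, listed here on the assumption that they are the "profiling environment variables" meant above.

```csharp
using System;

public static class EnvCheck
{
    // APPINSIGHTS_INSTRUMENTATIONKEY is named in the thread; the COR_* names are
    // the .NET Framework profiling variables (an assumption about what applies here).
    public static readonly string[] Names =
    {
        "APPINSIGHTS_INSTRUMENTATIONKEY",
        "COR_ENABLE_PROFILING",
        "COR_PROFILER",
        "COR_PROFILER_PATH",
    };

    // Formats one variable as "NAME = value", or "NAME = (not set)" when absent.
    public static string Describe(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        return name + " = " + (string.IsNullOrEmpty(value) ? "(not set)" : value);
    }

    // Writes every variable of interest to the console.
    public static void PrintAll()
    {
        foreach (string name in Names)
        {
            Console.WriteLine(Describe(name));
        }
    }
}
```

Calling `EnvCheck.PrintAll()` at the start of the console app shows whether a key or a profiler is being injected from outside the application's own code.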

bryancrosby commented 4 years ago

The overarching goal was just to get some SQL command text into App Insights, as well as the usual stuff that comes out of the box (HTTP, etc.). There are currently no other apps installed on the machine. IIS is not even installed, as this is not a web application. Unfortunately, this app cannot be converted to .NET Core just yet.

Is there a document to set up the profiling variables so that console apps may send this data over? Everything else seems to work flawlessly out of the box, except for the SQL command text.

Dmitry-Matveev commented 4 years ago

Not really, it's not officially supported :)

I may hint, though, that you can try to mimic what Status Monitor does to the W3SVC registry key (the Environment value under HKLM\System\CurrentControlSet\Services\W3SVC) and set up the same environment variables for your application before it starts (e.g. start it from a command line that has these variables set).

At your own risk with no support, though.
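A rough sketch of that hint, equally unsupported: read the Environment value from the W3SVC key, assuming it is stored as a multi-string of NAME=value entries, and start the console app with those variables added. `ProfilerEnvLauncher` and its methods are hypothetical names, and `exePath` stands for the path to your own executable. On a machine without IIS the key may be absent, in which case `ReadW3svcEnvironment` returns an empty dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.Win32;

public static class ProfilerEnvLauncher
{
    // Splits "NAME=value" entries; entries without a name or '=' are skipped.
    public static Dictionary<string, string> Parse(IEnumerable<string> entries)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string entry in entries ?? new string[0])
        {
            int eq = entry.IndexOf('=');
            if (eq > 0)
            {
                result[entry.Substring(0, eq)] = entry.Substring(eq + 1);
            }
        }
        return result;
    }

    // Reads the Environment value under the W3SVC service key; empty if absent.
    public static Dictionary<string, string> ReadW3svcEnvironment()
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(
            @"SYSTEM\CurrentControlSet\Services\W3SVC"))
        {
            return Parse(key?.GetValue("Environment") as string[]);
        }
    }

    // Starts the executable with the extra variables layered on top of the
    // current environment. UseShellExecute must be false for this to apply.
    public static Process Launch(string exePath, string arguments, IDictionary<string, string> extra)
    {
        var info = new ProcessStartInfo(exePath, arguments) { UseShellExecute = false };
        foreach (KeyValuePair<string, string> pair in extra)
        {
            info.EnvironmentVariables[pair.Key] = pair.Value;
        }
        return Process.Start(info);
    }
}
```

A call such as `ProfilerEnvLauncher.Launch(exePath, args, ProfilerEnvLauncher.ReadW3svcEnvironment())` would then start the app with whatever Status Monitor placed in that registry value.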

bryancrosby commented 4 years ago

Closing this as it's not technically supported