microsoft / ApplicationInsights-Kubernetes

Enrich the telemetry data for .NET applications running inside containers that are managed by Kubernetes.
Other
135 stars 57 forks source link

Upgrade of `KubernetesClient` dependency #357

Closed tomachristian closed 3 weeks ago

tomachristian commented 5 months ago

Is your feature request related to a problem? Please describe. We are using Microsoft.Orleans.Hosting.Kubernetes 8.0.0 which uses KubernetesClient (>= 12.1.1) whilst latest Microsoft.ApplicationInsights.Kubernetes 6.1.1 is using KubernetesClient (>= 10.0.16 && < 11.0.0). For this reason we cannot use them together. Any plans to upgrade the KubernetesClient dependency?

Describe the solution you'd like That the KubernetesClient be upgraded to the latest version.

Describe alternatives you've considered We couldn't find any besides dropping this library completely.

tomachristian commented 5 months ago

@xiaomi7732 any timeline on when these changes will be released?

xiaomi7732 commented 5 months ago

Hey @tomachristian I will work on a beta package for testing in a couple of days. Will that work for you?

xiaomi7732 commented 5 months ago

Please check this out: https://www.nuget.org/packages/Microsoft.ApplicationInsights.Kubernetes/6.1.2-beta1

nZeus commented 5 months ago

Thank you @xiaomi7732 ! 👍

dsempi commented 3 months ago

@xiaomi7732: Can we reopen the issue? There were breaking changes between KubernetesClient 10 and 11. Specifically, KubernetesClient.Models and KubernetesClient.Basic are no longer separate and are now part of the KubernetesClient assembly itself. Could we please have a version that references the newer version of the KubernetesClient assembly? Thank you!

xiaomi7732 commented 3 months ago

@dsempi, thanks for sharing the details. That's kind of issue that would happen upon losing the dependencies. I could bump up the client to use K8sClient 11 or even 12, but there's still potential problems for 13 and 14 in the future. It looks like we will have to have a strategy to depend on various versions.

One way is to sim-ship multiple versions, the problem is that it will greatly increase the maintains cost.

Maybe a way out is to ship a major version that maps to the k8s client major versions, and then, keep up with the latest?

For example, the current 2.x will keep using 10.x; and then, release a 11.0 to use 11; release a 12.0 to use 12; release a 13.0 to use 13. Moving forward, we will update the dependencies gradually only for 13. And hopefully, when 14 releases, we will be ready to move on to use 14 already? Will that work?

Or, maybe we could always maintain x-1 unless requested. That means, when K8sClient 13 is current, we dependent on >= 12, unless requested by github issue?

What do we think will be a good version to dependent on at this point?

samuelcadieux commented 3 months ago

Hey, just chiming in since I was about to report the same issue.

I was thinking about a possible way to solve this and the only thing I could come up with would be to I define a contract in this library (e.g. IKubernetesClient). And then have a separate library that would implement the contract (e.g. Microsoft.ApplicationInsights.KubernetesClient).

Then you could add a template argument for AddApplicationInsightsKubernetesEnricher that has to extend IKubernetesClient:

services.AddApplicationInsightsKubernetesEnricher<ApplicationInsights.KubernetesClient.V12>(...)

The goal would be to benefit from fixes in this library regardless of vesion of KubernetesClient.

Beside the root issue, this has shed some lights on another issue with the while loop in K8sInfoBootstrap. More details in the bug report I was drafting below:


Describe the bug Running in AKS with a k8s cluster on version 1.27.7.

In the process of migrating from dotnet 7 to dotnet 8 we had to upgrade Microsoft.ApplicationInsights.Kubernetes to 6.1.2 and we've attempted to update KubernetesClient to 11 as it's recommended by the version compatbility table.

However upgrading KubernetesClient to 11 but it lead to our service to getting stuck in a very undescription manner.

After upgrading, we noticed our service would hang during startup, with this callstack:

   at Microsoft.AspNetCore.Server.Kestrel.Core.KestrelServerImpl.BindAsync(CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Server.Kestrel.Core.KestrelServerImpl.StartAsync[TContext](IHttpApplication`1 application, CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Hosting.GenericWebHostService.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.Internal.Host.<StartAsync>b__15_1(IHostedService service, CancellationToken token)
   at Microsoft.Extensions.Hosting.Internal.Host.ForeachService[T](IEnumerable`1 services, CancellationToken token, Boolean concurrent, Boolean abortOnFirstException, List`1 exceptions, Func`3 operation)
   at Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.Run(IHost host)

After adding some logging, we found that the last Background Service starting before the hang was K8sInfoBackgroundService.

I suspect the hang is due to:

    protected override Task ExecuteAsync(CancellationToken stoppingToken)
        => _k8SInfoBootstrap.ExecuteAsync(stoppingToken);

This has the potential to execute the chain of function call that create the task synchronously until the first await is encoutered. To guard against this situation we might want to consider wrapping this call in Task.Run(() => ...) or adding an await Task.Yield() in the Task to ensure it's not blocking the startup.

As of the root cause of the issue, it is due to this loop: https://github.com/microsoft/ApplicationInsights-Kubernetes/blob/beb14d06249bee81c433344d55414e5ad5a309ae/src/ApplicationInsights.Kubernetes/K8sInfoBootstrap.cs#L53

This while loop essentially performs a spinlock logging those messages over-and-over:

[Debug] [2024-03-20T15:52:44.5772198Z] Starting update K8sEnvironment
[Error] [2024-03-20T15:52:44.5773839Z] Unexpected error happened. Telemetry enhancement with Kubernetes info won't happen. Message: Could not load file or assembly 'KubernetesClient.Models, Version=10.0.0.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified.
[Trace] [2024-03-20T15:52:44.5774471Z] Unexpected error happened. Telemetry enhancement with Kubernetes info won't happen. Details: System.IO.FileNotFoundException: Could not load file or assembly 'KubernetesClient.Models, Version=10.0.0.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified.
File name: 'KubernetesClient.Models, Version=10.0.0.0, Culture=neutral, PublicKeyToken=null'
   at Microsoft.ApplicationInsights.Kubernetes.K8sEnvironmentFactory.CreateAsync(CancellationToken cancellationToken)
   at Microsoft.ApplicationInsights.Kubernetes.K8sEnvironmentFetcher.UpdateK8sEnvironmentAsync(CancellationToken cancellationToken)
   at Microsoft.ApplicationInsights.Kubernetes.K8sInfoBootstrap.ExecuteAsync(CancellationToken cancellationToken)

So it seems like Microsoft.ApplicationInsights.Kubernetes is hardwired to work with KubernetesClient 10.x.

We have tried enabling AutoGenerateBindingRedirects on our service but this didn't work.

To solve our issue we had to roll back our version of KubernetesClient.

We were considering doing a PR with similar changes that was proposed originally in https://github.com/microsoft/ApplicationInsights-Kubernetes/pull/358 originally. But this would break backward compatbility as you've pointed out.

Package Versions Application Insights Kubernetes Version: 6.1.2 Application Insights SDK Version: 2.22.0

Expected behavior

dsempi commented 3 months ago

@xiaomi7732: The versioning strategy, wherein KubernetesClient versions are mapped to ApplicationInsights.Kubernetes versions, appears feasible. However, there's a notable drawback: users who must stick with an older KubernetesClient version may miss out on the latest features/fixes of ApplicationInsights.Kubernetes. Upon examining the usage of KubernetesClient within the ApplicationInsights.Kubernetes codebase, it's evident that the dependency isn't overly tight. Consequently, the impact of a breaking change in KubernetesClient (as occurred between versions 10 and 11) is relatively low. A proposed versioning approach could be as follows: Version 6.1.1 remains dependent on KubernetesClient versions <10, 11). Subsequently, a version 6.2.0 could be produced, which would support KubernetesClient versions <11, 14). Upon the release of a new KubernetesClient version 14, compatibility checks can be conducted. If the compatibility is confirmed, Version 6.2.1 could be created, with support for KubernetesClient versions <11, 15). If not, Version 6.3.0 could be developed, with support for KubernetesClient <14, 15). This approach aims to balance maintenance, compatibility and access to the latest features/fixes for users of ApplicationInsights.Kubernetes. What are your thoughts on this strategy?

xiaomi7732 commented 3 months ago

@dsempi Yes, there's going to be drawback for existing users. And I appreciate you observed it that we don't have an over tight dependency on K8s client. I like your proposal, it looks like a balanced compromise. But I want to explorer a little bit on @samuelcadieux idea as well.

@samuelcadieux firstly, could you please file the other issue separately? With regarding the versioning, will your approach require reference to multiple versions of K8sClient at the same time? Is there any known project uses something like that? Or do I misunderstand your proposal?

xiaomi7732 commented 3 months ago

Add a bit background for a project to reference multiple versions of the same NuGet package, it doesn't seem the mechanism would just work: https://github.com/NuGet/Home/issues/12400

xiaomi7732 commented 3 months ago

More investigation result: The current code will keep working until K8sClient.12.0.16. It will start to break for versions above 12.1.1.

The current consideration is to split the versions like this:

Application Insights Kubernetes Kubernetes Client Remarks
6.1.3 [10.0.16, 12.1.0) No feature work
7.0.0 >=12.1.1 Until another compatibility issue shows up.

I'll still need to verify that there's no break change >= 12.1.1.

xiaomi7732 commented 3 months ago
xiaomi7732 commented 3 months ago

I am about to move forward with those PRs, any push back? Thanks!

dsempi commented 3 months ago

@xiaomi7732: Looks good to me. Could we get the release of 7.0.0 in the same time as 6.1.3?

xiaomi7732 commented 3 months ago

@dsempi That's exactly what I am planning to do. I am also figuring out where to put the information about which version to pick for the end user.

dsempi commented 3 months ago

@xiaomi7732: I guess, a small note in README.md should be sufficient. The NuGet itself will have different version of KubernetesClient as a dependency, so a consumer should know which one to pick.

xiaomi7732 commented 3 months ago

Please find out the beta packages:

xiaomi7732 commented 3 weeks ago

Stable versions released:

I'll close this issue. Feel free to file new ones if there's anything else.