dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.07k stars 4.69k forks source link

[Feature]: StartupHook dependency resolving #66138

Open RassK opened 2 years ago

RassK commented 2 years ago

We are looking for a way in .NET Core to control a dependency versioning. In previous .NET Framework it was easily possible via binding redirects but in .NET Core there is no way at all.

Our library (OpenTelemetry .NET Auto-Instrumentation) is attaching to the application. Unfortunately we cannot get away without dependencies. It's OK to vendor stateless libs but libraries like System.Diagnostics.DiagnosticSource contain a state that needs to be the same [instance] all over the boundaries.

We do use AdditonalDeps, but unfortunately it doesn't have priority over application. When dependency is left undefined, we can upgrade it to the necessary version.

If it raises performance / security concerns, we'd appreciate if it's controllable via environment variable. DevOps can choose to opt in to this feature.

deps


CC: @nrcventura @pjanotti @pellared @zacharycmontoya @rajkumar-rangaraj

dotnet-issue-labeler[bot] commented 2 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

jkotas commented 2 years ago

In previous .NET Framework it was easily possible via binding redirects but in .NET Core there is no way at all.

In .NET Framework, you had to edit the application config files to add the binding redirects. Would a solution that requires editing application config files work for you in .NET Core?

RassK commented 2 years ago

In previous .NET Framework it was easily possible via binding redirects but in .NET Core there is no way at all.

In .NET Framework, you had to edit the application config files to add the binding redirects. Would a solution that requires editing application config files work for you in .NET Core?

The way that requires editing existing config file isn't perfect, but definitely one way to do it. We need this to be full devops scenario. So overwriting the config file moves some complexity to a client's CI pipline, which isn't exactly great, as we need to provide info how to do it exactly and there are loads of different CI systems out there.

ghost commented 2 years ago

Tagging subscribers to this area: @vitek-karas, @agocke, @vsadov See info in area-owners.md if you want to be subscribed.

Issue Details
We are looking for a way in .NET Core to control a dependency versioning. In previous .NET Framework it was easily possible via binding redirects but in .NET Core there is no way at all. Our library ([OpenTelemetry .NET Auto-Instrumentation](https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation)) is attaching to the application. Unfortunately we cannot get away without dependencies. It's OK to vendor stateless libs but libraries like `System.Diagnostics.DiagnosticSource` contain a state that needs to be the same [instance] all over the boundaries. We do use [AdditonalDeps](https://github.com/dotnet/runtime/blob/main/docs/design/features/additional-deps.md), but unfortunately it doesn't have priority over application. When dependency is left undefined, we can upgrade it to the necessary version. If it raises performance / security concerns, we'd appreciate if it's controllable via environment variable. DevOps can choose to opt in to this feature. ![deps](https://user-images.githubusercontent.com/4929112/156556322-a2b58b85-6eb2-4860-a891-934ee77dee84.png) ---- CC: @nrcventura @pjanotti @pellared @zacharycmontoya @rajkumar-rangaraj
Author: RassK
Assignees: -
Labels: `area-AssemblyLoader-coreclr`, `untriaged`
Milestone: -
jkotas commented 2 years ago

The way that requires editing existing config file isn't perfect, but definitely one way to do it.

I think that is the best way to do it. There is a lot of policy that goes into patching application dependencies. For example, what happens if the application bundles a private build of System.Diagnostics.DiagnosticSource? Editing the application, or even better - telling the application to reference the required version of System.Diagnostics.DiagnosticSource in the first place, is the most flexible solution that allows you to deal with all corner cases and produce the right diagnostic for it.

RassK commented 2 years ago

The one option is to reuse additional-deps.json and do merge action with application deps, with highest version win. If that happens early enough, then it should be the same, as modifying application deps directly. This way could also benefit plugin based systems, if there is no need to update core system's deps. Currently some systems I know are using an extra restart to fix this issue.

Just from architectural point of view, I would keep dependencies config close to it's logic. Instead of doing pre merge with main application.

nrcventura commented 2 years ago

We haven't fully explored or tried to address this problem yet, but users of OpenTelemetry .NET Auto-Instrumentation may not always have access to an application config (depending on which type of application config you are referring to). Consider a tool like Sharepoint or SiteCore (ignoring whether it runs on .NET Framework or .NET for now), these tools are installed and are then used to serve up content. Many times the users of these tools are not writing or publishing code. However, it is still desirable to extract the telemetry data of these tools to ensure that they are running in the way the users are expected them to run (which is the purpose of OpenTelemetry .NET Auto-Instrumentation). As a result, someone who maintains the installation of these tools needs a way to configure the system either through environment variables or some sort of config file, so that these dependencies can be managed appropriately without requiring rebuilding the application or updating some CI pipeline.

jkotas commented 2 years ago

Performing a silent upgrade of a library in an existing application has a good chance of breaking the application. We do not guarantee dotnet runtime libraries to be bug-for-bug compatibility between major versions. Also, AOT compilation of the application may make assumptions about the exact versions of the libraries used by the app.

The high compatible option for auto-instrumentation solutions is to work with the library versions included in the application, without upgrading them. It means that the auto-instrumentation solution may need to have multiple code paths to deal with different library versions included in the application.

RassK commented 2 years ago

Performing a silent upgrade of a library in an existing application has a good chance of breaking the application

But the silent upgrade is used anyway in the build time? Looking at the example > App -> Library 1 -> Dependency X (Version 1) App -> Library 2 -> Dependency X (Version 2) Customer's App takes reference only to the Dependency X Version 2. Meaning Library's 1 Dependency X is silently upgraded to version 2.

Also, AOT compilation of the application may make assumptions about the exact versions of the libraries used by the app.

I think it doesn't need to work with AOT at all. If users decide to go with AOT, they can take another path and invest more into taking direct references.

The high compatible option for auto-instrumentation solutions is to work with the library versions included in the application, without upgrading them. It means that the auto-instrumentation solution may need to have multiple code paths to deal with different library versions included in the application.

I'm afraid that creating a managed bridge between 2 different versions of the same dependency (one that is brought by an existing app and another via StartupHook) has much higher chances of breaking the state. I could imagine doing it with stateless libs. This is about logic, but what about performance? - syncing could be very expensive.

jkotas commented 2 years ago

Customer's App takes reference only to the Dependency X Version 2. Meaning Library's 1 Dependency X is silently upgraded to version 2.

Upgrading and unifying the library versions at application build time is fine. The assumption is that the whole app will get tested before getting deployed and any potential breaks will get addressed before that.

I'm afraid that creating a managed bridge between 2 different versions of the same dependency

The idea that the auto-instrumentation solution attaches to the library used by the app. It can be done e.g. by auto-instrumentation solution targeting lowest version of the library and lighting up for newer versions using reflection.

IIRC, it is what some existing auto-instrumentation solutions are doing to deal with this problem.

RassK commented 2 years ago

Upgrading and unifying the library versions at application build time is fine. The assumption is that the whole app will get tested before getting deployed and any potential breaks will get addressed before that.

Our approach would be very similar. It's anyway hard to imagine that hooking a StartupHook would be 100% safe. The cycle has to go through the same steps.

The idea that the auto-instrumentation solution attaches to the library used by the app. It can be done e.g. by auto-instrumentation solution targeting lowest version of the library and lighting up for newer versions using reflection.

Unfortunately the OTel SDK targets almost the latest version of System.Diagnostics.DiagnosticSource due lacking features in previous versions. Plus it wouldn't be the only one to be bridged. That could cause dependency proxying hell, if the dependency of dependency needs to be proxied... and so on.

IIRC, it is what some existing auto-instrumentation solutions are doing to deal with this problem.

As far as I know most of them just avoid taking this kind of dependencies.

reyang commented 2 years ago

@rajkumar-rangaraj FYI

pjanotti commented 2 years ago

As pointed out by @RassK, the main drawback of using the reflection approach is that the Auto-Instrumentation needs to leverage the OTel SDK and related instrumentation packages. While we could do some patching to make it work with whatever version is loaded, it seems a brittle and costly approach. @rajkumar-rangaraj shared a prototype with this approach. An issue tracks some of the problems.

@jkotas

Would a solution that requires editing application config files work for you in .NET Core?

If we could have a dotnet CLI tool to do the equivalent of a package resolution updating the application deps.json, I think we would hit a good 80/20 solution. We wouldn't cover AOT, but that is a low priority for our project.

The assumption is that the whole app will get tested before getting deployed and any potential breaks will get addressed before that.

Our scenario for this feature is DevOps, in which a rebuild from sources is not possible. Typically the application goes through some staging/testing deployment before reaching production.

jkotas commented 2 years ago

I understand what you would like to achieve. I have concerns about the consequences of what you are proposing.

We have been very intentional in keeping .NET core runtime and libraries major versions to be side-by-side and avoid automatically updating applications to newer major versions by default. It is required to allow us to evolve and innovate that always comes with intentional or unintentional breaking changes.

If we start introducing architecture where the new major versions have to be bug-for-bug compatible with previous versions, it will lead to very low tolerance for risks and stifle innovation. It is the place where .NET Framework ended up and we really do not want .NET Core to end up in the same place.

Our scenario for this feature is DevOps, in which a rebuild from sources is not possible.

How are doing DevOps for statically compiled languages like golang or Rust that are increasing in popularity and where you have to rebuild the app to change it? I would like to see the same or similar approach and architecture to be adopted for .NET.

pellared commented 2 years ago

@jkotas First of all, I want to thank you for your involvement in this thread 👍

Your concerns make sense. We do not insist on adding new functionalities to the runtime (we need to support .NET 6.0 anyway). It might be even best if we come up with a solution (it could be a workaround) together. As OTel .NET maintainers, we want to keep in touch with you and have some sort of synergy.

How are doing DevOps for statically compiled languages like golang or Rust that are increasing in popularity and where you have to rebuild the app to change it?

These are using the manual instrumentation approach which requires a lot of developers' involvement. This approach makes the OTel adoption a lot harder.

Side note: I am an OTel Go approver and I help companies adopt it.

I would like to see the same or similar approach and architecture to be adopted for .NET.

Most of the users that we are aware of (at least our customers) are constantly asking for the OTel .NET auto-instrumentation. They do not want to touch the production source code like in Go or Rust. They want to use the same (or at least similar) approach as OTel Java. Because .NET does not provide an exact equivalent to the Java Agent functionality, most vendors are creating .NET Profiler in order to reJIT the code.

However using .NET Profiler also comes with drawbacks and we were looking for new functionalities like startup hook, additional dependencies, etc. in order to have a path to get away from .NET Profiler. The Host startup hook doc even mentions that it was created for such use case

The hook could be used to set up tracing or telemetry injection [...] . The hook is separate from the entry point, so that user code doesn't need to be modified.

This hook is meant as a low-level, powerful way to inject code into the process at runtime, for use by tool developers who truly have a need for this kind of power. It should only be used in situations where modifying application code is not an option and there is not an existing structured dependency injection framework in place. An example of such a use case is a hook that injects logging, telemetry, or profiling into an existing deployed application at runtime.

But it looks like the caveats "No dependency resolution for non-app assemblies" and "No conflict resolution for dependencies shared by hooks or the app" is very troublesome for us.

Here are our current workarounds for these problems https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation/blob/main/docs/troubleshooting.md#handling-of-assembly-version-conflicts

Maybe you have some better idea? We want to avoid vendoring (copying the code) of all of the dependencies as well as using the .NET Profiler. We would love to have a "runtime configuration" solution. However, I understand that it might be impossible (at least for .NET 6.0). I imagine that having some tool that e.g. modifies the .NET binaries or some tool that changes config files (like @pjanotti suggested) could be sometimes an easier solution than e.g. requesting users to add some package references manually. Still, we are open to any advice/suggestions/comments.

@open-telemetry/dotnet-instrumentation-approvers - Please thumbs up if I am correct. Otherwise, please just make a follow-up comment (or PM me to make an edit)

FYI @open-telemetry/dotnet-approvers

RassK commented 2 years ago

I understand what you would like to achieve. I have concerns about the consequences of what you are proposing.

We have been very intentional in keeping .NET core runtime and libraries major versions to be side-by-side and avoid automatically updating applications to newer major versions by default. It is required to allow us to evolve and innovate that always comes with intentional or unintentional breaking changes.

If we start introducing architecture where the new major versions have to be bug-for-bug compatible with previous versions, it will lead to very low tolerance for risks and stifle innovation. It is the place where .NET Framework ended up and we really do not want .NET Core to end up in the same place.

Most importantly I would not ask this as a default behaviour. The concerns you are referring to are correct and should not be avoided. I'm currently thinking of an advanced dependency control that can be turned on via environment variable. Just like the Native profiler, StartupHook and Shared Dependency Store are advanced features with restrictions, dependency control could also point something like "you must know what you are doing. There is no bug-to-bug compatibility. etc".

How are doing DevOps for statically compiled languages like golang or Rust that are increasing in popularity and where you have to rebuild the app to change it? I would like to see the same or similar approach and architecture to be adopted for .NET.

Actually I would love to see .NET as more advanced and mature in extendibility, pluging and unpluging modules. Forking and keeping these in sync with upstream is very painful. Specially if you need to do minor modifications and then redo the whole packing.

jkotas commented 2 years ago

Maybe you have some better idea?

For .NET, you should be able to reduce auto-instrumentation of the application during build time into adding a single line (reference to the auto-instrumentation nuget package) to the app .csproj file. Adding a single line to the .csproj does not sound like a lot of developers' involvement.

I would update the documentation to strongly encourage dealing with auto-instrumentation and version mismatches during build time and describe the solutions that require editing of already published apps as last resort.

We want to avoid vendoring (copying the code) of all of the dependencies as well as using the .NET Profiler

Bundling the dependencies with the telemetry engine and isolating the telemetry engine from the application itself as much as possible is a robust solution for situations where you want to inject itself into the unenlightened app with potentially conflicting dependencies. You can do it by loading the telemetry engine into a its own assembly load context that is going to avoid conflicts with versions used by the application. The startup hook can then be just a simple loader that creates the assembly load context and loads the full engine into it. I understand that this does not work great for cases where you need to share the types or state with the application. You need to use reflection or have multiple versions of your engine compiled against different versions of the dependencies for that.

"No dependency resolution for non-app assemblies" and "No conflict resolution for dependencies shared by hooks or the app" is very troublesome for us.

This was intentional design decision for startup hooks. We expect complex startup hooks to use AssemblyLoadContext to isolate their dependencies from the application dependencies.

Most importantly I would not ask this as a default behaviour

If the popular telemetry solutions start depending on auto-upgrading of app dependencies, it would become de-facto default behavior and we would have to take that into account when evolving the .NET runtime and libraries.

notcool11 commented 2 years ago

At my company I built a NuGet package that wrapped a Jaeger implementation about three years ago. To implement, you literally have to install the package and make some web.config changes....and adoption has been terrible.

The current version is based on the MS otel package and is optimized for container deployments. Requires a one liner in the startup.cs and using an existing pipeline template...yep, adoption still sucks. But I'm okay with that, I'd rather a team plan-fully implement observability and consider the metrics and attributes they care about than spend a bunch of money on licensing to store data no one is using.

If teams don't care enough to spend a sprint of effort adding observability goodness to their app then let them fail.

In place tracing without a redeployment smells like a service mesh or a big expensive APM agent. Doesn't seem to belong here.

nrcventura commented 2 years ago

For .NET, you should be able to reduce auto-instrumentation of the application during build time into adding a single line (reference to the auto-instrumentation nuget package) to the app .csproj file. Adding a single line to the .csproj does not sound like a lot of developers' involvement.

I would update the documentation to strongly encourage dealing with auto-instrumentation and version mismatches during build time and describe the solutions that require editing of already published apps as last resort.

Even with the build-time solution in place and being promoted as the highly recommended solution, there are use cases where a build-time solution is just not feasible. For example, when company A purchases software from company B and "installs" it in company A's environment, the software from company B is already built. However, company A still wants to collect telemetry data from that software. In that scenario, I think that it would be acceptable for company A to change some sort of configuration (file or environment) to opt-in to forcing a different dependency to load, and should treat any such opt-in as if it was building a custom version of the software.