dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Proposal: Static Linking #162

Open stephentoub opened 9 years ago

stephentoub commented 9 years ago

(Note: this proposal was briefly discussed in #98, the C# design notes for Jan 21, 2015. It's a very rough idea at the moment.)

Background

C# today supports dynamic linking: one assembly can reference another, and the referenced assembly is loaded at run time and made available to the referencing assembly.

Problem

Helper functionality, and in particular extension methods, often gets put into helper libraries that other assemblies can use. These helper libraries often grow significantly in size, while a given project that uses one may need only a few of the helpers it contains. Yet the only real options available to a developer today are to expose the helpers as source that can be selectively compiled into the consuming assembly, or to distribute the helper library as a DLL with the project, even though only a small portion of it is being used.

Solution

We should consider adding some form of static linking to the C# compiler. When project A adds a static reference to assembly/project B, rather than actually adding a reference to the resulting assembly, the compiler would copy the IL from the referenced assembly into the consuming assembly. As a result, the referenced assembly would not need to be distributed with the referencing assembly, and only the functionality the referencing assembly used would be brought in.

There are of course complications with this. For example, if a method being used via static linking uses a static field, what becomes of that field, especially if multiple assemblies in the same project statically link against the same library? Such issues would need to be explored. A potential approach would be to have a new attribute, e.g. [StaticallyLinkable], which would need to be applied to any type/method/field/etc. that supported being statically linked against, essentially forcing the developer of the library to opt-in to being statically linked such that s/he designed it with static linking in mind. When consuming functionality via a static link, the compiler would verify that all entities reachable from the functionality being referenced are either themselves [StaticallyLinkable] or are dynamically referenced by the consuming assembly. Etc.
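As a very rough sketch of the opt-in idea (the attribute, its name, and the compiler enforcement described here are all hypothetical; nothing below exists today):

```csharp
using System;

// Purely illustrative: this attribute does not exist in the framework.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method | AttributeTargets.Field, Inherited = false)]
public sealed class StaticallyLinkableAttribute : Attribute { }

// A helper library opts specific members in to being statically linked against.
[StaticallyLinkable]
public static class StringHelpers
{
    // Good candidate: pure, and reaches no static state.
    [StaticallyLinkable]
    public static bool IsBlank(this string s) => string.IsNullOrWhiteSpace(s);

    // Awkward candidate: a static field raises the question above of what happens
    // when multiple assemblies in the same process each link in their own copy.
    private static int s_invocationCount;
}
```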

scalablecory commented 9 years ago

+1. May be able to take some inspiration from C++'s "One Definition Rule" that their link-time compilers use to static link shared dependencies.

foxesknow commented 9 years ago

Would there be a way to disable this if I'm the author of project B? I may not want my intellectual property baked into project A.

damianh commented 9 years ago

I don't know the protocol for showing support for issues, and I'm wary of cluttering up GitHub issues with noise, so if this is inappropriate please let me know.

+1. As a library developer, not being able to internalize non-transitive dependencies at compile time is a major source of pain. I don't need to explain the problems with ILMerge etc. here as they are detailed at https://github.com/aspnet/XRE/issues/819

@foxesknow This technically won't be possible. You can't prevent it now either. You can only restrict such via licensing. See GPL and related.

aolszowka commented 9 years ago

Static Linking brings along all of the baggage that dynamic linking fixes specifically:

In today's world with dynamic linking you have two options:

  1. Drop the new assembly into the GAC and create a custom App.config with the appropriate bindingRedirect tag (https://msdn.microsoft.com/en-us/library/eftw1fys%28v=vs.110%29.aspx) and go (a rough sketch of such a redirect follows this list).
  2. Compile, or in more extreme cases back-port, the necessary bug/security fix, compile with the same version, and dump those into the GAC (after removing the "bad" ones).
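For reference, option 1 amounts to an App.config fragment roughly like the following (the assembly identity and version numbers are placeholders):

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Placeholder identity: substitute the real assembly name and public key token. -->
        <assemblyIdentity name="Some.Helper.Library"
                          publicKeyToken="0123456789abcdef"
                          culture="neutral" />
        <!-- Redirect all old versions to the serviced build dropped into the GAC. -->
        <bindingRedirect oldVersion="0.0.0.0-1.9.9.9" newVersion="2.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```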

Both of these fixes can be done by an end user (most likely a System Administrator) without the need to involve the original developers or a completely new drop.

Today you can take an update on the .NET Framework and in theory benefit from a performance update or an incremental bug fix without recompiling your code.

Now let's say that the developer had the ability to compile statically and chose to do so. The only recourse for an end user (beyond the insane option of disassembling the binary and rebuilding it themselves, or some creative hooks with Detours) would be to hope that the company that developed the software package cares enough to rebuild them a binary with the fix included.

I would propose that what most developers are really looking for is a Windows equivalent of dpkg/apt-get functionality to maintain complex library dependencies as well as deployment scenarios. On the development side you have NuGet, but there is no equivalent for the deployment side. Personally I think it's beyond the scope of the .NET Framework team to reinvent the wheel of dependency management, and frankly I think the requests for static linking are just an artifact of this issue and, in my opinion, a bad way to try to address it.

*Edit: I should add that if you truly want static linking, I feel a better way to address this would be to open up the source and update ILMerge (http://research.microsoft.com/en-us/people/mbarnett/ilmerge.aspx). From a development standpoint, if a third-party vendor attempted to ship us an ILMerged binary, I would look at them very suspiciously and voice my concerns about starting a long-term business relationship with them, for the reasons I listed above.

damianh commented 9 years ago

I don't think anyone is suggesting that supporting static linking would mean the end of dynamic linking.

if a third party vendor attempts to ship us a binary that is ILMerged I would look at them very suspiciously

I guess you oppose .NET native too.

aolszowka commented 9 years ago

GAC is dead in .NET Core.

Was this officially announced somewhere? I'm interested to see how they intend to do this, especially when you have apps that host the CLR as part of their plug-in system/extensibility point. The default Fusion probing rules do not lend themselves to killing the GAC. Is their intent to XCOPY-deploy the same libraries for every application you ship? That seems like a really good way to bloat your installers (something people already complain about when having to ship the entire VCRT), especially if you have several shared components.

Dropping in replacement libs is not without its own set of problems

I don't believe I said it was problem free? I believe my point was that you are able to do it. Statically linking your binaries prevents you from doing this without an extreme workaround. I'm not excited at the prospect of everyone having their own copy of System.Collections.Generic baked into their application.

This would also go against the mantra that Habib Heydarian (speaking for the .NET Platform Team) spoke about at last year's //Build conference (the Desktop Development panel), in which he stated: "The motto is that your applications just get better as you upgrade to the latest version of .NET. We'd like you to do that without you doing any work in fact. So by just upgrading to version 4.5.1 high DPI just works, or your performance issues just start to go away".

Unless Microsoft Research has been hiding an amazing tool that allows us to correct the mistakes of programmers in the past who statically linked shared libraries, I'm not sure how they're going to accomplish that without dynamic linking.

I guess you oppose .NET native too.

I don't think I ever said that? Where was that posted?

terrajobst commented 9 years ago

@aolszowka

GAC is dead in .NET Core. Was this officially announced somewhere?

We don't have official documentation yet but my blog post provides a good overview, especially section "Machine-wide frameworks versus application-local frameworks" (emphasis added here):

The NuGet based delivery also turns the .NET Core platform into an app-local framework. The modular design of .NET Core ensures that each application only needs to deploy what it needs. We’re also working on enabling smart sharing if multiple applications use the same framework bits. However, the goal is to ensure that each application is logically having its own framework so that upgrading doesn’t interfere with other applications running on the same machine.

The GAC is a concept that is only applicable to the .NET Framework. The .NET Core platform doesn't have this concept and we currently don't plan on adding it, for the reasons stated above.

This would also go against the mantra that Habib Heydarian (speaking for the .NET Platform Team) spoke about at last years //Build Conference (the Desktop Development Panel) in which he stated "The motto is that your applications just get better as you upgrade to the latest version of .NET."

That's totally true but .NET Core isn't a higher version of the .NET Framework. It's conceptually a different platform. We'll try our best to make the migration experience as smooth as possible but .NET Core required some changes to the very bottom of the platform in order to incorporate the new scenarios, especially around open source and agility. However, you'll be able to author libraries that can work on either platform. We consider this ability super critical because it is part of our continuous effort to unify the different .NET platforms, at least by API shape.

Unless Microsoft Research has been hiding an amazing tool that allows us to correct the mistakes of programmers in the past who statically linked shared libraries I'm not sure how they're going to accomplish that without dynamical linking.

Static linking poses challenges for servicing. But servicing shared components that are potentially used by a large number of applications on the same machine also has issues. We got burned by this as well, in the sense that an update broke existing applications. However, as my post said, we very much care about servicing critical security updates:

While app-local deployment is great for isolating the impact of taking dependencies on newer features it’s not appropriate for all cases. Critical security fixes must be deployed quickly and holistically in order to be effective. We are fully committed to making security fixes as we always have for .NET.

Does this help?

avanderhoorn commented 9 years ago

@terrajobst I know this (https://github.com/aspnet/XRE/issues/819) issue is linked above, but have you had a chance to have a look to see the set of problems we have there? They are different to a lot of the reasons I've seen here for static linking and it would be good to get your perspective.

Joe4evr commented 9 years ago

I would propose what most developers are really looking for is a Windows equivalent of dpkg/apt-get functionality to maintain complex library dependencies as well as deployment scenarios. On the development side you have NuGet, but there is no equivalent for the deployment side.

So I take it you haven't heard of Chocolatey yet?

aolszowka commented 9 years ago

@Joe4evr

So I take it you haven't heard of Chocolatey yet?

I actually have, and we use it internally to setup new developers, but forgot about it. Thank you for the link.

@terrajobst

We don't have official documentation yet but my blog post provides a good overview

Thank you for the link. I find it interesting that a lot of the issues you mentioned in the first half of the "Machine-wide frameworks" section were issues that would have been avoided had 4.5.x used side-by-side deployment similar to .NET 3.0, 3.5, and 4.0. I seem to remember questioning the statement of a "highly compatible, in place upgrade" when 4.5 was announced; it's good to see that I wasn't the only one.

Forgive me, but I still don't see how this change means the GAC is dead; if anything, it means the GAC is much more critical to your vision of applications with their own individual frameworks. Consider the following scenario:

You now have two options in this scenario:

If you attempted an XCOPY Deploy in this scenario you'd be right back in "DLL Hell", something the GAC purported to fix.

This post also does not address the scenario I mentioned above:

I'm interested to see how they intend to do this, especially when you have apps that host the CLR as part of their plug-in system/extensibility point. The default Fusion probing rules do not lend themselves to killing the GAC.

Consider the following application:

MyCoolRuntime.exe (located in C:\Program Files\My Cool\MyCoolRuntime.exe), a distinct language/runtime, hosts the CLR to provide interop with all the cool features that come with the .NET Framework. Now consider that the developers of this new runtime did nothing to override the default Fusion assembly loading rules, most specifically the probing rules (https://msdn.microsoft.com/en-us/library/yx7xezcf%28v=vs.110%29.aspx). Without the use of the GAC, the .NET Core library will attempt to load from:

This is obviously not the desired effect. Before you claim that this is only a hypothetical scenario, I can assure you that there is at least one runtime that exhibits the above behavior (but in the spirit of Raymond Chen's OldNewThing blog we won't name names). The "correct fix" (assuming you really want to kill the GAC) in this scenario is to force the runtime writers to support codeBase (https://msdn.microsoft.com/en-us/library/efs781xb%28v=vs.110%29.aspx), which in some cases may very well be impossible (for example, the company is no longer in business, or no longer supports an older version of their runtime). This also means groveling in the appropriate app.config in a folder you don't technically own (assuming they do not provide a programmatic interface).
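For illustration, supporting codeBase would mean someone adding something along these lines to the host's app.config; the assembly identity, token, and path are placeholders, not values from any real runtime:

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Placeholder identity and path: the real runtime would need its own values. -->
        <assemblyIdentity name="MyCoolInterop" publicKeyToken="0123456789abcdef" />
        <codeBase version="1.0.0.0"
                  href="file:///C:/Program Files/My Cool/Libraries/MyCoolInterop.dll" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```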

The only other comment I'd make on that post (I'll save the rest to reply directly on the MSDN Blogs) is how you'll handle versioning with hotfixes. You've already mentioned in the blog post that you'll use Semantic Versioning, but that doesn't address how you'll handle deployed applications that need in-field servicing. Remember that any version bump (regardless of MAJOR.MINOR.BUILD.REVISION) causes you either to:

-or-

Based on what I've seen here it seems like you are leaning more towards bullet point 1 (recompiling all projects to take the update; this is taken from the comment of "can now prompt you to upgrade your references to System.Collections."), which would put you at extreme odds with large deployments. My understanding of how the .NET Framework team currently works around this is that they simply don't bump on hotfixes (honestly, I have not dug far enough into it to see what their solution was).

While app-local deployment is great for isolating the impact of taking dependencies on newer features it’s not appropriate for all cases. Critical security fixes must be deployed quickly and holistically in order to be effective. We are fully committed to making security fixes as we always have for .NET. Does this help?

I appreciate the fact that you are taking this into consideration, I am just more interested in how you intend to accomplish this.

Based on the above perhaps it is more helpful to think of .NET Core as another set of libraries from a third party (where the third party just happens to be Microsoft) and treat them with the same care and respect we would with any other third party library (such as Ionic.Zip).

To try and pull this back to the conversation at hand:

akoeplinger commented 9 years ago

@aolszowka

  • You have a large Enterprise Suite (100 Individual "Applications") that is shipped as a single package.
  • One of your newer "applications" takes a dependency on a newer version of the .NET Core Library to leverage all the cool new features you guys will be adding.
  • Someone (most likely management) dictates that the cost to re-test and re-certify the remaining 99 "Applications" is too high to force them to take an upgrade on the newer dependency.

This is exactly one of the scenarios .NET Core aims to make easier. Each application can ship their own set of .NET Core NuGet packages, which allows you to take advantage of new features without impacting other applications on the same machine.

You've already mentioned in the blog post you'll use Semantic Versioning, but that doesn't address how you'll handle deployed applications that need in-field servicing.

For serious security fixes only, the assembly loader in CoreCLR can override a vulnerable DLL with a fixed version (that got installed via Windows Update) when loading an assembly, without you needing to recompile. For other updates you simply get the new version via NuGet and ship it with your next application update.

It's also worth noting that there's no fusion in CoreCLR, so you need to leave the restrictions and mindset it brings behind.


To avoid getting even more off-topic, I think static linking has its benefits, even though it might not be the right tool everywhere. It's a tradeoff each developer needs to make when taking a dependency on an external assembly.

aolszowka commented 9 years ago

@akoeplinger

Each application can ship their own set of .NET Core NuGet packages

This particular application ships and is certified as a single application. While the "Applications" are distinct in their purpose, they are versioned and shipped together as they are intended to operate as one (each application has a certain area of concern, perhaps one that deals with Items, another that deals with Customers, etc.). I am still perplexed as to how they will ship their own sets of the .NET Core NuGet packages as they all sit in the same application folder.

For serious security fixes only, the assembly loader in CoreCLR can override a vulnerable DLL with a fixed version (that got installed via Windows Update) when loading an assembly, without you needing to recompile.

Is the code for this made public somewhere yet?

For other updates you simply get the new version via NuGet and ship it with your next application update.

Again business realities indicate that you cannot always retest the remaining 99 Applications within a short time frame (based on the above we're looking at a cadence of about 4 months, currently we ship every 6 weeks).

I would love to continue this conversation somewhere else to understand what the direction is here and how it can help us develop better code. We've gone very far off topic for this thread, and I appreciate your willingness to engage in conversation here.

damianh commented 9 years ago

@aolszowka

beyond relink and redeploy until someone provides a servicing solution

We call that continuous delivery. Currently .NET is difficult in this respect because of shared dlls and things that must be installed machine wide. Too much coupling, not anywhere near enough isolation.

Now, just because one lib has something statically linked into it does not mean it will force that all the way up the chain. Also, if you are using NuGet and any of the numerous popular open source packages (NHibernate and Ninject come to mind), you are almost certainly already using ILMerged dlls right now.

akoeplinger commented 9 years ago

@aolszowka

This particular application ships and is certified as a single application. [...] I am still perplexed as to how they will ship their own sets of the .NET Core NuGet Packages as they all sit in the same application folder.

That's not possible afaik; you need to put them in separate folders (I'd say your app layout is not what .NET Core focuses on right now, as the initial goal is enabling web applications for aspnet5).

Is the code for this made public somewhere yet?

Not yet I think.

Again business realities indicate that you cannot always retest the remaining 99 Applications within a short time frame (based on the above we're looking at a cadence of about 4 months, currently we ship every 6 weeks).

Yes, but in your case you really don't have 99 applications, you have one (albeit a large app). That depends on how you define "application" of course, but it's the way .NET Core works atm.

I would love to continue this conversation somewhere else to understand what the direction is here and how it can help us develop better code.

I think the proper place would be the .NET Foundation CoreFx Forums.

terrajobst commented 9 years ago

@aolszowka, @akoeplinger, @damianh:

Sorry to be a spoilsport, but I think we've exceeded the threshold of when the discussion becomes off topic. I think what @aolszowka is arguing for is having a GAC for .NET Core in order to support centralized servicing. In that case, I believe it would be better to open an issue under http://github.com/dotnet/corefx so that this discussion can focus on static linking.

I think it's OK to raise the concern that centralized servicing doesn't play well with static linking, which is totally true assuming the servicing story is limited to individual components.

An alternative design allows centralized servicing of the app itself. In this world, it's possible to push updates for any changes, including changes to the app itself, not just framework-like components that happen to live in the GAC. That's essentially the model that the touch based devices use (Windows Store/Phone, iOS, Android).

MgSam commented 9 years ago

I think static linking is helpful in the scenario that the proposal identifies. To prevent abuse, I agree that an assembly should have to opt in to allow it to be statically linked. I think this also addresses the concern about difficulty in applying security updates: anything substantial enough that it might later require security patches should not be able to be statically linked.

akoeplinger commented 9 years ago

@stephentoub can you elaborate a bit more about how this proposal is different/better than existing tools such as ILMerge?

@MgSam I'm not sure I understand how an additional attribute would prevent abuse, given that I already can ILMerge arbitrary assemblies today.

stephentoub commented 9 years ago

@akoeplinger, basically the proposal is to help address cases where you don't want the whole referenced assembly, but just a small part of it. Effectively it'd be like ILMerge'ing, but with tree-shaking away everything you don't use, built into the compiler, and with the intention that you really only use a small portion of the target assembly, e.g. a few extension methods out of a library of many, or a few P/Invokes out of a library of many, etc.
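As a rough sketch of the intent (the "static reference" and all of the names here are hypothetical, nothing the compiler or MSBuild supports today):

```csharp
// --- HugeHelpers.dll: imagine hundreds of helpers; the consumer needs exactly one. ---
namespace HugeHelpers
{
    public static class StringExtensions
    {
        public static string Truncate(this string s, int length)
            => s.Length <= length ? s : s.Substring(0, length);
    }
}

// --- Program.exe, which would add a hypothetical *static* reference to HugeHelpers ---
namespace ConsumerApp
{
    using HugeHelpers;

    class Program
    {
        static void Main()
        {
            // Under the proposal, only the IL for Truncate (and whatever it reaches) would
            // be copied into Program.exe; the rest of HugeHelpers would be tree-shaken away
            // and HugeHelpers.dll would not need to ship with the app at all.
            System.Console.WriteLine("statically linked".Truncate(6));
        }
    }
}
```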

RichiCoder1 commented 9 years ago

@stephentoub .Net native without the Nativy-ing, no?

adamralph commented 9 years ago

+∞

BrannonKing commented 9 years ago

The WPF-auto-generated application pack includes would need to be revamped to make this work.

paulomorgado commented 9 years ago

@stephentoub, are you thinking about something like NoPIA? Type equivalence wouldn't play a role here, and it would require interning implementations and not only interface definitions.

I see this being useful only for .NET Core and assuming only one assembly is doing this. If more than one assembly is interning the same methods, it would defeat the purpose.

I think .NET Core will force libraries to be more fine-grained and this won't really be a problem.

GeirGrusom commented 9 years ago

Effectively it'd be like ILMerge'ing but with tree shaking away everything you don't use, built into the compiler

Why does it have to be in the compiler?

A separate tool would be better than the compiler at this, as it could work recursively, which the compiler would be unable to do. Also, this feature would be very counter-productive during development and debugging. Users also have to know beforehand whether a library can be statically linked at all, so you can't just toggle "Statically link all libraries", because it could be prohibited by third-party licences.

diab0l commented 9 years ago

Now for a problem that for some reason has not been mentioned yet.

Statically linking all of the actually used IL runs into the same problem as .NET Native and other AOT solutions: because of reflection, there's no good way to automatically determine what IL is actually used and what isn't.

That means libraries which use reflection might behave differently when statically linked than when dynamically linked. The effects might be subtle. Ugh.
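A tiny, made-up example of the kind of code that defeats that analysis: nothing in the consuming assembly's IL names the target type, so a tree-shaker has no static edge to follow and could trim away a type that is actually needed at run time.

```csharp
using System;

static class PluginLoader
{
    // The type name arrives as data (a config value, user input, a naming convention),
    // so no static analysis of the IL can prove which types have to be kept.
    public static object CreateWriter(string typeName)
    {
        Type type = Type.GetType(typeName, throwOnError: true);
        return Activator.CreateInstance(type);
    }
}
```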

gafter commented 9 years ago

While I'm sympathetic to the use cases that motivate this proposal, the C# compiler feels like the wrong place to put the functionality.

whoisj commented 9 years ago

:+1: and does this mean we'll need to kill Reflection to make it work? 'cause I'd be so very happy if we did. :smirk:

masaeedu commented 8 years ago

@gafter Where would be a better place to put it? I imagine an alternative would be a tool that can take a bunch of compiled DLLs, inspect the IL, and perform the necessary surgery to produce a minimal output binary. Like a souped up ilmerge.

@whoisj @diab0l As mentioned in the proposal, even without reflection there are issues like static fields to contend with. Although the ability to magically tree-shake arbitrary bits of today's libraries into a tiny DLL would be incredible, users will probably end up having to specify some extra metadata in their code to guide the compiler as to what is statically linkable and what isn't. Code that uses reflection would not qualify.
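A made-up illustration of the static-field hazard: if two assemblies in the same process each statically linked this helper in, each would carry its own private copy of the counter and cache, which may or may not be what the library author intended.

```csharp
using System.Collections.Generic;

// Invented helper, not from any real library.
public static class IdGenerator
{
    // With a normal (dynamic) reference there is one copy of this state per loaded library.
    // With static linking, every consuming assembly would get its own copy of these fields,
    // so "unique" IDs could collide across assemblies.
    private static int s_next;
    private static readonly Dictionary<string, int> s_cache = new Dictionary<string, int>();

    public static int GetId(string key)
    {
        if (!s_cache.TryGetValue(key, out int id))
        {
            id = ++s_next;
            s_cache[key] = id;
        }
        return id;
    }
}
```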

gluck commented 8 years ago

Might be worth mentioning here that tree-shaking/linker has been implemented in ILRepack as of 2.1.0-beta1, you can try it out to see if the added complexity is worth it (not convinced myself). Doc/usage/gotchas: https://github.com/gluck/il-repack/tree/linker#linker

alrz commented 7 years ago

Is inlining relevant here? Could we inline functions in a statically linked library?

jcouv commented 7 years ago

Relates to https://github.com/dotnet/announcements/issues/30 (Introducing .NET IL Linker)

damianh commented 7 years ago

.NET IL Linker is for applications only. Static linking still needed for libraries.

gilescope commented 5 years ago

Dependency resolution is a nightmare in .NET these days. At least with static linking you'd know at compile/link time whether your program did or did not link. Someone suggested this would lead to fatter binaries - great, disk space is cheap. Coming from a Rust background (and a long .NET history too) I can tell you that static linking is a great place to be. With large solutions (100+ projects), fighting the dependencies is probably the hardest part of making progress in .NET. This is a shame, as otherwise the C# language is pretty productive.