dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Support single-file distribution #11201

Closed: morganbr closed this issue 4 years ago

morganbr commented 6 years ago

This issue tracks progress on the .NET Core 3.0 single-file distribution feature. Here's the design doc and staging plan for the feature.

mattwarren commented 6 years ago

Out of interest, how does this initiative compare to CoreRT? They seem like similar efforts?

Is it related to 'possibly native user code', i.e. will this still allow code to be JIT-compiled, not just AOT?

Also, I assume that the runtime components ('Native code (runtime, host, native portions of the framework..') will be the ones from the CoreCLR repo?

morganbr commented 6 years ago

You're asking great questions, but since this is still early in design, I don't have great answers yet.

Out of interest, how does this initiative compare to CoreRT? They seem like similar efforts?

There would likely be somewhat similar outcomes (a single file), but the design may have different performance characteristics or features that do/don't work. For example, a possible design could be to essentially concatenate all of the files in a .NET Core self-contained application into a single file. That's 10s of MB and might start more slowly, but on the other hand, it would allow the full capabilities of CoreCLR, including loading plugins, reflection emit and advanced diagnostics. CoreRT could be considered the other end of the spectrum -- it's single-digit MB and has a very fast startup time, but by not having a JIT, it can't load plugins or use reflection emit and build time is slower than most .NET devs are used to. It currently has a few other limitations that could get better over time, but might not be better by .NET Core 3.0 (possibly requiring annotations for reflection, missing some interop scenarios, limited diagnostics on Linux). There are also ideas somewhere between the two. If folks have tradeoffs they'd like to make/avoid, we'd be curious to hear about them.

Is it related to 'possibly native user code', i.e. will this still allow code to be JIT-compiled, not just AOT?

By "native user code," I meant that your app might have some C++ native code (either written by you or a 3rd-party component). There might be limits on what we can do with that code -- if it's compiled into a .dll, the only way to run it is off of disk; if it's a .lib, it might be possible to link it in, but that brings in other complications.

Also, I assume that the runtime components ('Native code (runtime, host, native portions of the framework..') will be the ones from the CoreCLR repo?

Based on everything above, we'll figure out which repos are involved. "Native portions of the framework" would include CoreFX native files like ClrCompression and the Unix PAL.

ayende commented 6 years ago

A single file distribution in this manner, even if it has a slightly slower startup time, can be invaluable for ease of deployment. I would much rather keep the full power of CoreCLR than be forced to give up some of it.

Some scenarios that are of interest to us: How would this work in terms of cross-platform support? I assume we'll have a separate "file" per platform?

With regards to native code, how would I be able to choose different native components based on the platform?

TheBlueSky commented 6 years ago

Some scenarios that are of interest to us: How would this work in terms of cross-platform support? I assume we'll have a separate "file" per platform? With regards to native code, how would I be able to choose different native components based on the platform?

@ayende, I'm quoting from @morganbr comment:

a possible design could be to essentially concatenate all of the files in a .NET Core self-contained application into a single file.

The current cross-platform story for self-contained applications is creating a deployment package per platform that you'd like to target, because you ship the application with the runtime, which is platform-specific.

mattwarren commented 6 years ago

@morganbr I appreciate you taking the time to provide such a detailed answer.

I'll be interested to see where the design goes; this is a really interesting initiative.

morganbr commented 6 years ago

I have a few questions for folks who'd like to use single-file. Your answers will help us narrow our options:

  1. What kind of app would you be likely to use it with? (e.g. WPF on Windows? ASP.NET in a Linux Docker container? Something else?)
  2. Does your app include (non-.NET) C++/native code?
  3. Would your app load plugins or other external dlls that you didn't originally include in your app build?
  4. Are you willing to rebuild and redistribute your app to incorporate security fixes?
  5. Would you use it if your app started 200-500 ms more slowly? What about 5 seconds?
  6. What's the largest size you'd consider acceptable for your app? 5 MB? 10? 20? 50? 75? 100?
  7. Would you accept a longer release build time to optimize size and/or startup time? What's the longest you'd accept? 15 seconds? 30 seconds? 1 minute? 5 minutes?
  8. Would you be willing to do extra work if it would cut the size of your app in half?

tpetrina commented 6 years ago
  1. Console/UI app on all platforms.
  2. Maybe as a third party component.
  3. Possibly yes.
  4. Yes, especially if there is a simple ClickOnce-like system.
  5. Some initial slowdown can be tolerated. Can point 3 help with that?
  6. Depends on assets. Hello world should have size on the order of MB.
  7. Doesn't matter if it's just for production builds.
  8. Like whitelisting reflection stuff? Yes.

TheBlueSky commented 6 years ago

@morganbr, do you think that these questions are better asked to a broader audience, i.e., broader than the people who know about this GitHub issue?

benaadams commented 6 years ago

For example, a possible design could be to essentially concatenate all of the files in a .NET Core self-contained application into a single file.

Are you looking at compressing it, or at using a compressed file system within the file?

morganbr commented 6 years ago

@tpetrina, thanks! Point 3 covers a couple of design angles:

  1. Tree shaking doesn't go well with loading plugins that the tree shaker hasn't seen since it could eliminate code the plugin relies on.
  2. CoreRT doesn't currently have a way to load plugins.

Point 5 is more about whether we'd optimize for size or startup time (and how much). Point 8: yes, I was mostly thinking about reflection stuff.

@TheBlueSky, we've contacted other folks as well, but it helps to get input from the passionate folks in the GitHub community.

@benaadams, compression is on the table, but I'm currently thinking of it as orthogonal to the overall design. Light experimentation suggests zipping may get about 50% size reduction at the cost of several seconds of startup time (and build time). To me, that's a radical enough trade-off that if we do it, it should be optional.

Suchiman commented 6 years ago

@morganbr several seconds of startup time when using compression? I find that hard to believe when considering that UPX claims decompression speeds of

~10 MB/sec on an ancient Pentium 133, ~200 MB/sec on an Athlon XP 2000+.

ayende commented 6 years ago

@morganbr, for me the answers are:

  1. Service (console app running Kestrel, basically). Running as a Windows Service / Linux daemon or in Docker.
  2. Yes.
  3. Yes, typically managed assemblies using AssemblyContext.LoadFrom. These are provided by the end user.
  4. Yes, that is expected. In fact, we already bundle the entire framework anyway, so no change from that perspective.
  5. As a service, we don't care that much about the startup time. 5 seconds would be reasonable.
  6. 75 MB is probably the limit. A lot depends on the actual compressed size, since all packages are delivered compressed.
  7. For release builds, longer (even much longer) build times are acceptable.
  8. Yes, absolutely. Size doesn't matter that much, but smaller is better.
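To make answer 3 concrete, the kind of end-user plugin loading we rely on today looks roughly like this (a minimal sketch; the IPlugin contract and the plugin directory are hypothetical placeholders for whatever the end user provides):

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Loader;

// Placeholder extension point -- whatever contract the host application defines.
public interface IPlugin
{
    void Initialize();
}

public static class PluginLoader
{
    public static void LoadUserPlugins(string pluginDirectory)
    {
        foreach (string path in Directory.EnumerateFiles(pluginDirectory, "*.dll"))
        {
            // Load the end user's managed assembly from an arbitrary path.
            Assembly assembly = AssemblyLoadContext.Default.LoadFromAssemblyPath(Path.GetFullPath(path));

            foreach (Type type in assembly.GetTypes())
            {
                if (typeof(IPlugin).IsAssignableFrom(type) && !type.IsAbstract)
                {
                    ((IPlugin)Activator.CreateInstance(type)!).Initialize();
                }
            }
        }
    }
}
```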

Something that I didn't see mentioned and is very important is the debuggability of this. I hope that this isn't going to mangle stack traces, and we would want to be able to include pdb files or some sort of debugging symbols.

ayende commented 6 years ago

About compression, take into account the fact that in nearly all cases, the actual delivery mechanism is already compressed. For example, nuget packages. Users are also pretty well versed in unzipping things, so that isn't much of an issue. I think you can do compression on the side.

morganbr commented 6 years ago

Thanks, @ayende! You're right that I should have called out debuggability. I think there are only a few minor ways debugging could be affected:

  1. It might not be possible to use Edit and Continue on a single-file (due to needing a way to rebuild and reload the original assembly)
  2. The single-file build might produce a PDB or some other files that are required for debugging beyond those that came with your assemblies.
  3. If CoreRT is used, it may have some debugging features that get filled in over time (especially on Linux/Mac).

When you say "include pdb files", do you want those inside the single file or just the ability to generate them and hang onto them in case you need to debug the single-file build?

ayende commented 6 years ago

  1. Not an issue for us. E&C is not relevant here since this is likely to be used only for actual deployment, not day to day.
  2. Ideally, we have a single file for everything, including the PDBs, not one file and a set of PDBs on the side. There is already the embedded PDB option; if that would work, it would be great.
  3. When talking about debugging, I'm talking more about production time rather than attaching a debugger live. More specifically, stack trace information including file & line numbers, being able to resolve symbols when reading a dump, etc.

bencyoung commented 6 years ago
  1. Mainly services but some UI
  2. Some do, but this wouldn't be urgent
  3. Yes
  4. Yes
  5. A few seconds is ok
  6. Doesn't matter to us. Sum of dll size is fine
  7. Ideally not
  8. Size is not of primary importance for us

Another question for us is whether you'd be able to do this for individual components too (perhaps even staged)? E.g. we have library dlls that use lots of dependencies. If we could package those, it would save a lot of the pain of version management, etc. If those could in turn be packaged into an exe, that would be even nicer.

Kosyne commented 6 years ago
  1. Services and some UI.
  2. Not at the moment.
  3. Yes. Ideally plugins that could be loaded from a folder and reloaded at runtime.
  4. Yes
  5. Not a problem so long as we aren't pushing 10-15+ seconds.
  6. Sum of DLL size, or similar.
  7. Yes. For a production build, time isn't really a problem so long as debug/testing builds build reasonably quickly.
  8. Depends, but the option would be handy.

expcat commented 6 years ago
  1. Service and UI.
  2. Sometimes.
  3. Yes, usually.
  4. Yes.
  5. It is best to be under 5 seconds.
  6. For the UI, under 5 seconds; for a Service it doesn't matter.
  7. The build time is not important; the optimization effect is what matters most.
  8. Yes.

MichalStrehovsky commented 6 years ago

@tpetrina @ayende @bencyoung @Kosyne @expcat you responded yes to question 3 ("Would your app load plugins or other external dlls that you didn't originally include in your app build?") - can you tell us more about your use case?

The main selling point of a single file distribution is that there is only one file to distribute. If your app has plugins in separate files, what value would you be getting from a single file distribution that has multiple files anyway? Why is "app.exe+plugin1.dll+plugin2.dll" better than "app.exe+coreclr.dll+clrjit.dll+...+plugin1.dll+plugin2.dll"?

ayende commented 6 years ago

app.exe + 300+ dlls, which is the current state today, is really awkward. app.exe + 1-5 dlls, which are usually defined by the users themselves, is much easier.

Our scenario is that we allow certain extensions by the user, so we would typically deploy only a single exe and the user may add additional functionality as needed.

It isn't so much that we plan to do that, but we want to be able to do that if the need arises.

bencyoung commented 6 years ago

@ayende Agreed, same with us.

Also, if we could do this at the dll level, then we could package dependencies inside our assemblies so they didn't conflict with client assemblies. I.e. by choosing a version of Newtonsoft.Json you are currently defining it for all programs, plugins and third-party assemblies in the same folder, but if you could embed it, then third parties would have more flexibility and version compatibility would improve.

expcat commented 6 years ago

Agree with @ayende .

morganbr commented 6 years ago

Thanks, everyone for your answers! Based on the number of folks who will either use native code or need to load plugins, we think the most compatible approach we can manage is the right place to start. To do that, we'll go with a "pack and extract" approach.

This will be tooling that essentially embeds all of the application and .NET's files as resources into an extractor executable. When the executable runs, it will extract all of those files into a temporary directory and then run as though the app were published as a non-single file application. It won't start out with compression, but we could potentially add it in the future if warranted.

The trickiest detail of this plan is where to extract files to. We need to account for several scenarios:

I think we can account for all of those by constructing a path that incorporates:

  1. A well-known base directory (e.g. %LOCALAPPDATA%\dotnetApps on Windows and user profile locations on other OSes)
  2. A separate subdirectory for elevated
  3. Application identity (maybe just the exe name)
  4. A version identifier. The version number is probably useful, but insufficient since it also needs to incorporate exact dependency versions. A per-build guid or hash might be appropriate.

Together, that might look something like c:\users\username\AppData\Local\dotnetApps\elevated\MyCoolApp\1.0.0.0_abc123\MyCoolApp.dll (Where the app is named MyCoolApp, its version number is 1.0.0.0 and its hash/guid is abc123 and it was launched elevated).
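To illustrate how those pieces might combine (a rough sketch only; the directory names, the hash source, and the exact layout are illustrative, not a committed design):

```csharp
using System;
using System.IO;

// Illustrative sketch of how the extraction path could be composed.
// The base directory name, "elevated" subdirectory, and hash format are placeholders.
internal static class ExtractionPath
{
    public static string For(string appName, string version, string buildHash, bool isElevated)
    {
        // 1. Well-known, per-user base directory
        string root = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
            "dotnetApps");

        // 2. Separate subdirectory for elevated runs
        if (isElevated)
            root = Path.Combine(root, "elevated");

        // 3. Application identity + 4. version identifier (version number plus a per-build hash/guid)
        return Path.Combine(root, appName, $"{version}_{buildHash}");
    }
}

// ExtractionPath.For("MyCoolApp", "1.0.0.0", "abc123", isElevated: true)
//   -> C:\Users\<user>\AppData\Local\dotnetApps\elevated\MyCoolApp\1.0.0.0_abc123
```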

There will also be work required to embed files into the extractor. On Windows, we can simply use native resources, but Linux and Mac may need custom work.

Finally, this may also need adjustments in the host (to find extracted files) and diagnostics (to find the DAC or other files).

CC @swaroop-sridhar @jeffschwMSFT @vitek-karas

Kosyne commented 6 years ago

I feel like this cure is worse than the disease. If we have to deal with external directories (different across OSes), updating, uninstalling, and the like, that flies in the face of my reason for desiring this feature in the first place (keeping everything simple, portable, self-contained and clean).

If it absolutely has to be this way, for my project, I'd much prefer a single main executable and the unpacked files to live in a directory alongside that executable, or possibly the ability to decide where that directory goes.

That's just me though, I'm curious to hear from others as well.

ChristianSauer commented 6 years ago

I have to agree here. Using a different directory can cause many exciting problems - e.g. you place a config file alongside the exe and that config is not picked up because the "real" directory is somewhere else. Disk space could be a problem too, as could random file locks due to access policies, and so on. I would like to use this feature, but not if it adds a host of failure modes which are impossible to detect up front.

strich commented 6 years ago

Agreed with @Kosyne - The proposed initial solution seems to simply automate an "installer" of sorts. If that were the limit of the problem we're trying to solve with a single exec, then I think we'd all have simply performed that automation ourselves.

The key goal of the single exec proposal should be to be able to run an executable on an unmanaged system. Who knows if it even has write access to any chosen destination "install" directory? It should certainly not leave artefacts of itself after launch either (not by default).

As a small modification to the existing proposal to satisfy the above: Could we not unpack into memory and run from there?

ayende commented 6 years ago

Agree with the rest of the comments. Unzipping to another location is something that is already available. We can have a self-extracting zip which will run the extracted files fairly easily. That doesn't answer a lot of the concerns that this is meant to address and is just another name for installation.

The location of the file is important. For example, in our case, that would mean:

One of our users needs to run our software from a DVD; how does that work on a system that may not actually have an HD to run on?

I agree that it would be better to do everything in memory. And the concern about startup time isn't that big; I would be fine paying this cost on every restart, or doing a manual step to alleviate it if needed.

Another issue here is the actual size. If this is just (effectively) an installer, that means that we are talking about file sizes in the hundreds of MB for a reasonable app, no?

GSPP commented 6 years ago

It seems that building the proposed solution does not require (many, if any) CLR changes. Users can already build a solution like that. There is no point in adding this to CoreCLR, especially since the use case for this is fairly narrow and specific.

ayende commented 6 years ago

@GSPP This seems like basically something that I can do today with 7z-Extra. I agree that if that is the case, it would be better not to have it at all.

lfr commented 6 years ago

Sorry I'm late to this party; I got here after following a link posted in a duplicate ticket that I was tracking. After reading the latest comments here, I'm sorry to see that you're considering packing and extracting. This seems like overkill; why not start with the ability to deploy SFAs for basic console apps? It seems to me that it should be possible to create a rudimentary console app with some network, IO, and some external dependencies (NuGet, etc.) that sits in a single file. I guess what I'm trying to say is that instead of gathering the requirements of everyone, gather the requirements of no one and instead start small with a first iteration that makes sense for everyone and yields results quickly.

jkotas commented 6 years ago

This seems like basically something that I can do today with 7z-Extra

You are right that a number of programming-environment-agnostic solutions to this problem exist today. Another example out of many: https://www.boxedapp.com/exe_bundle.html

The added value here would be integration into the dotnet tooling so that even non-expert users can do it easily. Expert users can do this today by stitching existing tools together as you have pointed out.

Personally, I agree with you that it is not clear that we are making the right choice here. We had a lot of discussion about this within the core team.

ayende commented 6 years ago

Putting an intern on it and coming out with a global tool (that is recommended, but not supported) would do just as well, and would be fairly easy to install.

Effectively, we are talking about dotnet publish-single-file, and that would do everything required behind the scenes.

I don't see anything that is actually required from the framework or the runtime to support this scenario, and by making this something that is explicitly outside the framework you are going to allow users a lot more freedom in how to modify it. No "need" to get a PR (with all the associated ceremony, backward compat, security, etc.) that you would need if you wanted to make a change to the framework. You just fork a common sideline project and use that.

Note that as much as I would like this feature, I would rather not have something in the box (which means that it is always going to be in the box) that can be done just as well from the outside.

swaroop-sridhar commented 6 years ago

I want to ask a higher level question: What is the main motivation for customers desiring single-file distribution? Is it primarily:

  1. Packaging? If so, regardless of whether the solution is inbox or a third-party tool, what characteristics are most important?
     a) Startup time (beyond the first run)
     b) Ability to run in non-writable environments
     c) Not leaving behind files after the run
     d) Not having to run an installer
  2. Performance?
     a) Speed (static linking of native code, avoiding multiple library loads, cross-module optimizations, etc.)
     b) Code size (need only one certificate, tree shaking, etc.)
  3. Any others?

strich commented 6 years ago

I'm a bit concerned that this feature is being read by some as not so important, or at least not a good goal to include in the core feature set here. I'd like to just reiterate that I think it would be immensely powerful for any application that acts as a service or sub-module of a larger application, one that you, the author, may not even be the developer of. I don't want to have to pass on dependency requirements that can only be resolved with installations (auto or otherwise), or post-execution artefacts on disk that might need additional security elevation by the user, etc. Right now, .NET just isn't a good choice for this niche problem. But it could be (and should be, IMO).

The compiled executable must:

  * Package all relevant dependencies
  * Have no execution artifacts (no disk writes)

Re @swaroop-sridhar: I daresay every application will have its own unique ordering of performance needs, so I'd imagine the best approach, after tackling the core solution, is to pick the low-hanging fruit and go from there.

ayende commented 6 years ago

@swaroop-sridhar For me, this is about packaging and ease of use for the end user. No need to deal with installation of system wide stuff, just click and run.

This is important because we allow our software to be embedded, and a single file addon is a lot easier to manage.

@strich The point about embedding is a good one. We are commonly used as a component in a micro service architecture, and reducing the deployment overhead will make that easier.

The problem isn't whether or not this is an important feature. The issue is whether the proposed solution (essentially zipping things, at this point) is required to be in the core. Stuff that is in the core has a much higher standard for changes. Having this as an external tool would be better, because that is easier to modify and extend.

To be honest, I would much rather see a better option altogether. For example, it is possible to load dlls from memory instead of from files (it requires some work, but it's possible). If that happens, you can run the entire thing purely from memory, with unpacking done to pure memory and no disk hits.

That is something that should go in the core, because it will very likely require modifications to the runtime to enable. And that would be valuable in and of itself. And it's not something that we can currently do.
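For the managed half, something along these lines is already possible today (a minimal sketch; the hard part is doing the equivalent for native libraries and the host itself without touching disk):

```csharp
using System.IO;
using System.Reflection;
using System.Runtime.Loader;

internal static class InMemoryAssemblyLoader
{
    // Sketch: load a managed assembly from an in-memory buffer,
    // e.g. a section of a single-file bundle that was read into memory.
    public static Assembly Load(byte[] rawAssembly)
    {
        using var stream = new MemoryStream(rawAssembly, writable: false);
        return AssemblyLoadContext.Default.LoadFromStream(stream);
    }
}
```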

bencyoung commented 5 years ago

So a good example is to look at the experience of Go tools like Hashicorp's Consul. A single exe file that you can drop onto any machine and run. No installers, no copying folders around, no hunting for config files in lists of hundreds of files; just a really nice end-user experience.

For us I'm not sure the in-memory approach would work, as we'd also like this to work for plugin dlls (so dropping in a plugin would also be a single file rather than the plugin plus all its dependencies), but any progress would be good. We've looked at Fody.Costura and that works well for some stuff, but we've had issues using it with .NET Core.

mikedn commented 5 years ago

So a good example is to look at the experience of Go tools like Hashicorp's Consul

Eh, for tools like consul the ideal solution would be corert, not this self-extract improvisation.

bencyoung commented 5 years ago

@mikedn Why's that? I don't see what AOT or JIT compilation has to do with deployment method?

lfr commented 5 years ago

I want to second @strich's words: single file deployments would be a breath of fresh air for microservice architecture deployments, as well as for any console app that is — or at least starts its life as — a small tool with command line switches.

mikedn commented 5 years ago

Why's that? I don't see what AOT or JIT compilation has to do with deployment method?

Because it gives you exactly what you want - a single exe file that you can drop onto any machine (well, any machine with a suitable OS; it's a native file, after all) and run. It also tends to use less resources, which for agents like consul is a good thing. It is pretty much the equivalent of what Go gives you, more so than a self-extract solution.

bencyoung commented 5 years ago

@mikedn I guess, but

  1. It doesn't really exist yet in a production form (as far as I know)!
  2. We use a lot of dynamic features (IL generation, arbitrary reflection).
  3. We still want to be able to add plugins (again, ideally compacted).

Seeing as this issue was about asking people what they want, we're only giving our opinion! We don't really want to have to switch to a different runtime model just to get this benefit. To me they're orthogonal concerns.

MichalStrehovsky commented 5 years ago

I don't see what AOT or JIT compilation has to do with deployment method?

Without a JIT, it's easier to get things like the debugging story good enough. The JIT part makes the problem harder which is why you won't find it in Go. This is engineering, so you either throw more engineers at the harder problem and live with the new complexity, or scope it down elsewhere. The self-extractor is about scoping it down because the number of engineers with the necessary skills is limited.

People with projects that are more like Go projects (no JIT requirements) might be pretty happy with CoreRT, if they're fine with the "experimental" label on it. It's pretty easy to try these days. It uses the same garbage collector, code generator, CoreFX, and most of CoreLib as the full CoreCLR, and produces small-ish executables (single digit megabytes) that are self-contained.

mikedn commented 5 years ago

It doesn't really exist yet in a production form (as far as I know)!

Yes, my comment was mostly targeted at MS people :grin:. They have all these parallel, related, ongoing projects/ideas (corert, illinker) and now they add one more, this self-extract thing that, as many already pointed out, is a bit of a "meh, we can do that ourselves" kind of thing. And it comes with downsides as well, such as extracting files to a "hidden" directory.

We use a lot of dynamic features (IL generation, arbitrary reflection)

That's something that the community as a whole might want to give a second thought. Sure, it's useful to be able to do that, but it kind of conflicts with other desires like single file deployment. You can still get single file deployment, but that tends to come at a cost - a rather large file. If you're OK with that, then that's perfectly fine. But in my experience, the larger the file gets, the less useful the "single file" aspect becomes.

bencyoung commented 5 years ago

@MichalStrehovsky sure, there are different options. However, for us we can't use anything experimental (convincing people it was time to move to .NET Core was hard enough), and I don't think extracting to a temp folder will work in our case either. However, worst case is we carry on as we are and don't use this feature.

It is something we would like though if it went the way we'd want it to :)

bencyoung commented 5 years ago

@mikedn I agree. Multiple parallel solutions are even more confusing. I think our ideal solution would be some kind of super ILLinker/weaver approach but I'm happy to let this play out and see where we end up.

ericsampson commented 5 years ago

I'm really, really excited about this functionality landing, but TBH I'm equally unexcited about the initial proposal that you posted, @morganbr. My answers to the list of questions you posed are similar to what others posted (so I think there is a common desired set of capabilities), but IMHO the proposed 'unpack to disk' solution is not at all what I'd hoped to see implemented and, as others said, would almost be worse than the 'disease'. @jkotas

I agree with @strich and @ayende: the compiled executable must package all relevant dependencies and have no execution artifacts (no disk writes).

Loading .dlls from memory instead of from disk may not be easy, but that's the kind of capability that IMO would be worth the deep expertise of MSFT low-level devs (and then leveraging in CoreCLR), versus the above proposal, which could just be implemented as an external tool (and already has been; see https://github.com/dgiagio/warp). If this were achieved, I wonder how much time difference there would be between first and subsequent runs? For inspiration/example, I think Linux 3.17+ has memfd_create, which can be used with dlopen.
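For illustration, the memfd_create route might look roughly like this from managed code (a Linux-only sketch under the assumption that writing the library into an anonymous file descriptor and loading it via /proc/self/fd works for the library in question; error handling omitted):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

internal static class MemfdLoader
{
    // memfd_create(2) is available on Linux 3.17+.
    [DllImport("libc", SetLastError = true)]
    private static extern int memfd_create(string name, uint flags);

    // Sketch: place a native library into an anonymous, memory-backed file
    // descriptor and load it without writing anything to disk.
    public static IntPtr Load(byte[] sharedObject)
    {
        int fd = memfd_create("bundled.so", 0);
        if (fd < 0)
            throw new IOException("memfd_create failed");

        using (var stream = new FileStream(new SafeFileHandle((IntPtr)fd, ownsHandle: false), FileAccess.Write))
            stream.Write(sharedObject, 0, sharedObject.Length);

        // NativeLibrary.Load uses dlopen on Linux; /proc/self/fd/<fd> refers to the in-memory file.
        return NativeLibrary.Load($"/proc/self/fd/{fd}");
    }
}
```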

On another topic, I'm wondering if the requirement to support plugins is over-indexing the design proposal. Would it be worth making this functionality opt-in, so that only the people who need this capability incur the potential penalties (lack of tree shaking, etc.) and everyone else (the significant majority?) would get reduced deployable size and perf benefits?

Stepping back, @swaroop-sridhar @MichalStrehovsky, I can see two broad use cases that might have different-enough goals/desires to make it hard to accommodate everyone with one solution:

I hope this braindump makes some sense; I'm not trying to be a jerk, just to provide feedback, because I'm very interested in this topic. Thanks for all your hard work, and for soliciting community input! :)

ayende commented 5 years ago

About plugins. Basically, the only thing that I would like is not being blocked on Assembly.LoadFrom or LoadLibrary calls. I don't need anything else and can do the rest on my own.

vitek-karas commented 5 years ago

@ayende Can you please explain in a bit more detail what you mean by "being blocked on LoadFrom and such"?

ayende commented 5 years ago

For example, some of the suggestions for this included CoreRT, which meant that we (probably) wouldn't be able to just load a managed dll. But as long as I can provide a path to a managed dll and get an assembly or call LoadLibrary on a native dll, I'm fine with this being the plugin mechanism.

I'm saying this to make it clear that plugin scenarios are not something that needs to be designed for explicitly, but rather something that should not be blocked.

ayende commented 5 years ago

I spent some time digging into the code and at first glance, it seems like it should be possible to (speaking about Windows only to make things simple):

I'm going to assume that this is not as simple as that. For example, ICLRRuntimeHost2::ExecuteAssembly doesn't provide any way to give it a buffer, only a file on disk. That makes it the first (of what I'm sure will be many) show stoppers to actually getting this working.

I'm pretty sure that there is a lot of code that refers to related artifacts as files on disk and may fail, but this is the kind of thing I mean when I say that I want a single-file exe, and why that kind of solution needs to be done in CoreCLR and not externally (as in the zip example).