Out of interest, how does this initiative compare to CoreRT? They seem like similar efforts?
Is it related to 'possibly native user code', i.e. will this still allow code to be JIT-compiled, not just AOT?
Also, I assume that the runtime components ('Native code (runtime, host, native portions of the framework..') will be the ones from the CoreCLR repo?
You're asking great questions, but since this is still early in design, I don't have great answers yet.
Out of interest, how does this initiative compare to CoreRT? They seem like similar efforts?
There would likely be somewhat similar outcomes (a single file), but the design may have different performance characteristics or features that do/don't work. For example, a possible design could be to essentially concatenate all of the files in a .NET Core self-contained application into a single file. That's 10s of MB and might start more slowly, but on the other hand, it would allow the full capabilities of CoreCLR, including loading plugins, reflection emit and advanced diagnostics. CoreRT could be considered the other end of the spectrum -- it's single-digit MB and has a very fast startup time, but by not having a JIT, it can't load plugins or use reflection emit and build time is slower than most .NET devs are used to. It currently has a few other limitations that could get better over time, but might not be better by .NET Core 3.0 (possibly requiring annotations for reflection, missing some interop scenarios, limited diagnostics on Linux). There are also ideas somewhere between the two. If folks have tradeoffs they'd like to make/avoid, we'd be curious to hear about them.
Is it related to 'possibly native user code', i.e. will this still allow code to be JIT-compiled, not just AOT?
By "native user code," I meant that your app might have some C++ native code (either written by you or a 3rd-party component). There might be limits on what we can do with that code -- if it's compiled into a .dll, the only way to run it is off of disk; if it's a .lib, it might be possible to link it in, but that brings in other complications.
Also, I assume that the runtime components ('Native code (runtime, host, native portions of the framework..') will be the ones from the CoreCLR repo?
Based on everything above, we'll figure out which repos are involved. "Native portions of the framework" would include CoreFX native files like ClrCompression and the Unix PAL.
A single file distribution in this manner, even if it has slightly slower startup time, can be invaluable for ease of deployment. I would much rather have the ability to have the full power than be forced to give up some of that.
Some scenarios that are of interest to us. How would this work in terms of cross platform? I assume we'll have a separate "file" per platform?
With regards to native code, how would I be able to choose different native components based on the platform?
Some scenarios that are of interest to us. How would this work in terms of cross platform? I assume we'll have a separate "file" per platform? With regards to native code, how would I be able to choose different native components based on the platform?
@ayende, I'm quoting from @morganbr comment:
a possible design could be to essentially concatenate all of the files in a .NET Core self-contained application into a single file.
The current cross-platform story for self-contained applications is creating a deployment package per platform that you'd like to target, because you ship the application with the runtime, which is platform-specific.
@morganbr I appreciate you taking the time to provide such a detailed answer.
I'll be interested to see where the design goes; this is a really interesting initiative.
I have a few questions for folks who'd like to use single-file. Your answers will help us narrow our options:
@morganbr, do you think that these questions are better asked to a broader audience; i.e., broader than the people who know about this GitHub issue?
For example, a possible design could be to essentially concatenate all of the files in a .NET Core self-contained application into a single file.
Are you looking at compressing it, or at using a compressed file system inside the file?
@tpetrina, thanks! Point 3 covers a couple of design angles:
@TheBlueSky, we've contacted other folks as well, but it helps to get input from the passionate folks in the GitHub community.
@benaadams, compression is on the table, but I'm currently thinking of it as orthogonal to the overall design. Light experimentation suggests zipping may get about 50% size reduction at the cost of several seconds of startup time (and build time). To me, that's a radical enough trade-off that if we do it, it should be optional.
@morganbr several seconds of startup time when using compression? I find that hard to believe when considering that UPX claims decompression speeds of
~10 MB/sec on an ancient Pentium 133, ~200 MB/sec on an Athlon XP 2000+.
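(For rough scale: at the claimed ~200 MB/s, a 75 MB payload would decompress in well under a second, so the multi-second figure above presumably includes more than raw decompression, such as disk writes or a slower codec.)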
@morganbr, for me the answers are:
1) Service (console app running Kestrel, basically). Running as Windows Service / Linux Daemon or in docker.
2) Yes
3) Yes, typically managed assemblies using AssemblyContext.LoadFrom. These are provided by the end user.
4) Yes, that is expected. In fact, we already bundle the entire framework anyway, so no change from that perspective.
5) As a service, we don't care that much for the startup time. 5 seconds would be reasonable.
6) 75MB is probably the limit. A lot depends on the actual compressed size, since all packages are delivered compressed.
7) For release builds, longer (even much longer) build times are acceptable.
8) Yes, absolutely. Size doesn't matter that much, but smaller is better.
Something that I didn't see mentioned and is very important is the debuggability of this.
I hope that this isn't going to mangle stack traces, and we would want to be able to include pdb
files or some sort of debugging symbols.
About compression, take into account the fact that in nearly all cases, the actual delivery mechanism is already compressed. For example, nuget packages. Users are also pretty well versed in unzipping things, so that isn't much of an issue. I think you can do compression on the side.
Thanks, @ayende! You're right that I should have called out debuggability. I think there are only a few minor ways debugging could be affected:
When you say "include pdb files", do you want those inside the single file or just the ability to generate them and hang onto them in case you need to debug the single-file build?
1) Not an issue for us. E&C is not relevant here, since this is likely to be used only for actual deployment, not day to day.
2) Ideally, we have a single file for everything, including the PDBs, not one file and a set of pdbs on the side. There is already the embedded PDB option; if that would work, it would be great.
3) When talking about debug, I'm talking more about production time rather than attaching a debugger live. More specifically: stack trace information including file & line numbers, being able to resolve symbols when reading a dump, etc.
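(For reference, the embedded option mentioned in 2) is the DebugType=embedded project setting, which stores the portable PDB inside the managed assembly itself, so file and line numbers in stack traces survive without shipping separate .pdb files.)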
Another question for us is whether you'd be able to do this for individual components too (perhaps even staged)? E.g. we have library dlls that use lots of dependencies. If we could package those, it would save a lot of the pain of version management etc. If these in turn could be packaged into an exe, that would be even nicer.
@tpetrina @ayende @bencyoung @Kosyne @expcat you responded yes to question 3 ("Would your app load plugins or other external dlls that you didn't originally include in your app build?") - can you tell us more about your use case?
The main selling point of a single file distribution is that there is only one file to distribute. If your app has plugins in separate files, what value would you be getting from a single file distribution that has multiple files anyway? Why is "app.exe+plugin1.dll+plugin2.dll" better than "app.exe+coreclr.dll+clrjit.dll+...+plugin1.dll+plugin2.dll"?
app.exe + 300+ dlls, which is the current state today, is really awkward.
app.exe + 1-5 dlls, which are usually defined by the users themselves, is much easier.
Our scenario is that we allow certain extensions by the user, so we would typically deploy only a single exe, and the user may add additional functionality as needed.
It isn't so much that we plan to do that, but we want to be able to do that if the need arises.
@ayende Agreed, same with us.
Also, if we could do this at the dll level, then we could package dependencies inside our assemblies so they don't conflict with client assemblies. I.e., by choosing a version of NewtonSoft.Json you are currently defining it for all programs, plugins, and third-party assemblies in the same folder; but if you could embed it, then third parties have flexibility and version compatibility increases.
Agree with @ayende .
Thanks, everyone for your answers! Based on the number of folks who will either use native code or need to load plugins, we think the most compatible approach we can manage is the right place to start. To do that, we'll go with a "pack and extract" approach.
This will be tooling that essentially embeds all of the application and .NET's files as resources into an extractor executable. When the executable runs, it will extract all of those files into a temporary directory and then run as though the app were published as a non-single file application. It won't start out with compression, but we could potentially add it in the future if warranted.
The trickiest detail of this plan is where to extract files to. We need to account for several scenarios:
I think we can account for all of those by constructing a path that incorporates a per-user base location, whether the app was launched elevated, the app name, its version, and a hash/guid of the app.
Together, that might look something like c:\users\username\AppData\Local\dotnetApps\elevated\MyCoolApp\1.0.0.0_abc123\MyCoolApp.dll (Where the app is named MyCoolApp, its version number is 1.0.0.0 and its hash/guid is abc123 and it was launched elevated).
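To make the scheme concrete, here is a rough, Windows-only sketch of how such a path could be computed. GetExtractionDir and the "base"/"elevated" directory names are hypothetical illustrations, not the actual host implementation:

```csharp
using System;
using System.IO;
using System.Security.Principal;

static class Extraction
{
    // Hypothetical sketch of the path scheme described above;
    // the real host logic may differ.
    public static string GetExtractionDir(string appName, Version version, string bundleHash)
    {
        // Per-user base directory (%LOCALAPPDATA% on Windows).
        string baseDir = Environment.GetFolderPath(
            Environment.SpecialFolder.LocalApplicationData);

        // Keep elevated and non-elevated runs apart, so an elevated launch
        // never executes files written by an unprivileged one.
        bool elevated;
        using (var identity = WindowsIdentity.GetCurrent())
            elevated = new WindowsPrincipal(identity)
                .IsInRole(WindowsBuiltInRole.Administrator);

        return Path.Combine(baseDir, "dotnetApps",
                            elevated ? "elevated" : "base",
                            appName, $"{version}_{bundleHash}");
    }
}
```

Calling GetExtractionDir("MyCoolApp", new Version("1.0.0.0"), "abc123") from an elevated process would then produce a directory like the example path above.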
There will also be work required to embed files into the extractor. On Windows, we can simply use native resources, but Linux and Mac may need custom work.
Finally, this may also need adjustments in the host (to find extracted files) and diagnostics (to find the DAC or other files).
CC @swaroop-sridhar @jeffschwMSFT @vitek-karas
I feel like this cure is worse than the disease. If we have to deal with external directories (different across OS's), updating, uninstalling and the like, that flies in the face of my reason for desiring this feature in the first place (keeping everything simple, portable, self contained and clean).
If it absolutely has to be this way, for my project, I'd much prefer a single main executable and the unpacked files to live in a directory alongside that executable, or possibly the ability to decide where that directory goes.
That's just me though, I'm curious to hear from others as well.
I have to agree here; using a different directory can cause many exciting problems - e.g. you place a config file alongside the exe, and this config is not picked up because the "real" directory is somewhere else. Disk space could be a problem too, as could random file locks due to access policies, etc. I would like to use this feature, but not if it adds a host of failure modes which are impossible to detect beforehand.
Agreed with @Kosyne - The proposed initial solution seems to simply automate an "installer" of sorts. If that was the limit of the problem we're trying to solve with a single exec then I think we'd have all simply performed that automation ourselves.
The key goal of the single exec proposal should be to be able to run an executable on an unmanaged system. Who knows if it even has write access to any chosen destination "install" directory? It should certainly not leave artefacts of itself after launch either (not by default).
As a small modification to the existing proposal to satisfy the above: Could we not unpack into memory and run from there?
Agree with the rest of the comments. Unzipping to another location is something that is already available; we can have a self-extracting zip which will run the extracted files fairly easily. That doesn't answer a lot of the concerns that this is meant to answer and is just another name for installation.
The location of the file is important. For example, in our case, that would mean:
One of our users needs to run our software from a DVD; how does that work on a system that may not actually have a hard disk to run on?
I agree that it would be better to do everything in memory. And the concern about the startup time isn't that big, I would be fine paying this for every restart, or manually doing a step to alleviate that if needed.
Another issue here is the actual size. If this is just (effectively) an installer, that means that we are talking about file sizes for a reasonable app in the 100s of MB, no?
It seems that building the proposed solution does not require (many, if any) CLR changes. Users can already build a solution like that. There is no point in adding this to CoreCLR, especially since the use case for this is fairly narrow and specific.
@GSPP This seems like basically something that I can do today with 7z-Extra. I agree that if this is the case, it would be better to not have it at all.
Sorry I'm late to this party, I got here after following a link posted in a duplicate ticket that I was tracking. After reading the latest comments here, I'm sorry to see that you're considering packing and extracting. This seems like overkill, why not start with the ability to deploy SFAs for basic console apps? It seems to me that it should be possible to create a rudimentary console app with some network, IO, and some external dependencies (nuget, etc) that sits in a single file. I guess what I'm trying to say is that instead of gathering the requirements of everyone, gather the requirements of no one and instead start small with a first iteration that makes sense for everyone and yields results quickly.
This seems like basically something that I can do today with 7z-Extra
You are right that a number of programming-environment-agnostic solutions to address this problem exist today. Another example out of many: https://www.boxedapp.com/exe_bundle.html
The added value here would be integration into the dotnet tooling so that even non-expert users can do it easily. Expert users can do this today by stitching existing tools together as you have pointed out.
Personally, I agree with you that it is not clear that we are making the right choice here. We had a lot of discussion about this within the core team.
Putting an intern on it and coming out with a global tool (that is recommended, but not supported) would do just as well, and can be fairly easy to install as well.
Effectively, we are talking about dotnet publish-single-file, and that would do everything required behind the scenes.
I don't see anything that is actually required by the framework or the runtime to support this scenario, and by making this something that is explicitly outside the framework you are going to allow users a lot more freedom in how to modify this. No "need" to go through a PR (with all the associated ceremony, backward compat, security, etc.) as you would if you wanted to make a change to the framework. You just fork a common sideline project and use that.
Note that as much as I would like this feature, I would rather not have something in (which means that it is always going to be in) that can be done just as well from the outside.
I want to ask a higher level question: What is the main motivation for customers desiring single-file distribution? Is it primarily:
I'm a bit concerned that this feature is being read by some as not so important, or at least might not be a good goal to include in the core featureset here. I'd like to just reiterate that I think it would be immensely powerful for any application that acts as a service or sub-module of a larger application. One that you, the author, may not even be the developer of. I don't want to have to pass on dependency requirements that can only be resolved with installations (auto or otherwise), or post-execution artefacts on disk that might need additional security elevation by the user, etc. Right now, .NET just isn't a good choice for this niche problem. But it could be (And should be IMO).
The compiled executable must:
1) Package all relevant dependencies
2) Have no execution artifacts (no disk writes)
Re @swaroop-sridhar: I daresay every application will have its own unique ranking of performance needs, so I'd imagine the best approach, after tackling the core solution, is to pick the low-hanging fruit and go from there.
@swaroop-sridhar For me, this is about packaging and ease of use for the end user. No need to deal with installation of system wide stuff, just click and run.
This is important because we allow our software to be embedded, and a single file addon is a lot easier to manage.
@strich The point about embedding is a good one. We are commonly used as a component in a micro service architecture, and reducing the deployment overhead will make that easier.
The problem isn't whether or not this is an important feature. The issue is whether the proposed solution (essentially zipping things, at this point) is required to be in the core. Stuff that is in the core has a much higher standard for changes. Having this as an external tool would be better, because that is easier to modify and extend.
To be honest, I would much rather see a better option altogether. For example, it is possible to load dlls from memory instead of from files (requires some work, but possible). If that happens, you can run the entire thing purely from memory, with unpacking done to pure memory and no disk hits.
That is something that should go in the core, because it will very likely require modifications to the runtime to enable it. And that would be valuable in and of itself, and not something that we can currently do.
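For what it's worth, the managed half of this building block exists in the runtime today: AssemblyLoadContext can load an assembly from a stream with no file on disk. A minimal sketch (native dlls and the runtime itself are the unsolved part):

```csharp
using System.IO;
using System.Reflection;
using System.Runtime.Loader;

class InMemoryLoadContext : AssemblyLoadContext
{
    // Load a managed assembly straight from a byte buffer --
    // no extraction to disk required.
    public Assembly LoadFromBytes(byte[] assemblyBytes)
    {
        using (var stream = new MemoryStream(assemblyBytes))
        {
            return LoadFromStream(stream);
        }
    }
}
```

One caveat: assemblies loaded this way report an empty Assembly.Location, which is exactly the kind of "code that refers to related stuff as files" problem mentioned later in this thread.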
So a good example is to look at the experience of Go tools like Hashicorp's Consul. A single exe file that you can drop onto any machine and run. No installers, no copying folders around, no hunting for config files in lists of hundreds of files; just a really nice end-user experience.
For us I'm not sure the in-memory approach would work, as we'd also like this to work for plugin dlls (so dropping in a plugin would also mean a single file rather than the plugin plus all its dependencies), but any progress would be good. We've looked at Fody.Costura and that works well for some stuff, but we've had issues with .NET Core for that.
So a good example is to look at the experience of Go tools like Hashicorp's Consul
Eh, for tools like consul the ideal solution would be corert, not this self-extract improvisation.
@mikedn Why's that? I don't see what AOT or JIT compilation has to do with deployment method?
I want to second @strich's words, single file deployments would be a breath of fresh air for microservice architecture deployments, as well as for any console app that is — or at least starts its life as — a small tool with command line switches.
Why's that? I don't see what AOT or JIT compilation has to do with deployment method?
Because it gives you exactly what you want - a single exe file that you can drop onto any machine (well, any machine having a suitable OS, it's a native file after all) and run. It also tends to use less resources, which for agents like consul is a good thing. It is pretty much the equivalent of what Go gives you, more than a self extract solution.
@mikedn I guess, but:
1) It doesn't really exist yet in a production form (as far as I know)
2) We use a lot of dynamic features (IL generation, arbitrary reflection)
3) We still want to be able to add plugins (again, ideally compacted)
Seeing as this issue was about asking people what they want, we're only giving our opinion! We don't really want to have to switch to a different runtime model just to get this benefit. To me they're orthogonal concerns.
I don't see what AOT or JIT compilation has to do with deployment method?
Without a JIT, it's easier to get things like the debugging story good enough. The JIT part makes the problem harder which is why you won't find it in Go. This is engineering, so you either throw more engineers at the harder problem and live with the new complexity, or scope it down elsewhere. The self-extractor is about scoping it down because the number of engineers with the necessary skills is limited.
People with projects that are more like Go projects (no JIT requirements) might be pretty happy with CoreRT, if they're fine with the "experimental" label on it. It's pretty easy to try these days. It uses the same garbage collector, code generator, CoreFX, and most of CoreLib as the full CoreCLR, and produces small-ish executables (single digit megabytes) that are self-contained.
It doesn't really exist yet in a production form (as far as I know)!
Yes, my comment was mostly targeted at MS people :grin:. They have all these parallel, related, on-going projects/ideas (corert, illinker) and now they add one more, this self extract thing that, as many already pointed out, it's a bit of a "meh, we can do that ourselves" kind of thing. And it comes with downsides as well, such as extracting files to a "hidden" directory.
We use a lot of dynamic features (IL generation, arbitrary reflection)
That's something that the community as a whole might want to give a second thought. Sure, it's useful to be able to do that, but it kind of conflicts with other desires like single-file deployment. You can still get single-file deployment, but that tends to come at a cost - a rather large file. If you're OK with that, then that's perfectly fine. But in my experience, the larger the file gets, the less useful the "single file" aspect becomes.
@MichalStrehovsky sure, there are different options. However for us we can't use experimental (convincing it was time to move to .NET Core was hard enough) and I don't think extracting to a temp folder will work in our case either. However worst case is we carry on as we are and don't use this feature.
It is something we would like though if it went the way we'd want it to :)
@mikedn I agree. Multiple parallel solutions are even more confusing. I think our ideal solution would be some kind of super ILLinker/weaver approach but I'm happy to let this play out and see where we end up.
I'm really really excited about this functionality landing, but TBH I'm equally unexcited about the initial proposal that you posted @morganbr. My answers to the list of questions you posed is similar to what others posted (so I think there is a common desired set of capabilities), but IMHO the proposed 'unpack to disk' solution is not at all what I'd hope to see implemented and as others said would almost be worse than the 'disease'. @jkotas
I agree with @strich and @ayende. The compiled executable must:
1) Package all relevant dependencies
2) Have no execution artifacts (no disk writes)
Loading .dlls from memory instead of from disk may not be easy, but that's the kind of capability that IMO would be worth the deep expertise of MSFT low-level devs (and then leveraging in CoreCLR), vs the above proposal, which could just be implemented as an external tool (and already has been; see https://github.com/dgiagio/warp). If this was achieved, I wonder how much time difference there would be between first and subsequent runs? For inspiration/example, I think Linux 3.17+ has memfd_create, which can be used by dlopen.
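For the curious, that memfd_create route might look roughly like this from C#. This is an untested sketch: it assumes glibc exposes memfd_create (glibc 2.27+), and that NativeLibrary.Load (available since .NET Core 3.0) goes through dlopen on Linux, so the /proc/self/fd path trick applies:

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class MemFd
{
    // memfd_create(2), exposed by glibc 2.27+ (Linux 3.17+ kernel).
    [DllImport("libc", SetLastError = true)]
    private static extern int memfd_create(string name, uint flags);

    // Write a native library image into an anonymous in-memory file,
    // then dlopen it via its /proc/self/fd/ path -- no disk writes.
    public static IntPtr LoadNativeLibraryFromMemory(byte[] image)
    {
        int fd = memfd_create("inmem-lib", 0);
        if (fd < 0)
            throw new IOException("memfd_create failed");

        // ownsHandle: false -- the fd must stay open while we dlopen it.
        using (var fs = new FileStream(
            new SafeFileHandle((IntPtr)fd, ownsHandle: false), FileAccess.Write))
        {
            fs.Write(image, 0, image.Length);
        }

        return NativeLibrary.Load($"/proc/self/fd/{fd}");
    }
}
```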
On another topic, I'm wondering if the requirement to support plugins is over-indexing the design proposal. Would it be worth making this functionality opt-in, so that only the people who need this capability incur the potential penalties (lack of tree shaking, etc.) and everyone else (the significant majority?) gets reduced deployable size and perf benefits?
Stepping back, @swaroop-sridhar @MichalStrehovsky, I can see two broad use cases that might have different-enough goals/desires to make it hard to accommodate everyone with one solution:
I hope this braindump makes some sense, I'm not trying to be a jerk, but just provide feedback because I'm very interested in this topic. Thanks for all your hard work, and soliciting community input! :)
About plugins: basically, the only thing that I would like is not being blocked on Assembly.LoadFrom or LoadLibrary calls. I don't need anything else and can do the rest on my own.
@ayende Can you please explain in a bit more detail what you mean by "being blocked on LoadFrom and such"?
For example, some of the suggestions for this included CoreRT, which meant that we (probably) wouldn't be able to just load a managed dll.
But as long as I can provide a path to a managed dll and get an assembly, or call LoadLibrary on a native dll, I'm fine with this being the plugin mechanism.
I'm saying this to make it clear that plugin scenarios are not something that needs special consideration, but rather something that should not be blocked.
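In other words, "not blocked" seems to reduce to the two existing entry points continuing to work from inside a single-file app. A sketch of what must keep functioning (NativeLibrary is .NET Core 3.0+):

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Runtime.Loader;

static class PluginHost
{
    // Managed plugin: load from a user-supplied path
    // (LoadFromAssemblyPath requires an absolute path).
    public static Assembly LoadManagedPlugin(string absolutePath) =>
        AssemblyLoadContext.Default.LoadFromAssemblyPath(absolutePath);

    // Native plugin: the LoadLibrary/dlopen equivalent.
    public static IntPtr LoadNativePlugin(string path) =>
        NativeLibrary.Load(path);
}
```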
I spent some time digging into the code, and at first glance it seems like it should be possible (speaking about Windows only to make things simple) to embed the dlls and use MemoryModule (https://github.com/fancycode/MemoryModule) to load them into memory.
I'm going to assume that this is not as simple as that. For example, ICLRRuntimeHost2::ExecuteAssembly doesn't provide any way to give it a buffer, only a file on disk.
That makes it the first (of what I'm sure will be many) show stoppers to actually getting it working.
I'm pretty sure that there is a lot of code that refers to related stuff as files and may fail, but this is the kind of thing I mean when I say that I want a single-file exec, and why that kind of solution needs to be done in CoreCLR and not externally (as in the zip example).
This issue tracks progress on the .NET Core 3.0 single-file distribution feature. Here's the design doc and staging plan for the feature.