dotnet / core

.NET news, announcements, release notes, and more!
https://dot.net

Reduce time for serverless cold-start of .Net/C# code #1060

Closed · James2397 closed this issue 6 years ago

James2397 commented 6 years ago

Reduce time for serverless cold-start of .Net/C# code

At the moment .NET Core is one of the slowest-loading languages you can use for serverless development in Azure Functions or AWS Lambda. Each time a new function written in .NET Core starts, there is a significant "cold-start" delay, which means users wait noticeably longer than they would with other languages. With serverless becoming a core architectural approach for many companies, it would make sense to look into this problem now. If the cold start of .NET Core programs could be reduced to low milliseconds, it would significantly increase the appeal of the language for serverless development.

The second part of this problem is that the artifact generated by a dotnet core publish is large compared to other languages, which also contributes to slow cold-start times. It would be good if the IL Linker could be brought into the normal publishing process to reduce DLL sizes (https://github.com/dotnet/core/blob/master/samples/linker-instructions.md). If namespaces or classes are not being used, they should be removed from the published DLLs, or the DLLs should be combined into a single binary that contains only the used classes/methods/properties. Regardless, just make it as small and as fast to load as possible so .NET works the best of all languages for serverless.
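As a rough sketch of what bringing the linker into the normal publish could look like: the SDK later exposed this directly as a trimming switch on publish (PublishTrimmed, .NET Core 3.0+ SDKs); at the time of this issue, the preview ILLink.Tasks package from the linked instructions provided the same trimming step. The runtime identifier below is illustrative.

```sh
# Self-contained publish with IL trimming: assemblies the app never references
# are dropped from the output. PublishTrimmed is the .NET Core 3.0+ SDK switch;
# the preview ILLink.Tasks package linked above did the same job earlier.
dotnet publish -c Release -r linux-x64 -p:PublishTrimmed=true
```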

Petermarcu commented 6 years ago

@russellhadley @valenis

Petermarcu commented 6 years ago

Can you share any numbers you have for cold start times? There are things that can be done to make cold start better, and some of it depends on how AWS has built and configured the runtime to work.

James2397 commented 6 years ago

Here is a comparison by a tech blogger and it summarizes what we have been experiencing: http://theburningmonk.com/2017/06/aws-lambda-compare-coldstart-time-with-different-languages-memory-and-code-sizes/

James2397 commented 6 years ago

The blog article below also shows that .NET Core serverless code is significantly slower than all other languages, and that the main language it is most similar to - Java - is actually the quickest for cold starts:

https://read.acloud.guru/comparing-aws-lambda-performance-when-using-node-js-java-c-or-python-281bef2c740f Summary: .NET Core was over five times slower than Java in those tests for cold-start serverless invocation.

Hopefully this can be looked at soon, because it is bad for our customers when we use .NET Core and this happens regularly. Java being the fastest shows that it should be possible for a language like C# to achieve similarly positive results.

Petermarcu commented 6 years ago

I noticed that article is using .NET Core 1.0. I know there have been significant investments in both startup and throughput since, particularly on Linux. I think it would be useful to do the exercise again with the latest bits to see where things stand. Overall, I think this is a good scenario to try to tackle and improve.

On the comment about the size of a published application: how large is the published size of your serverless application?

James2397 commented 6 years ago

A lot of customers are using .NET Core 1.0 (.NET Standard 1.6) because that is what AWS supports for most features (like Lambda serverless) where cold-start time is critical. No serverless provider supports .NET Core 2.0 yet. In Azure you can manually hack the configs to get .NET Core 2.0 working, but I don't think it would be operating in an optimized way when it's not fully supported in the UI yet. Nobody can produce .NET Core 2.0 numbers because AWS and Azure do not fully support it. What I do know from working on a few non-serverless .NET Core 2.0 projects recently is that startup time is better than 1.0, but not significantly better - cold-start time is still an issue.

James2397 commented 6 years ago

In regards to size: in Node we can have scripts that are a few KB, while doing the same job in .NET is 10-20 MB because of the tree of dependencies pulled in from the Microsoft .NET Core framework and from external packages. For example, we may use only one tiny enum from a dependency DLL, and it brings in the whole 5 MB DLL instead of only the code for that one enum, which is < 1 KB. Our function sizes in .NET Core are on average hundreds of times the size of the equivalent in Node, because the 'publish' command brings in whole DLLs and things we don't need, will never use, and never reference from code. For traditional systems where space was not a concern this is fine, but for serverless, where cold start matters most and every large chunk of unrequired code slows down the loading of the function, it would be good to get DLLs cut down and optimized to be smaller and faster, or removed altogether if the code never uses classes/methods/properties from that DLL (I am particularly talking about dependencies of dependencies).

Petermarcu commented 6 years ago

On the size thing, we have our linker on the way which will allow the code in your "app" to be trimmed down to only what you need. https://github.com/dotnet/core/blob/master/samples/linker-instructions-advanced.md

We're going to take a deeper look at the serverless startup problem. We know how fast runtime startup is locally. We'll need to determine what other aspects of the serverless scenario are contributing to the cold-start problem.

We definitely want this to be great. Any additional insight is appreciated.

Petermarcu commented 6 years ago

We are having someone reproduce the results so we can dig into more of the details.

kannank commented 6 years ago

Is there any update on this issue? Have there been any noticeable improvements in 2.0?

dotnetchris commented 6 years ago

FWIW, AWS now supports .NET Core 2.0 in Lambda; support was added in early 2018.

jansabbe commented 6 years ago

I ran the benchmarks from theburningmonk again using .NET Core 2.0 (and added Go as well). See my results here: https://plot.ly/%7Ejansabbe/6/csharp-java-python-nodejs6-golang/ .NET Core 2.0 is faster than Java, but still far slower than Node.js, Python, or Go.

Interestingly, file size doesn't matter that much. The .NET Core 2.0 deployment package was around 200 KB, while the Go deployment package was close to 3 MB. Go was still an order of magnitude faster at simply responding "hello". Not sure if this is a .NET issue; it seems more like something is weird at AWS.
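For reference, the function being measured in these benchmarks is essentially a one-liner, so the measured cold-start cost is almost entirely runtime and host initialization rather than application code. A minimal sketch of such a C# Lambda handler (package and type names are the commonly used Amazon.Lambda.* ones, not the exact benchmark code):

```csharp
using Amazon.Lambda.Core;

// Tell Lambda how to (de)serialize the handler's input and output.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]

namespace ColdStartBenchmark
{
    public class Function
    {
        // Trivial handler: any cold-start latency measured around this call is
        // runtime/host startup, not application logic.
        public string Handler(string input, ILambdaContext context)
        {
            return "hello";
        }
    }
}
```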

Petermarcu commented 6 years ago

Just saw this article from yesterday showing .NET Core 2.0 as the top dog https://read.acloud.guru/comparing-aws-lambda-performance-of-node-js-python-java-c-and-go-29c1163c2581 .

asabla commented 6 years ago

Sadly @Petermarcu they're excluding cold startup time

Similar to the original performance tests, we’ll ignore the initial cold start time — and focus only on the duration metric to compare runtime performance between the different languages

James2397 commented 6 years ago

@Petermarcu the article states a few sentences in: "we'll ignore the initial cold start time - and focus only on the duration metric to compare runtime". There was never any question about the run-time performance of .NET, which we know is good; the concern was the lengthy cold-start duration, which affects a lot of web platforms that want to scale up serverless quickly. In serverless, one request might take 90 ms, and another, because it goes through a cold start, takes multiples of that (plus the time the host takes to start the new VM). The other blog posts I listed demonstrate the slower cold-start time of .NET Core.

James2397 commented 6 years ago

@asabla you beat me to it ;)

Petermarcu commented 6 years ago

Understood. Thanks for pointing that out. Just to clarify, the "other article" is this one: http://theburningmonk.com/2017/06/aws-lambda-compare-coldstart-time-with-different-languages-memory-and-code-sizes/ right?

In general, I would expect .NET and Java to be in the same ballpark. We did make improvements to cold start in .NET Core 2.0, have more in 2.1, and have more ideas beyond that. I'd be interested in seeing how things look today with .NET Core 2.0 vs Java.

There are strategies the infrastructure, whether it's Azure or AWS, can employ to keep the process warm. You can see this in the efforts and variations the author had to go through to ensure they were actually measuring cold-start time.
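One common shape of that keep-warm strategy, outside whatever the platform does internally, is a scheduled ping that invokes the function every few minutes so a warm instance stays available. A rough sketch, assuming a CloudWatch Events schedule wired to the function (handler and names are illustrative):

```csharp
using System.IO;
using Amazon.Lambda.Core;

namespace KeepWarmSketch
{
    public class Function
    {
        // Invoked by a scheduled rule (e.g. every 5 minutes) purely to keep an
        // instance of the function warm. Taking a Stream avoids needing a JSON
        // serializer for the scheduled-event payload; the body is ignored.
        public void Handler(Stream scheduledEvent, ILambdaContext context)
        {
            context.Logger.LogLine("Keep-warm ping");
        }
    }
}
```

This does not eliminate cold starts (scale-out still creates new instances); it only reduces how often a user-facing request hits one.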

leecow commented 6 years ago

Lots of performance work went into 2.1 and we're not resting on our laurels. Closing this conversation as it has gone quiet. Definitely open another if needed.

James2397 commented 6 years ago

Could we get a comparison between the different .NET Core versions to see if there is an improvement, instead of just closing the issue? This is still a concern to a lot of people.

neilgallagher commented 6 years ago

Second that. Would love to see a comparison of startup times, especially between 2.0 and 2.1. Also, are there any articles on what the cold-start improvements in 2.1 are? Thanks.

jansabbe commented 6 years ago

I ran the cold-start benchmark from theburningmonk again for 2.0 and 2.1.

There seems to be an improvement in the 512 MB case, but other than that 2.0 and 2.1 are pretty similar for cold starts. Both versions are faster than Java. Still a long road ahead to compete with Node.js, Go, or Python 😉 No idea if it is possible to deploy crossgenned, ReadyToRun images to AWS. Might help for cold starts.
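For context on the ReadyToRun question: crossgen-compiled (ReadyToRun) images are produced at publish time, and later SDKs (.NET Core 3.0+) expose this as a publish flag; whether Lambda's packaging accepts such an image is a separate question. A minimal sketch of that publish command:

```sh
# Publish with ahead-of-time (ReadyToRun) compilation for the target runtime
# (.NET Core 3.0+ SDKs). Pre-compiled native code cuts down the JIT work that
# would otherwise happen on the first (cold) invocation.
dotnet publish -c Release -r linux-x64 -p:PublishReadyToRun=true
```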

Gorthog commented 6 years ago

Please reopen this issue. Cold-start times just cost us a customer who went with Node instead, and honestly I completely understand his choice. The difference in cold start is 100x with .NET Core 2.1 vs Node/Python/Go.

nvcken commented 5 years ago

@sinapis have you tried this? https://medium.com/@zaccharles/making-net-aws-lambda-functions-start-10x-faster-using-lambdanative-8e53d6f12c9c